comparison man/internals/internals.texi @ 44:8d2a9b52c682 r19-15prefinal
Import from CVS: tag r19-15prefinal
author | cvs |
---|---|
date | Mon, 13 Aug 2007 08:55:10 +0200 |
parents | d620409f5eb8 |
children | ee648375d8d6 |
43:23cafc5d2038 | 44:8d2a9b52c682 |
---|---|
5 @c %**end of header | 5 @c %**end of header |
6 | 6 |
7 @ifinfo | 7 @ifinfo |
8 | 8 |
9 Copyright @copyright{} 1992 - 1996 Ben Wing. | 9 Copyright @copyright{} 1992 - 1996 Ben Wing. |
10 Copyright @copyright{} 1996 Sun Microsystems. | 10 Copyright @copyright{} 1996, 1997 Sun Microsystems. |
11 Copyright @copyright{} 1994, 1995 Free Software Foundation. | 11 Copyright @copyright{} 1994, 1995 Free Software Foundation. |
12 Copyright @copyright{} 1994, 1995 Board of Trustees, University of Illinois. | 12 Copyright @copyright{} 1994, 1995 Board of Trustees, University of Illinois. |
13 | 13 |
14 | 14 |
15 Permission is granted to make and distribute verbatim copies of this | 15 Permission is granted to make and distribute verbatim copies of this |
57 @setchapternewpage odd | 57 @setchapternewpage odd |
58 @finalout | 58 @finalout |
59 | 59 |
60 @titlepage | 60 @titlepage |
61 @title XEmacs Internals Manual | 61 @title XEmacs Internals Manual |
62 @subtitle Version 1.0, March 1996 | 62 @subtitle Version 1.1, March 1997 |
63 | 63 |
64 @author Ben Wing | 64 @author Ben Wing |
65 @author Martin Buchholz | |
65 @page | 66 @page |
66 @vskip 0pt plus 1fill | 67 @vskip 0pt plus 1fill |
67 | 68 |
68 @noindent | 69 @noindent |
69 Copyright @copyright{} 1992 - 1996 Ben Wing. @* | 70 Copyright @copyright{} 1992 - 1996 Ben Wing. @* |
70 Copyright @copyright{} 1996 Sun Microsystems, Inc. @* | 71 Copyright @copyright{} 1996 Sun Microsystems, Inc. @* |
71 Copyright @copyright{} 1994 Free Software Foundation. @* | 72 Copyright @copyright{} 1994 Free Software Foundation. @* |
72 Copyright @copyright{} 1994, 1995 Board of Trustees, University of Illinois. | 73 Copyright @copyright{} 1994, 1995 Board of Trustees, University of Illinois. |
73 | 74 |
74 @sp 2 | 75 @sp 2 |
75 Version 1.0 @* | 76 Version 1.1 @* |
76 March, 1996.@* | 77 March, 1997.@* |
77 | 78 |
78 Permission is granted to make and distribute verbatim copies of this | 79 Permission is granted to make and distribute verbatim copies of this |
79 manual provided the copyright notice and this permission notice are | 80 manual provided the copyright notice and this permission notice are |
80 preserved on all copies. | 81 preserved on all copies. |
81 | 82 |
868 | 869 |
869 XEmacs also contains a great deal of Lisp code. This implements the | 870 XEmacs also contains a great deal of Lisp code. This implements the |
870 operations that make XEmacs useful as an editor as well as just a | 871 operations that make XEmacs useful as an editor as well as just a |
871 Lisp environment, and also contains many add-on packages that allow | 872 Lisp environment, and also contains many add-on packages that allow |
872 XEmacs to browse directories, act as a mail and Usenet news reader, | 873 XEmacs to browse directories, act as a mail and Usenet news reader, |
873 compile Lisp code, etc. There is actually a lot more Lisp code than | 874 compile Lisp code, etc. There is actually more Lisp code than |
874 C code associated with XEmacs, but much of the Lisp code is | 875 C code associated with XEmacs, but much of the Lisp code is |
875 peripheral to the actual operation of the editor. The Lisp code | 876 peripheral to the actual operation of the editor. The Lisp code |
876 all lies in subdirectories underneath the @file{lisp/} directory. | 877 all lies in subdirectories underneath the @file{lisp/} directory. |
877 | 878 |
878 The @file{lwlib/} directory contains C code that implements a | 879 The @file{lwlib/} directory contains C code that implements a |
887 | 888 |
888 The @file{lib-src/} directory contains C code for various auxiliary | 889 The @file{lib-src/} directory contains C code for various auxiliary |
889 programs that are used in connection with XEmacs. Some of them are used | 890 programs that are used in connection with XEmacs. Some of them are used |
890 during the build process; others are used to perform certain functions | 891 during the build process; others are used to perform certain functions |
891 that cannot conveniently be placed in the XEmacs executable (e.g. the | 892 that cannot conveniently be placed in the XEmacs executable (e.g. the |
892 @file{movemail} program for fetching mail out of /var/spool/mail, which | 893 @file{movemail} program for fetching mail out of @file{/var/spool/mail}, |
893 must be setgid to @file{mail} on many systems; and the 'gnuclient' | 894 which must be setgid to @file{mail} on many systems; and the |
894 program, which allows an external script to communicate with a running | 895 @file{gnuclient} program, which allows an external script to communicate |
895 XEmacs process). | 896 with a running XEmacs process). |
896 | 897 |
897 The @file{man/} directory contains the sources for the XEmacs | 898 The @file{man/} directory contains the sources for the XEmacs |
898 documentation. It is mostly in a form called Texinfo, which can be | 899 documentation. It is mostly in a form called Texinfo, which can be |
899 converted into either a printed document (by passing it through TeX) or | 900 converted into either a printed document (by passing it through @TeX{}) |
900 into on-line documentation called @dfn{info files}. | 901 or into on-line documentation called @dfn{info files}. |
901 | 902 |
902 The @file{info/} directory contains the results of formatting the | 903 The @file{info/} directory contains the results of formatting the |
903 XEmacs documentation as @dfn{info files}, for on-line use. These files | 904 XEmacs documentation as @dfn{info files}, for on-line use. These files |
904 are used when you enter the Info system using @kbd{C-h i} or through the | 905 are used when you enter the Info system using @kbd{C-h i} or through the |
905 Help menu. | 906 Help menu. |
938 windows on the screen, and if you simply run it, it will exit | 939 windows on the screen, and if you simply run it, it will exit |
939 immediately. The Makefile runs @file{temacs} with certain options that | 940 immediately. The Makefile runs @file{temacs} with certain options that |
940 cause it to initialize itself, read in a number of basic Lisp files, and | 941 cause it to initialize itself, read in a number of basic Lisp files, and |
941 then dump itself out into a new executable called @file{xemacs}. This | 942 then dump itself out into a new executable called @file{xemacs}. This |
942 new executable has been pre-initialized and contains pre-digested Lisp | 943 new executable has been pre-initialized and contains pre-digested Lisp |
943 code that is necessary for the editor to function (this includes some | 944 code that is necessary for the editor to function (this includes most |
944 extremely basic Lisp functions, e.g. @code{not}, that can be defined in | 945 basic Lisp functions, e.g. @code{not}, that can be defined in terms of |
945 terms of other Lisp primitives; some initialization code that is called | 946 other Lisp primitives; some initialization code that is called when |
946 when certain objects, such as frames, are created; and all of the | 947 certain objects, such as frames, are created; and all of the standard |
947 standard keybindings and code for the actions they result in). This | 948 keybindings and code for the actions they result in). This executable, |
948 executable, @file{xemacs}, is the executable that you run to use the | 949 @file{xemacs}, is the executable that you run to use the XEmacs editor. |
949 XEmacs editor. | |
950 | 950 |
951 @node XEmacs From the Inside, The XEmacs Object System (Abstractly Speaking), XEmacs From the Perspective of Building, Top | 951 @node XEmacs From the Inside, The XEmacs Object System (Abstractly Speaking), XEmacs From the Perspective of Building, Top |
952 @chapter XEmacs From the Inside | 952 @chapter XEmacs From the Inside |
953 | 953 |
954 Internally, XEmacs is quite complex, and can be very confusing. To | 954 Internally, XEmacs is quite complex, and can be very confusing. To |
955 simplify things, it can be useful to think of XEmacs as containing an | 955 simplify things, it can be useful to think of XEmacs as containing an |
956 event loop that ``drives'' everything, and a number of other subsystems, | 956 event loop that ``drives'' everything, and a number of other subsystems, |
957 such as a Lisp engine and a redisplay mechanism. Each of these others | 957 such as a Lisp engine and a redisplay mechanism. Each of these other |
958 subsystems exists simultaneously in XEmacs, and each has a certain | 958 subsystems exists simultaneously in XEmacs, and each has a certain |
959 state. The flow of control continually passes in and out of these | 959 state. The flow of control continually passes in and out of these |
960 different subsystems in the course of normal operation of the editor. | 960 different subsystems in the course of normal operation of the editor. |
961 | 961 |
962 It is important to keep in mind that, most of the time, the editor is | 962 It is important to keep in mind that, most of the time, the editor is |
984 | 984 |
985 @item | 985 @item |
986 The buffer mechanism is responsible for keeping track of what buffers | 986 The buffer mechanism is responsible for keeping track of what buffers |
987 exist and what text is in them. It is periodically given commands | 987 exist and what text is in them. It is periodically given commands |
988 (usually from the user) to insert or delete text, create a buffer, etc. | 988 (usually from the user) to insert or delete text, create a buffer, etc. |
989 When it receives a textual-change command, it tells the redisplay | 989 When it receives a text-change command, it notifies the redisplay |
990 mechanism about this. | 990 mechanism. |
991 | 991 |
992 @item | 992 @item |
993 The redisplay mechanism is responsible for making sure that windows and | 993 The redisplay mechanism is responsible for making sure that windows and |
994 frames are displayed correctly. It is periodically told (by the event | 994 frames are displayed correctly. It is periodically told (by the event |
995 loop) to actually ``do its job'', i.e. snoop around and see what the | 995 loop) to actually ``do its job'', i.e. snoop around and see what the |
1181 these types of objects.) | 1181 these types of objects.) |
1182 | 1182 |
1183 XEmacs Lisp also contains numerous specialized objects used to | 1183 XEmacs Lisp also contains numerous specialized objects used to |
1184 implement the editor: | 1184 implement the editor: |
1185 | 1185 |
1186 @table @asis | 1186 @table @code |
1187 @item buffer | 1187 @item buffer |
1188 Stores text like a string, but is optimized for insertion and deletion | 1188 Stores text like a string, but is optimized for insertion and deletion |
1189 and has certain other properties that can be set. | 1189 and has certain other properties that can be set. |
1190 @item frame | 1190 @item frame |
1191 An object with various properties whose displayable representation is a | 1191 An object with various properties whose displayable representation is a |
1230 An object that describes a connection to an externally-running process. | 1230 An object that describes a connection to an externally-running process. |
1231 @end table | 1231 @end table |
1232 | 1232 |
1233 There are some other, less-commonly-encountered general objects: | 1233 There are some other, less-commonly-encountered general objects: |
1234 | 1234 |
1235 @table @asis | 1235 @table @code |
1236 @item hashtable | 1236 @item hashtable |
1237 An object that maps from an arbitrary Lisp object to another arbitrary | 1237 An object that maps from an arbitrary Lisp object to another arbitrary |
1238 Lisp object, using hashing for fast lookup. | 1238 Lisp object, using hashing for fast lookup. |
1239 @item obarray | 1239 @item obarray |
1240 A limited form of hashtable that maps from strings to symbols; obarrays | 1240 A limited form of hashtable that maps from strings to symbols; obarrays |
1254 An object that maps from ranges of integers to arbitrary Lisp objects. | 1254 An object that maps from ranges of integers to arbitrary Lisp objects. |
1255 @end table | 1255 @end table |
1256 | 1256 |
1257 And some strange special-purpose objects: | 1257 And some strange special-purpose objects: |
1258 | 1258 |
1259 @table @asis | 1259 @table @code |
1260 @item charset | 1260 @item charset |
1261 @itemx coding-system | 1261 @itemx coding-system |
1262 Objects used when MULE, or multi-lingual/Asian-language, support is | 1262 Objects used when MULE, or multi-lingual/Asian-language, support is |
1263 enabled. | 1263 enabled. |
1264 @item color-instance | 1264 @item color-instance |
1368 @example | 1368 @example |
1369 ?^[$(B#&^[(B | 1369 ?^[$(B#&^[(B |
1370 @end example | 1370 @end example |
1371 | 1371 |
1372 (where @samp{^[} actually is an @samp{ESC} character) converts to a | 1372 (where @samp{^[} actually is an @samp{ESC} character) converts to a |
1373 particular Kanji character. (To decode this gook: @samp{ESC} begins an | 1373 particular Kanji character when using an ISO2022-based coding system for |
1374 escape sequence; @samp{ESC $ (} is a class of escape sequences meaning | 1374 input. (To decode this gook: @samp{ESC} begins an escape sequence; |
1375 ``switch to a 94x94 character set''; @samp{ESC $ ( B} means ``switch to | 1375 @samp{ESC $ (} is a class of escape sequences meaning ``switch to a |
1376 Japanese Kanji''; @samp{#} and @samp{&} collectively index into a | 1376 94x94 character set''; @samp{ESC $ ( B} means ``switch to Japanese |
1377 94-by-94 array of characters [subtract 33 from the ASCII value of each | 1377 Kanji''; @samp{#} and @samp{&} collectively index into a 94-by-94 array |
1378 character to get the corresponding index]; @samp{ESC (} is a class of | 1378 of characters [subtract 33 from the ASCII value of each character to get |
1379 escape sequences meaning ``switch to a 94 character set''; @samp{ESC (B} | 1379 the corresponding index]; @samp{ESC (} is a class of escape sequences |
1380 means ``switch to US ASCII''. It is a coincidence that the letter | 1380 meaning ``switch to a 94 character set''; @samp{ESC (B} means ``switch |
1381 @samp{B} is used to denote both Japanese Kanji and US ASCII. If the | 1381 to US ASCII''. It is a coincidence that the letter @samp{B} is used to |
1382 first @samp{B} were replaced with an @samp{A}, you'd be requesting a | 1382 denote both Japanese Kanji and US ASCII. If the first @samp{B} were |
1383 Chinese Hanzi character from the GB2312 character set.) | 1383 replaced with an @samp{A}, you'd be requesting a Chinese Hanzi character |
1384 from the GB2312 character set.) | |
1384 | 1385 |
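The "subtract 33" rule in the hunk above can be checked with a few lines of C. This is only a sketch of the arithmetic: `iso2022_index_94` is a hypothetical helper named for this illustration, not a function from the XEmacs sources, and it ignores the escape sequences that select the character set.

```c
#include <assert.h>

/* Hypothetical helper (not part of XEmacs): convert one byte of a
   two-byte 94x94 character code into its zero-based row or column
   index, using the "subtract 33" rule described in the text. */
static int
iso2022_index_94 (unsigned char byte)
{
  /* The 94 graphic positions are the ASCII codes 33 ('!') to 126 ('~'). */
  assert (byte >= 33 && byte <= 126);
  return byte - 33;
}
```

With this helper, `#` (ASCII 35) gives index 2 and `&` (ASCII 38) gives index 5, so the pair `#&` selects row 2, column 5 of the 94-by-94 array, counting from zero.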
1385 @example | 1386 @example |
1386 "foobar" | 1387 "foobar" |
1387 @end example | 1388 @end example |
1388 | 1389 |
1511 opposite semantics? ``Hysterical reasons'', of course.) | 1512 opposite semantics? ``Hysterical reasons'', of course.) |
1512 | 1513 |
1513 @cindex record type | 1514 @cindex record type |
1514 Note that there are only eight types that the tag can represent, | 1515 Note that there are only eight types that the tag can represent, |
1515 but many more actual types than this. This is handled by having | 1516 but many more actual types than this. This is handled by having |
1516 one of the tag types specify a meta-object called a @dfn{record}; | 1517 one of the tag types specify a meta-type called a @dfn{record}; |
1517 for all such objects, the first four bytes of the pointed-to | 1518 for all such objects, the first four bytes of the pointed-to |
1518 structure indicate what the actual type is. | 1519 structure indicate what the actual type is. |
1519 | 1520 |
1520 Note also that having 28 bits for pointers and integers restricts a | 1521 Note also that having 28 bits for pointers and integers restricts a |
1521 lot of things to 256 megabytes of memory. (Basically, enough pointers | 1522 lot of things to 256 megabytes of memory. (Basically, enough pointers |
1535 (e.g. beginning at 0x80000000). Those machines cope by defining | 1536 (e.g. beginning at 0x80000000). Those machines cope by defining |
1536 @code{DATA_SEG_BITS} in the corresponding @file{m/} or @file{s/} file to | 1537 @code{DATA_SEG_BITS} in the corresponding @file{m/} or @file{s/} file to |
1537 the proper mask. Then, pointers retrieved from Lisp objects are | 1538 the proper mask. Then, pointers retrieved from Lisp objects are |
1538 automatically OR'ed with this value prior to being used. | 1539 automatically OR'ed with this value prior to being used. |
1539 | 1540 |
1540 A corollary of the previous paragraph is that @strong{stack-allocated | 1541 A corollary of the previous paragraph is that @strong{(pointers to) |
1541 structures cannot be put into Lisp objects}. The stack is generally | 1542 stack-allocated structures cannot be put into Lisp objects}. The stack |
1542 located near the top of memory; if you put such a pointer into a Lisp | 1543 is generally located near the top of memory; if you put such a pointer |
1543 object, it will get its top bits chopped off, and you will lose. | 1544 into a Lisp object, it will get its top bits chopped off, and you will |
1545 lose. | |
1544 | 1546 |
1545 Various macros are used to construct Lisp objects and extract the | 1547 Various macros are used to construct Lisp objects and extract the |
1546 components. Macros of the form @code{XINT()}, @code{XCHAR()}, | 1548 components. Macros of the form @code{XINT()}, @code{XCHAR()}, |
1547 @code{XSTRING()}, @code{XSYMBOL()}, etc. mask out the pointer/integer | 1549 @code{XSTRING()}, @code{XSYMBOL()}, etc. mask out the pointer/integer |
1548 field and cast it to the appropriate type. All of the macros that | 1550 field and cast it to the appropriate type. All of the macros that |
1563 object is really of the correct type. This is great for catching places | 1565 object is really of the correct type. This is great for catching places |
1564 where an incorrect type is being dereferenced -- this typically results | 1566 where an incorrect type is being dereferenced -- this typically results |
1565 in a pointer being dereferenced as the wrong type of structure, with | 1567 in a pointer being dereferenced as the wrong type of structure, with |
1566 unpredictable (and sometimes not easily traceable) results. | 1568 unpredictable (and sometimes not easily traceable) results. |
1567 | 1569 |
1568 There are similar @code{XSET()} macros that construct a Lisp object. | 1570 There are similar @code{XSET@var{TYPE}()} macros that construct a Lisp object. |
1569 These macros are of the form @code{XSET (@var{lvalue}, @var{result})}, | 1571 These macros are of the form @code{XSET@var{TYPE} (@var{lvalue}, @var{result})}, |
1570 i.e. they have to be a statement rather than just used in an expression. | 1572 i.e. they have to be a statement rather than just used in an expression. |
1571 The reason for this is that standard C doesn't let you ``construct'' a | 1573 The reason for this is that standard C doesn't let you ``construct'' a |
1572 structure (but GCC does). Granted, this sometimes isn't too convenient; | 1574 structure (but GCC does). Granted, this sometimes isn't too convenient; |
1573 for the case of integers, at least, you can use the function | 1575 for the case of integers, at least, you can use the function |
1574 @code{make_number()}, which constructs and @emph{returns} an integer | 1576 @code{make_number()}, which constructs and @emph{returns} an integer |
1575 Lisp object. Note that the @code{XSET()} macros are also affected by | 1577 Lisp object. Note that the @code{XSET@var{TYPE}()} macros are also |
1576 @code{ERROR_CHECK_TYPECHECK} and make sure that the structure is of the right | 1578 affected by @code{ERROR_CHECK_TYPECHECK} and make sure that the |
1577 type in the case of record types, where the type is contained in | 1579 structure is of the right type in the case of record types, where the |
1578 the structure. | 1580 type is contained in the structure. |
1579 | 1581 |
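The tagging scheme discussed in the hunks above can be modelled in a small, self-contained C sketch. Everything here is an assumption made for illustration: the `demo_`/`DEMO_` names, the exact bit positions (a three-bit tag directly above a 28-bit value field in a 32-bit word), the tag values, and the mask `0x80000000` are not taken from `lisp.h`. Addresses are modelled as 32-bit integers so that the sketch behaves the same on any host.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t demo_object;            /* stand-in for Lisp_Object */

#define DEMO_VALBITS 28                  /* width of the pointer/integer field */
#define DEMO_VALMASK ((UINT32_C (1) << DEMO_VALBITS) - 1)
#define DEMO_TAGMASK UINT32_C (7)        /* three bits: eight tag types */
#define DEMO_DATA_SEG_BITS UINT32_C (0x80000000)  /* assumed mask value */

enum demo_type { DEMO_INT = 0, DEMO_RECORD = 1 };  /* assumed tag values */

/* Statement-style constructor, in the spirit of XSETTYPE (lvalue, result):
   it assigns to its first argument instead of returning a value. */
#define DEMO_XSET(lvalue, type, val)                                  \
  do {                                                                \
    (lvalue) = ((demo_object) (type) << DEMO_VALBITS)                 \
               | ((demo_object) (val) & DEMO_VALMASK);                \
  } while (0)

#define DEMO_XTYPE(obj) \
  ((enum demo_type) (((obj) >> DEMO_VALBITS) & DEMO_TAGMASK))

/* Function-style constructor, analogous in role to make_number(). */
static demo_object
demo_make_number (int32_t n)
{
  demo_object obj;
  DEMO_XSET (obj, DEMO_INT, n);
  return obj;
}

/* Extract the integer; the assert plays the role of a type check. */
static int32_t
demo_xint (demo_object obj)
{
  uint32_t field = obj & DEMO_VALMASK;

  assert (DEMO_XTYPE (obj) == DEMO_INT);
  /* Sign-extend the 28-bit field. */
  if (field & (UINT32_C (1) << (DEMO_VALBITS - 1)))
    return (int32_t) field - (INT32_C (1) << DEMO_VALBITS);
  return (int32_t) field;
}

/* Recover a (modelled) address: mask out the value field, then OR in
   the data-segment bits, as the text describes. */
static uint32_t
demo_xpntr (demo_object obj)
{
  return (obj & DEMO_VALMASK) | DEMO_DATA_SEG_BITS;
}
```

In this model an address inside the data segment, such as `0x80001234`, survives the round trip through `DEMO_XSET` and `demo_xpntr`. A stack-like address near the top of memory, such as `0xEFFFF000`, comes back as `0x8FFFF000`: its top bits are chopped off when it is stored, which is the failure the text warns about.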
1580 @node Rules When Writing New C Code, A Summary of the Various XEmacs Modules, How Lisp Objects Are Represented in C, Top | 1582 @node Rules When Writing New C Code, A Summary of the Various XEmacs Modules, How Lisp Objects Are Represented in C, Top |
1581 @chapter Rules When Writing New C Code | 1583 @chapter Rules When Writing New C Code |
1582 | 1584 |
1583 The XEmacs C Code is extremely complex and intricate, and there are | 1585 The XEmacs C Code is extremely complex and intricate, and there are |
1601 @code{vars_of_*()} function. The former declares any Lisp primitives | 1603 @code{vars_of_*()} function. The former declares any Lisp primitives |
1602 you have defined and defines any symbols you will be using. The latter | 1604 you have defined and defines any symbols you will be using. The latter |
1603 declares any global Lisp variables you have added and initializes global | 1605 declares any global Lisp variables you have added and initializes global |
1604 C variables in the module. For each such function, declare it in | 1606 C variables in the module. For each such function, declare it in |
1605 @file{symsinit.h} and make sure it's called in the appropriate place in | 1607 @file{symsinit.h} and make sure it's called in the appropriate place in |
1606 @code{main()}. @strong{Important}: There are stringent requirements on | 1608 @file{emacs.c}. @strong{Important}: There are stringent requirements on |
1607 exactly what can go into these functions. See the comment in | 1609 exactly what can go into these functions. See the comment in |
1608 @code{main()}. The reason for this is to avoid obscure unwanted | 1610 @file{emacs.c}. The reason for this is to avoid obscure unwanted |
1609 interactions during initialization. If you don't follow these rules, | 1611 interactions during initialization. If you don't follow these rules, |
1610 you'll be sorry! If you want to do anything that isn't allowed, create | 1612 you'll be sorry! If you want to do anything that isn't allowed, create |
1611 a @code{complex_vars_of_*()} function for it. Doing this is tricky, | 1613 a @code{complex_vars_of_*()} function for it. Doing this is tricky, |
1612 though: You have to make sure your function is called at the right time | 1614 though: You have to make sure your function is called at the right time |
1613 so that all the initialization dependencies work out. | 1615 so that all the initialization dependencies work out. |
1614 | 1616 |
1615 Every module includes @file{<config.h>} (angle brackets so that | 1617 Every module includes @file{<config.h>} (angle brackets so that |
1616 @samp{--srcdir} works correctly) and @file{lisp.h}. @file{config.h} | 1618 @samp{--srcdir} works correctly; @file{config.h} may or may not be in |
1619 the same directory as the C sources) and @file{lisp.h}. @file{config.h} | |
1617 should always be included before any other header files (including | 1620 should always be included before any other header files (including |
1618 system header files) to ensure that certain tricks played by various | 1621 system header files) to ensure that certain tricks played by various |
1619 @file{s/} and @file{m/} files work out correctly. | 1622 @file{s/} and @file{m/} files work out correctly. |
1620 | 1623 |
1621 @strong{All global and static variables that are to be modifiable must | 1624 @strong{All global and static variables that are to be modifiable must |
1670 appearance.) | 1673 appearance.) |
1671 | 1674 |
1672 @cindex garbage collection protection | 1675 @cindex garbage collection protection |
1673 @smallexample | 1676 @smallexample |
1674 @group | 1677 @group |
1675 DEFUN ("or", For, Sor, 0, UNEVALLED, 0 /* | 1678 DEFUN ("or", For, 0, UNEVALLED, 0, /* |
1676 Eval args until one of them yields non-nil, then return that value. | 1679 Eval args until one of them yields non-nil, then return that value. |
1677 The remaining args are not evalled at all. | 1680 The remaining args are not evalled at all. |
1678 @end group | |
1679 @group | |
1680 If all args return nil, return nil. | 1681 If all args return nil, return nil. |
1681 */ ) | 1682 */ |
1682 (args) | 1683 (args)) |
1683 Lisp_Object args; | |
1684 @{ | 1684 @{ |
1685 /* This function can GC */ | 1685 /* This function can GC */ |
1686 REGISTER Lisp_Object val; | 1686 REGISTER Lisp_Object val; |
1687 Lisp_Object args_left; | 1687 Lisp_Object args_left; |
1688 struct gcpro gcpro1; | 1688 struct gcpro gcpro1; |
1689 @end group | 1689 |
1690 | |
1691 @group | |
1692 if (NILP (args)) | 1690 if (NILP (args)) |
1693 return Qnil; | 1691 return Qnil; |
1694 | 1692 |
1695 args_left = args; | 1693 args_left = args; |
1696 GCPRO1 (args_left); | 1694 GCPRO1 (args_left); |
1697 @end group | 1695 |
1698 | |
1699 @group | |
1700 do | 1696 do |
1701 @{ | 1697 @{ |
1702 val = Feval (Fcar (args_left)); | 1698 val = Feval (Fcar (args_left)); |
1703 if (!NILP (val)) | 1699 if (!NILP (val)) |
1704 break; | 1700 break; |
1705 args_left = Fcdr (args_left); | 1701 args_left = Fcdr (args_left); |
1706 @} | 1702 @} |
1707 while (!NILP (args_left)); | 1703 while (!NILP (args_left)); |
1708 @end group | 1704 |
1709 | |
1710 @group | |
1711 UNGCPRO; | 1705 UNGCPRO; |
1712 return val; | 1706 return val; |
1713 @} | 1707 @} |
1714 @end group | 1708 @end group |
1715 @end smallexample | 1709 @end smallexample |
1716 | 1710 |
1717 Let's start with a precise explanation of the arguments to the | 1711 Let's start with a precise explanation of the arguments to the |
1718 @code{DEFUN} macro. Here is a template for them: | 1712 @code{DEFUN} macro. Here is a template for them: |
1719 | 1713 |
1720 @example | 1714 @example |
1721 DEFUN (@var{lname}, @var{fname}, @var{sname}, @var{min}, @var{max}, @var{interactive} /* @var{doc} */ ) | 1715 DEFUN (@var{lname}, @var{fname}, @var{min}, @var{max}, @var{interactive}, /* |
1716 @var{docstring} | |
1717 */ | |
1718 (@var{arglist}) ) | |
1722 @end example | 1719 @end example |
1723 | 1720 |
1724 @table @var | 1721 @table @var |
1725 @item lname | 1722 @item lname |
1726 This is the name of the Lisp symbol to define as the function name; in | 1723 This string is the name of the Lisp symbol to define as the function |
1727 the example above, it is @code{or}. | 1724 name; in the example above, it is @code{"or"}. |
1728 | 1725 |
1729 @item fname | 1726 @item fname |
1730 This is the C function name for this function. This is | 1727 This is the C function name for this function. This is the name that is |
1731 the name that is used in C code for calling the function. The name is, | 1728 used in C code for calling the function. The name is, by convention, |
1732 by convention, @samp{F} prepended to the Lisp name, with all dashes | 1729 @samp{F} prepended to the Lisp name, with all dashes (@samp{-}) in the |
1733 (@samp{-}) in the Lisp name changed to underscores. Thus, to call this | 1730 Lisp name changed to underscores. Thus, to call this function from C |
1734 function from C code, call @code{For}. Remember that the arguments must | 1731 code, call @code{For}. Remember that the arguments are of type |
1735 be of type @code{Lisp_Object}; various macros and functions for creating | 1732 @code{Lisp_Object}; various macros and functions for creating values of |
1736 values of type @code{Lisp_Object} are declared in the file | 1733 type @code{Lisp_Object} are declared in the file @file{lisp.h}. |
1737 @file{lisp.h}. | |
1738 | 1734 |
1739 Primitives whose names are special characters (e.g. @code{+} or | 1735 Primitives whose names are special characters (e.g. @code{+} or |
1740 @code{<}) are named by spelling out, in some fashion, the special | 1736 @code{<}) are named by spelling out, in some fashion, the special |
1741 character: e.g. @code{Fplus()} or @code{Flss()}. Primitives whose names | 1737 character: e.g. @code{Fplus()} or @code{Flss()}. Primitives whose names |
1742 begin with normal alphanumeric characters but also contain special | 1738 begin with normal alphanumeric characters but also contain special |
1743 characters are spelled out in some creative way, e.g. @code{let*} | 1739 characters are spelled out in some creative way, e.g. @code{let*} |
1744 becomes @code{FletX()}. | 1740 becomes @code{FletX()}. |
1745 | 1741 |
1746 @item sname | 1742 Each function also has an associated structure that holds the data for |
1747 This is a C variable name to use for a structure that holds the data for | |
1748 the subr object that represents the function in Lisp. This structure | 1743 the subr object that represents the function in Lisp. This structure |
1749 conveys the Lisp symbol name to the initialization routine that will | 1744 conveys the Lisp symbol name to the initialization routine that will |
1750 create the symbol and store the subr object as its definition. By | 1745 create the symbol and store the subr object as its definition. The C |
1751 convention, this name is always @var{fname} with @samp{F} replaced with | 1746 variable name of this structure is always @samp{S} prepended to the |
1752 @samp{S}. | 1747 @var{fname}. You hardly ever need to be aware of the existence of this |
1748 structure. | |
1753 | 1749 |
1754 @item min | 1750 @item min |
1755 This is the minimum number of arguments that the function requires. The | 1751 This is the minimum number of arguments that the function requires. The |
1756 function @code{or} allows a minimum of zero arguments. | 1752 function @code{or} allows a minimum of zero arguments. |
1757 | 1753 |
1760 there is a fixed maximum. Alternatively, it can be @code{UNEVALLED}, | 1756 there is a fixed maximum. Alternatively, it can be @code{UNEVALLED}, |
1761 indicating a special form that receives unevaluated arguments, or | 1757 indicating a special form that receives unevaluated arguments, or |
1762 @code{MANY}, indicating an unlimited number of evaluated arguments (the | 1758 @code{MANY}, indicating an unlimited number of evaluated arguments (the |
1763 equivalent of @code{&rest}). Both @code{UNEVALLED} and @code{MANY} are | 1759 equivalent of @code{&rest}). Both @code{UNEVALLED} and @code{MANY} are |
1764 macros. If @var{max} is a number, it may not be less than @var{min} and | 1760 macros. If @var{max} is a number, it may not be less than @var{min} and |
1765 it may not be greater than 12. (If you need to add a function with | 1761 it may not be greater than 8. (If you need to add a function with |
1766 more than 12 arguments, either use the @code{MANY} form or edit the | 1762 more than 8 arguments, either use the @code{MANY} form or edit the |
1767 definition of @code{DEFUN} in @file{lisp.h}. If you do the latter, | 1763 definition of @code{DEFUN} in @file{lisp.h}. If you do the latter, |
1768 make sure to also add another clause to the switch statement in | 1764 make sure to also add another clause to the switch statement in |
1769 @code{primitive_funcall().}) | 1765 @code{primitive_funcall().}) |
1770 | 1766 |
1771 @item interactive | 1767 @item interactive |
1773 the argument of @code{interactive} in a Lisp function. In the case of | 1769 the argument of @code{interactive} in a Lisp function. In the case of |
1774 @code{or}, it is 0 (a null pointer), indicating that @code{or} cannot be | 1770 @code{or}, it is 0 (a null pointer), indicating that @code{or} cannot be |
1775 called interactively. A value of @code{""} indicates a function that | 1771 called interactively. A value of @code{""} indicates a function that |
1776 should receive no arguments when called interactively. | 1772 should receive no arguments when called interactively. |
1777 | 1773 |
1778 @item doc | 1774 @item docstring |
1779 This is the documentation string. It is written just like a | 1775 This is the documentation string. It is written just like a |
1780 documentation string for a function defined in Lisp; in particular, | 1776 documentation string for a function defined in Lisp; in particular, the |
1781 the first line should be a single sentence. Note how the documentation | 1777 first line should be a single sentence. Note how the documentation |
1782 string is enclosed in a comment, none of the documentation is placed | 1778 string is enclosed in a comment, none of the documentation is placed on |
1783 on the same lines as the comment-start and comment-end characters, and | 1779 the same lines as the comment-start and comment-end characters, and the |
1784 the comment-start characters are on the same line as the interactive | 1780 comment-start characters are on the same line as the interactive |
1785 specification. @file{make-docfile}, which scans the C files for | 1781 specification. @file{make-docfile}, which scans the C files for |
1786 documentation strings, is very particular about what it looks for, | 1782 documentation strings, is very particular about what it looks for, and |
1787 and will not properly note the doc string if it's not in this exact | 1783 will not properly extract the doc string if it's not in this exact format. |
1788 format. | 1784 |
1789 @end table | 1785 You are free to put the various arguments to @code{DEFUN} on separate |
1790 | |
1791 You are free to put the various arguments to @code{DEFUN} on separate | |
1792 lines to avoid overly long lines. However, make sure to put the | 1786 lines to avoid overly long lines. However, make sure to put the |
1793 comment-start characters for the doc string on the same line as the | 1787 comment-start characters for the doc string on the same line as the |
1794 interactive specification, and put a newline directly after them | 1788 interactive specification, and put a newline directly after them (and |
1795 (and before the comment-end characters). | 1789 before the comment-end characters). |
1796 | 1790 |
1797 After the call to the @code{DEFUN} macro, you must write the argument | 1791 @item arglist |
1798 name list that every C function must have, followed by ordinary C | 1792 This is the comma-separated list of arguments to the C function. For a |
1799 declarations for the arguments. For a function with a fixed maximum | 1793 function with a fixed maximum number of arguments, provide a C argument |
1800 number of arguments, declare a C argument for each Lisp argument, and | 1794 for each Lisp argument. In this case, unlike regular C functions, the |
1801 give them all type @code{Lisp_Object}. When a Lisp function has no | 1795 types of the arguments are not declared; they are simply always of type |
1802 upper limit on the number of arguments, its implementation in C actually | 1796 @code{Lisp_Object}. |
1803 receives exactly two arguments: the first is the number of Lisp | 1797 |
1804 arguments, and the second is the address of a block containing their | 1798 The names of the C arguments will be used as the names of the arguments |
1805 values. They have types @code{int} and @w{@code{Lisp_Object *}}. | 1799 to the Lisp primitive as displayed in its documentation, modulo the same |
1806 | 1800 concerns described above for @code{F...} names (in particular, |
1807 The names of the C arguments will be used as the names of the arguments | |
1808 to the Lisp primitive as displayed in its documentation, modulo the | |
1809 same concerns described above for @code{F...} names (in particular, | |
1810 underscores in the C arguments become dashes in the Lisp arguments). | 1801 underscores in the C arguments become dashes in the Lisp arguments). |
1811 There is one additional kludge: A C argument called @code{defalt} | 1802 There is one additional kludge: A C argument called @code{defalt} |
1812 becomes the Lisp argument @code{default}. This deliberate misspelling | 1803 becomes the Lisp argument @code{default}. This deliberate misspelling |
1813 is done because @code{default} is a reserved word in the C language. | 1804 is done because @code{default} is a reserved word in the C language. |
1814 | 1805 |
1815 Note that you @emph{must} use old-style prototypes for the arguments | 1806 A Lisp function with @w{@var{max} = @code{UNEVALLED}} is a |
1816 to @code{DEFUN}, even though all other functions in the C code use | 1807 @w{@dfn{special form}}; its arguments are not evaluated. Instead it |
1817 new-style prototypes. | 1808 receives one argument of type @code{Lisp_Object}, a (Lisp) list of the |
1809 unevaluated arguments, conventionally named @code{(args)}. | |
1810 | |
1811 When a Lisp function has no upper limit on the number of arguments, | |
1812 specify @w{@var{max} = @code{MANY}}. In this case its implementation in | |
1813 C actually receives exactly two arguments: the number of Lisp arguments | |
1814 (an @code{int}) and the address of a block containing their values (a | |
1815 @w{@code{Lisp_Object *}}). Only in this case are the C types specified | |
1816 in the @var{arglist}: @w{@code{(int nargs, Lisp_Object *args)}}. | |
1817 | |
1818 @end table | |
1818 | 1819 |
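As a rough illustration of the @code{MANY} calling convention described
above, the following self-contained sketch models a function that
receives @code{(int nargs, Lisp_Object *args)}. It is not XEmacs code:
@code{Lisp_Object} is modeled here as a plain @code{long}, and
@code{toy_sum} is a hypothetical name invented for this example.

```c
/* Assumption: a stand-in for the real Lisp_Object type, which is
   more elaborate than a plain long. */
typedef long Lisp_Object;

/* A MANY-style function: the caller passes the number of arguments
   and the address of a block holding their (already evaluated)
   values.  toy_sum is a hypothetical example, not an XEmacs primitive. */
Lisp_Object
toy_sum (int nargs, Lisp_Object *args)
{
  Lisp_Object total = 0;
  int i;

  for (i = 0; i < nargs; i++)
    total += args[i];
  return total;
}
```

Because the count travels with the block of values, the same C function
can serve a call with zero arguments or with a hundred.
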
1819 Within the function @code{For} itself, note the use of the macros | 1820 Within the function @code{For} itself, note the use of the macros |
1820 @code{GCPRO1} and @code{UNGCPRO}. @code{GCPRO1} is used to ``protect'' | 1821 @code{GCPRO1} and @code{UNGCPRO}. @code{GCPRO1} is used to ``protect'' |
1821 a variable from garbage collection---to inform the garbage collector | 1822 a variable from garbage collection---to inform the garbage collector |
1822 that it must look in that variable and regard its contents as an | 1823 that it must look in that variable and regard its contents as an |
1849 of XEmacs coding. It is @strong{extremely} important that you get this | 1850 of XEmacs coding. It is @strong{extremely} important that you get this |
1850 right and use a great deal of discipline when writing this code. | 1851 right and use a great deal of discipline when writing this code. |
1851 @xref{GCPROing, ,@code{GCPRO}ing}, for full details on how to do this. | 1852 @xref{GCPROing, ,@code{GCPRO}ing}, for full details on how to do this. |
1852 | 1853 |
1853 What @code{DEFUN} actually does is declare a global structure of | 1854 What @code{DEFUN} actually does is declare a global structure of |
1854 type @code{Lisp_Subr} whose name begins with a capital @samp{S} and | 1855 type @code{Lisp_Subr} whose name begins with capital @samp{SF} and |
1855 which contains information about the primitive (e.g. a pointer to the | 1856 which contains information about the primitive (e.g. a pointer to the |
1856 function, its minimum and maximum allowed arguments, a string describing | 1857 function, its minimum and maximum allowed arguments, a string describing |
1857 its Lisp name); @code{DEFUN} then begins a normal C function | 1858 its Lisp name); @code{DEFUN} then begins a normal C function |
1858 declaration using the @code{F...} name. The Lisp subr object that is | 1859 declaration using the @code{F...} name. The Lisp subr object that is |
1859 the function definition of a primitive (i.e. the object in the function | 1860 the function definition of a primitive (i.e. the object in the function |
1860 slot of the symbol that names the primitive) actually points to this | 1861 slot of the symbol that names the primitive) actually points to this |
1861 @samp{S} structure; when @code{Feval} encounters a subr, it looks in the | 1862 @samp{SF} structure; when @code{Feval} encounters a subr, it looks in the |
1862 structure to find out how to call the C function. | 1863 structure to find out how to call the C function. |
1863 | 1864 |
1864 Defining the C function is not enough to make a Lisp primitive | 1865 Defining the C function is not enough to make a Lisp primitive |
1865 available; you must also create the Lisp symbol for the primitive (the | 1866 available; you must also create the Lisp symbol for the primitive (the |
1866 symbol is @dfn{interned}; @pxref{Obarrays}) and store a suitable subr | 1867 symbol is @dfn{interned}; @pxref{Obarrays}) and store a suitable subr |
1867 object in its function cell. (If you don't do this, the primitive won't | 1868 object in its function cell. (If you don't do this, the primitive won't |
1868 be seen by Lisp code.) The code looks like this: | 1869 be seen by Lisp code.) The code looks like this: |
1869 | 1870 |
1870 @example | 1871 @example |
1871 defsubr (&@var{subr-structure-name}); | 1872 DEFSUBR (@var{fname}); |
1872 @end example | 1873 @end example |
1873 | 1874 |
1874 @noindent | 1875 @noindent |
1875 Here @var{subr-structure-name} is the name you used as the third | 1876 Here @var{fname} is the name you used as the second argument to |
1876 argument to @code{DEFUN}. | 1877 @code{DEFUN}. |
1877 | 1878 |
1878 This call to @code{defsubr} should go in the @code{syms_of_*()} | 1879 This call to @code{DEFSUBR} should go in the @code{syms_of_*()} |
1879 function at the end of the module. If no such function exists, create | 1880 function at the end of the module. If no such function exists, create |
1880 it and make sure to also declare it in @file{symsinit.h} and call it | 1881 it and make sure to also declare it in @file{symsinit.h} and call it |
1881 from the appropriate spot in @code{main()}. @xref{General Coding | 1882 from the appropriate spot in @code{main()}. @xref{General Coding |
1882 Rules}. | 1883 Rules}. |
1883 | 1884 |
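The relationship between the @samp{SF} structure, the @code{F...}
function, and the registration step can be sketched with a toy model.
Everything below is an assumption made for illustration:
@code{TOY_DEFUN}, @code{TOY_DEFSUBR}, @code{struct toy_subr},
@code{syms_of_toy} and @code{toy_lookup} are hypothetical names, the
structure is far simpler than the real @code{Lisp_Subr}, and only
one-argument functions are supported.

```c
#include <stddef.h>
#include <string.h>

typedef long Lisp_Object;      /* assumption: stand-in type */

/* Hypothetical, much-simplified analogue of struct Lisp_Subr. */
struct toy_subr
{
  int min_args;
  int max_args;
  const char *name;                  /* Lisp-visible name */
  Lisp_Object (*fn) (Lisp_Object);   /* pointer to the C function */
};

/* Declare the S... structure, then begin the F... function itself. */
#define TOY_DEFUN(lname, Fname, min, max, arglist)        \
  Lisp_Object Fname arglist;                              \
  struct toy_subr S##Fname = { min, max, lname, Fname };  \
  Lisp_Object Fname arglist

TOY_DEFUN ("toy-double", Ftoy_double, 1, 1, (Lisp_Object n))
{
  return 2 * n;
}

/* A flat table stands in for interned symbols and their function cells. */
#define TOY_MAX_SUBRS 16
static struct toy_subr *toy_function_cells[TOY_MAX_SUBRS];
static int toy_subr_count;

static void
toy_defsubr (struct toy_subr *subr)
{
  if (toy_subr_count < TOY_MAX_SUBRS)
    toy_function_cells[toy_subr_count++] = subr;
}

/* Takes the F... name and derives the structure name from it. */
#define TOY_DEFSUBR(Fname) toy_defsubr (&S##Fname)

void
syms_of_toy (void)
{
  TOY_DEFSUBR (Ftoy_double);
}

/* Find a registered primitive by its Lisp name, or return NULL. */
struct toy_subr *
toy_lookup (const char *name)
{
  int i;

  for (i = 0; i < toy_subr_count; i++)
    if (strcmp (toy_function_cells[i]->name, name) == 0)
      return toy_function_cells[i];
  return NULL;
}
```

In this model, @code{toy_lookup ("toy-double")} finds nothing until
@code{syms_of_toy ()} has run, which mirrors the point made above:
defining the C function alone does not make the primitive visible.
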
1884 Note that C code cannot call functions by name unless they are defined | 1885 Note that C code cannot call functions by name unless they are defined |
1885 in C. The way to call a function written in Lisp is to use | 1886 in C. The way to call a function written in Lisp from C is to use |
1886 @code{Ffuncall}, which embodies the Lisp function @code{funcall}. Since | 1887 @code{Ffuncall}, which embodies the Lisp function @code{funcall}. Since |
1887 the Lisp function @code{funcall} accepts an unlimited number of | 1888 the Lisp function @code{funcall} accepts an unlimited number of |
1888 arguments, in C it takes two: the number of Lisp-level arguments, and a | 1889 arguments, in C it takes two: the number of Lisp-level arguments, and a |
1889 one-dimensional array containing their values. The first Lisp-level | 1890 one-dimensional array containing their values. The first Lisp-level |
1890 argument is the Lisp function to call, and the rest are the arguments to | 1891 argument is the Lisp function to call, and the rest are the arguments to |
2106 @file{alloca.c}, etc. are normally placed past @file{lastfile.c}, and | 2107 @file{alloca.c}, etc. are normally placed past @file{lastfile.c}, and |
2107 all of the files that implement Xt widget classes @emph{must} be placed | 2108 all of the files that implement Xt widget classes @emph{must} be placed |
2108 after @file{lastfile.c} because they contain various structures that | 2109 after @file{lastfile.c} because they contain various structures that |
2109 must be statically initialized and into which Xt writes at various | 2110 must be statically initialized and into which Xt writes at various |
2110 times.) @file{pre-crt0.c} and @file{lastfile.c} contain exported symbols | 2111 times.) @file{pre-crt0.c} and @file{lastfile.c} contain exported symbols |
2111 that are used to determine the start and end of XEmacs's initialized | 2112 that are used to determine the start and end of XEmacs' initialized |
2112 data space when dumping. | 2113 data space when dumping. |
2113 | 2114 |
2114 | 2115 |
2115 | 2116 |
2116 @example | 2117 @example |
3090 43058 chartab.c | 3091 43058 chartab.c |
3091 6503 chartab.h | 3092 6503 chartab.h |
3092 9918 casetab.c | 3093 9918 casetab.c |
3093 @end example | 3094 @end example |
3094 | 3095 |
3095 @file{chartab.c} and @file{chartab.h} implement the char table Lisp | 3096 @file{chartab.c} and @file{chartab.h} implement the @dfn{char table} |
3096 object type, which maps from characters or certain sorts of character | 3097 Lisp object type, which maps from characters or certain sorts of |
3097 ranges to Lisp objects. The implementation of this object is optimized | 3098 character ranges to Lisp objects. The implementation of this object |
3098 for the internal representation of characters. Char tables come in | 3099 type is optimized for the internal representation of characters. Char |
3099 different types, which affect the allowed object types to which a | 3100 tables come in different types, which affect the allowed object types to |
3100 character can be mapped and also dictate certain other properties of the | 3101 which a character can be mapped and also dictate certain other |
3101 char table. | 3102 properties of the char table. |
3102 | 3103 |
3103 @cindex case table | 3104 @cindex case table |
3104 @file{casetab.c} implements one sort of char table, the @dfn{case | 3105 @file{casetab.c} implements one sort of char table, the @dfn{case |
3105 table}, which maps characters to other characters of possibly different | 3106 table}, which maps characters to other characters of possibly different |
3106 case. These are used by XEmacs to implement case-changing primitives | 3107 case. These are used by XEmacs to implement case-changing primitives |
3112 49593 syntax.c | 3113 49593 syntax.c |
3113 10200 syntax.h | 3114 10200 syntax.h |
3114 @end example | 3115 @end example |
3115 | 3116 |
3116 @cindex scanner | 3117 @cindex scanner |
3117 This module implements syntax tables, another sort of char table that | 3118 This module implements @dfn{syntax tables}, another sort of char table |
3118 maps characters into syntax classes that define the syntax of these | 3119 that maps characters into syntax classes that define the syntax of these |
3119 characters (e.g. a parenthesis belongs to a class of @samp{open} characters | 3120 characters (e.g. a parenthesis belongs to a class of @samp{open} |
3120 that have corresponding @samp{close} characters and can be nested). | 3121 characters that have corresponding @samp{close} characters and can be |
3121 This module also implements the Lisp @dfn{scanner}, a set of primitives | 3122 nested). This module also implements the Lisp @dfn{scanner}, a set of |
3122 for scanning over text based on syntax tables. This is used, for | 3123 primitives for scanning over text based on syntax tables. This is used, |
3123 example, to find the matching parenthesis in a command such as | 3124 for example, to find the matching parenthesis in a command such as |
3124 @code{forward-sexp}, and by @file{font-lock.c} to locate quoted strings, | 3125 @code{forward-sexp}, and by @file{font-lock.c} to locate quoted strings, |
3125 comments, etc. | 3126 comments, etc. |
3126 | 3127 |
3127 | 3128 |
3128 | 3129 |
3680 two-dimensional set of characters, such as US ASCII or JISX0208 Japanese | 3681 two-dimensional set of characters, such as US ASCII or JISX0208 Japanese |
3681 Kanji). | 3682 Kanji). |
3682 | 3683 |
3683 @file{mule-coding.*} implements the @dfn{coding-system} Lisp object | 3684 @file{mule-coding.*} implements the @dfn{coding-system} Lisp object |
3684 type, which encapsulates a method of converting between different | 3685 type, which encapsulates a method of converting between different |
3685 encodings. An encoding is a representation of a stream of characters | 3686 encodings. An encoding is a representation of a stream of characters, |
3686 from multiple character sets using a stream of bytes or words and | 3687 possibly from multiple character sets, using a stream of bytes or words, |
3687 defines (e.g.) which escape sequences are used to specify particular | 3688 and defines (e.g.) which escape sequences are used to specify particular |
3688 character sets, how the indices for a character are converted into bytes | 3689 character sets, how the indices for a character are converted into bytes |
3689 (sometimes this involves setting the high bit; sometimes complicated | 3690 (sometimes this involves setting the high bit; sometimes complicated |
3690 rearranging of the values takes place, as in the Shift-JIS encoding), | 3691 rearranging of the values takes place, as in the Shift-JIS encoding), |
3691 etc. | 3692 etc. |
3692 | 3693 |
3694 interpreter. CCL is similar in spirit to Lisp byte code and is used to | 3695 interpreter. CCL is similar in spirit to Lisp byte code and is used to |
3695 implement converters for custom encodings. | 3696 implement converters for custom encodings. |
3696 | 3697 |
3697 @file{mule-canna.c} and @file{mule-wnnfns.c} implement interfaces to | 3698 @file{mule-canna.c} and @file{mule-wnnfns.c} implement interfaces to |
3698 external programs used to implement the Canna and WNN input methods, | 3699 external programs used to implement the Canna and WNN input methods, |
3699 respectively. This is currently broken. | 3700 respectively. This is currently in beta. |
3700 | 3701 |
3701 @file{mule-mcpatch.c} provides some functions to allow for pathnames | 3702 @file{mule-mcpath.c} provides some functions to allow for pathnames |
3702 containing extended characters. This code is fragmentary and completely | 3703 containing extended characters. This code is fragmentary, obsolete, and |
3703 non-working. | 3704 completely non-working. Instead, @var{pathname-coding-system} is used |
3705 to specify conversions of names of files and directories. The standard | |
3706 C I/O functions like @samp{open()} are wrapped so that conversion occurs | |
3707 automatically. | |
3704 | 3708 |
3705 @file{mule.c} provides a few miscellaneous things that should probably | 3709 @file{mule.c} provides a few miscellaneous things that should probably |
3706 be elsewhere. | 3710 be elsewhere. |
3707 | 3711 |
3708 | 3712 |
3777 @itemize @bullet | 3781 @itemize @bullet |
3778 @item | 3782 @item |
3779 (a) Those for whom the value directly represents the contents of the | 3783 (a) Those for whom the value directly represents the contents of the |
3780 Lisp object. Only two types are in this category: integers and | 3784 Lisp object. Only two types are in this category: integers and |
3781 characters. No special allocation or garbage collection is necessary | 3785 characters. No special allocation or garbage collection is necessary |
3782 for such objects. | 3786 for such objects. Lisp objects of these types do not need to be |
3787 @code{GCPRO}ed. | |
3783 @end itemize | 3788 @end itemize |
3784 | 3789 |
3785 In the remaining three categories, the value is a pointer to a | 3790 In the remaining three categories, the value is a pointer to a |
3786 structure. | 3791 structure. |
3787 | 3792 |
3947 Note that @code{obarray} is one of the @code{staticpro()}d things. | 3952 Note that @code{obarray} is one of the @code{staticpro()}d things. |
3948 Therefore, all functions and variables get marked through this. | 3953 Therefore, all functions and variables get marked through this. |
3949 @item | 3954 @item |
3950 Any shadowed bindings that are sitting on the specpdl stack. | 3955 Any shadowed bindings that are sitting on the specpdl stack. |
3951 @item | 3956 @item |
3952 Any objects sitting in currently active stack frames, | 3957 Any objects sitting in currently active (Lisp) stack frames, |
3953 catches, and condition cases. | 3958 catches, and condition cases. |
3954 @item | 3959 @item |
3955 A couple of special-case places where active objects are | 3960 A couple of special-case places where active objects are |
3956 located. | 3961 located. |
3957 @item | 3962 @item |
3996 just a single lvalue. To effect this, call @code{GCPRO@var{n}} as usual on | 4001 just a single lvalue. To effect this, call @code{GCPRO@var{n}} as usual on |
3997 the first object in the array and then set @code{gcpro@var{n}.nvars}. | 4002 the first object in the array and then set @code{gcpro@var{n}.nvars}. |
3998 | 4003 |
3999 @item | 4004 @item |
4000 @strong{Strings are relocated.} What this means in practice is that the | 4005 @strong{Strings are relocated.} What this means in practice is that the |
4001 pointer obtained using @code{string_data()} is liable to change at any | 4006 pointer obtained using @code{XSTRING_DATA()} is liable to change at any |
4002 time, and you should never keep it around past any function call, or | 4007 time, and you should never keep it around past any function call, or |
4003 pass it as an argument to any function that might cause a garbage | 4008 pass it as an argument to any function that might cause a garbage |
4004 collection. This is why a number of functions accept either a | 4009 collection. This is why a number of functions accept either a |
4005 ``non-relocatable'' @code{char *} pointer or a relocatable Lisp string, | 4010 ``non-relocatable'' @code{char *} pointer or a relocatable Lisp string, |
4006 and only access the Lisp string's data at the very last minute. In some | 4011 and only access the Lisp string's data at the very last minute. In some |
4037 If you have the @emph{least smidgeon of doubt} about whether | 4042 If you have the @emph{least smidgeon of doubt} about whether |
4038 you need to @code{GCPRO}, you should @code{GCPRO}. | 4043 you need to @code{GCPRO}, you should @code{GCPRO}. |
4039 | 4044 |
4040 @item | 4045 @item |
4041 Beware of @code{GCPRO}ing something that is uninitialized. If you have | 4046 Beware of @code{GCPRO}ing something that is uninitialized. If you have |
4042 any shade of doubt about this, initialize all your variables to Qnil. | 4047 any shade of doubt about this, initialize all your variables to @code{Qnil}. |
4043 | 4048 |
4044 @item | 4049 @item |
4045 Be careful of traps, like calling @code{Fcons()} in the argument to | 4050 Be careful of traps, like calling @code{Fcons()} in the argument to |
4046 another function. By the ``caller protects'' law, you should be | 4051 another function. By the ``caller protects'' law, you should be |
4047 @code{GCPRO}ing the newly-created cons, but you aren't. A certain | 4052 @code{GCPRO}ing the newly-created cons, but you aren't. A certain |
4446 @section Vector | 4451 @section Vector |
4447 | 4452 |
4448 As mentioned above, each vector is @code{malloc()}ed individually, and | 4453 As mentioned above, each vector is @code{malloc()}ed individually, and |
4449 all are threaded through the variable @code{all_vectors}. Vectors are | 4454 all are threaded through the variable @code{all_vectors}. Vectors are |
4450 marked strangely during garbage collection, by kludging the size field. | 4455 marked strangely during garbage collection, by kludging the size field. |
4451 Note that the @code{struct Lisp_Vector} is declared with its contents | 4456 Note that the @code{struct Lisp_Vector} is declared with its |
4452 being an array of one element. It is actually @code{malloc()}ed with | 4457 @code{contents} field being a @emph{stretchy} array of one element. It |
4453 the right size, however, and access to any element through the contents | 4458 is actually @code{malloc()}ed with the right size, however, and access |
4454 array works fine. | 4459 to any element through the @code{contents} array works fine. |
4455 | 4460 |
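The one-element ``stretchy'' array idiom can be sketched as follows.
This is a minimal model, not the real @code{struct Lisp_Vector}: the
type @code{struct toy_vector}, the list head @code{toy_all_vectors} and
the function @code{make_toy_vector} are hypothetical, and
@code{Lisp_Object} is modeled as a @code{long}. (C99 later added
flexible array members, written @code{contents[]}, for the same
purpose.)

```c
#include <stddef.h>
#include <stdlib.h>

typedef long Lisp_Object;      /* assumption: stand-in type */

struct toy_vector
{
  long size;
  struct toy_vector *next;     /* threads all vectors together */
  Lisp_Object contents[1];     /* really `size' elements long */
};

/* Head of the chain of every vector allocated so far. */
struct toy_vector *toy_all_vectors;

/* Allocate a vector of SIZE elements, each set to INIT.
   Returns NULL if malloc() fails. */
struct toy_vector *
make_toy_vector (long size, Lisp_Object init)
{
  /* The struct already has room for one element; add the rest. */
  size_t extra = size > 1 ? (size_t) (size - 1) : 0;
  struct toy_vector *v =
    malloc (sizeof (struct toy_vector) + extra * sizeof (Lisp_Object));
  long i;

  if (v == NULL)
    return NULL;
  v->size = size;
  for (i = 0; i < size; i++)
    v->contents[i] = init;     /* indexing past [0] reaches the extra space */
  v->next = toy_all_vectors;
  toy_all_vectors = v;
  return v;
}
```
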
4456 @node Bit Vector | 4461 @node Bit Vector |
4457 @section Bit Vector | 4462 @section Bit Vector |
4458 | 4463 |
4459 Bit vectors work exactly like vectors, except for more complicated | 4464 Bit vectors work exactly like vectors, except for more complicated |
4932 @code{command_event_queue}. There is a comment about a ``race | 4937 @code{command_event_queue}. There is a comment about a ``race |
4933 condition'', which is not a good sign. | 4938 condition'', which is not a good sign. |
4934 | 4939 |
4935 @code{next-command-event} and @code{read-char} are higher-level | 4940 @code{next-command-event} and @code{read-char} are higher-level |
4936 interfaces to @code{next-event}. @code{next-command-event} gets the | 4941 interfaces to @code{next-event}. @code{next-command-event} gets the |
4937 next @dfn{command} event (i.e. keypress, mouse event, or menu | 4942 next @dfn{command} event (i.e. keypress, mouse event, menu selection, |
4938 selection), calling dispatch-event on any others. @code{read-char} | 4943 or scrollbar action), calling @code{dispatch-event} on any others. |
4939 calls @code{next-command-event} and uses @code{event_to_character()} to | 4944 @code{read-char} calls @code{next-command-event} and uses |
4940 return the ASCII equivalent. | 4945 @code{event_to_character()} to return the character equivalent. With |
4946 the right kind of input method support, it is possible for @code{(read-char)} | |
4947 to return a Kanji character. | |
4941 | 4948 |
4942 @node Converting Events | 4949 @node Converting Events |
4943 @section Converting Events | 4950 @section Converting Events |
4944 | 4951 |
4945 @code{character_to_event()}, @code{event_to_character()}, | 4952 @code{character_to_event()}, @code{event_to_character()}, |
4946 @code{event-to-character}, and @code{character-to-event} convert between | 4953 @code{event-to-character}, and @code{character-to-event} convert between |
4947 ASCII characters and keypresses corresponding to the characters. If the | 4954 characters and keypress events corresponding to the characters. If the |
4948 event was not a keypress, @code{event_to_character()} returns -1 and | 4955 event was not a keypress, @code{event_to_character()} returns -1 and |
4949 @code{event-to-character} returns @code{nil}. These functions convert | 4956 @code{event-to-character} returns @code{nil}. These functions convert |
4950 between ASCII representation and the split-up event representation | 4957 between character representation and the split-up event representation |
4951 (keysym plus mod keys). | 4958 (keysym plus mod keys). |
4952 | 4959 |
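The idea of converting between a character and a split-up keypress
(keysym plus modifier keys) might be modeled as below. This is only a
sketch under stated assumptions: the event layout, the single
@code{TOY_MOD_CONTROL} modifier, and the ASCII control-character mapping
are all invented for illustration and do not reproduce the real
conversion rules. The one behavior taken from the text is that a
non-keypress event converts to -1.

```c
/* Hypothetical event model; real XEmacs events are Lisp objects. */
enum toy_event_type { TOY_KEY_PRESS, TOY_BUTTON_PRESS };

#define TOY_MOD_CONTROL 1      /* assumption: the only modifier modeled */

struct toy_event
{
  enum toy_event_type type;
  int keysym;                  /* for key presses: the base character */
  int modifiers;               /* bit mask of TOY_MOD_* values */
};

/* Return the character for a keypress, or -1 for any other event. */
int
toy_event_to_character (const struct toy_event *e)
{
  if (e->type != TOY_KEY_PRESS)
    return -1;
  if (e->modifiers & TOY_MOD_CONTROL)
    return e->keysym & 0x1f;   /* ASCII control code, e.g. C-a is 1 */
  return e->keysym;
}

/* Split a character into a keysym plus modifier bits. */
struct toy_event
toy_character_to_event (int c)
{
  struct toy_event e;

  e.type = TOY_KEY_PRESS;
  if (c >= 0 && c < 0x20)
    {
      e.keysym = c + 0x60;     /* assumption: 1 maps to `a' plus control */
      e.modifiers = TOY_MOD_CONTROL;
    }
  else
    {
      e.keysym = c;
      e.modifiers = 0;
    }
  return e;
}
```
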
4953 @node Dispatching Events; The Command Builder | 4960 @node Dispatching Events; The Command Builder |
4954 @section Dispatching Events; The Command Builder | 4961 @section Dispatching Events; The Command Builder |
4955 | 4962 |
4991 the backtrace structure is changed). | 4998 the backtrace structure is changed). |
4992 | 4999 |
4993 At this point, the function to be called is determined by looking at | 5000 At this point, the function to be called is determined by looking at |
4994 the car of the cons (if this is a symbol, its function definition is | 5001 the car of the cons (if this is a symbol, its function definition is |
4995 retrieved and the process repeated). The function should then consist | 5002 retrieved and the process repeated). The function should then consist |
4996 of either a Lisp_Subr (built-in function), a Lisp_Compiled object, or a | 5003 of either a @code{Lisp_Subr} (built-in function), a |
4997 cons whose car is the symbol @code{autoload}, @code{macro}, | 5004 @code{Lisp_Compiled_Function} object, or a cons whose car is the symbol |
4998 @code{lambda}, or @code{mocklisp}. | 5005 @code{autoload}, @code{macro}, @code{lambda}, or @code{mocklisp}. |
4999 | 5006 |
5000 If the function is a Lisp_Subr, the Lisp object points to a struct | 5007 If the function is a @code{Lisp_Subr}, the Lisp object points to a |
5001 Lisp_Subr (created by @code{DEFUN()}), which contains a pointer to the C | 5008 @code{struct Lisp_Subr} (created by @code{DEFUN()}), which contains a |
5002 function, a minimum and maximum number of arguments (possibly the | 5009 pointer to the C function, a minimum and maximum number of arguments |
5003 special constants @code{MANY} or @code{UNEVALLED}), a pointer to the | 5010 (possibly the special constants @code{MANY} or @code{UNEVALLED}), a |
5004 symbol referring to that subr, and a couple of other things. If the | 5011 pointer to the symbol referring to that subr, and a couple of other |
5005 subr wants its arguments @code{UNEVALLED}, they are passed raw as a | 5012 things. If the subr wants its arguments @code{UNEVALLED}, they are |
5006 list. Otherwise, an array of evaluated arguments is created and put | 5013 passed raw as a list. Otherwise, an array of evaluated arguments is |
5007 into the backtrace structure, and either passed whole (@code{MANY}) or | 5014 created and put into the backtrace structure, and either passed whole |
5008 each argument is passed as a C argument. | 5015 (@code{MANY}) or each argument is passed as a C argument. |
5009 | 5016 |
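The difference between passing the argument array whole (@code{MANY})
and passing each argument as a separate C argument can be sketched with
a switch on the argument count. The names here
(@code{toy_primitive_funcall}, @code{struct toy_subr}, @code{TOY_MANY},
@code{TOY_NIL}) are hypothetical, the model stops at two fixed
arguments, @code{UNEVALLED} is not modeled, and filling omitted optional
arguments with a nil stand-in is an assumption of this sketch.

```c
typedef long Lisp_Object;      /* assumption: stand-in type */

#define TOY_MANY (-1)          /* hypothetical marker for "no upper limit" */
#define TOY_NIL 0              /* hypothetical stand-in for nil */
#define TOY_MAX_FIXED_ARGS 2

struct toy_subr
{
  int min_args, max_args;
  union
  {
    Lisp_Object (*a0) (void);
    Lisp_Object (*a1) (Lisp_Object);
    Lisp_Object (*a2) (Lisp_Object, Lisp_Object);
    Lisp_Object (*many) (int, Lisp_Object *);
  } fn;
};

/* Call SUBR on NARGS evaluated arguments.  Stores the value in
   *RESULT and returns 0, or returns -1 if the count is unacceptable. */
int
toy_primitive_funcall (const struct toy_subr *subr, int nargs,
                       Lisp_Object *args, Lisp_Object *result)
{
  Lisp_Object padded[TOY_MAX_FIXED_ARGS];
  int i;

  if (nargs < subr->min_args)
    return -1;
  if (subr->max_args == TOY_MANY)
    {
      *result = subr->fn.many (nargs, args);   /* array passed whole */
      return 0;
    }
  if (subr->max_args > TOY_MAX_FIXED_ARGS || nargs > subr->max_args)
    return -1;

  /* Omitted optional arguments become TOY_NIL. */
  for (i = 0; i < TOY_MAX_FIXED_ARGS; i++)
    padded[i] = i < nargs ? args[i] : TOY_NIL;

  switch (subr->max_args)      /* one clause per supported arity */
    {
    case 0: *result = subr->fn.a0 (); return 0;
    case 1: *result = subr->fn.a1 (padded[0]); return 0;
    case 2: *result = subr->fn.a2 (padded[0], padded[1]); return 0;
    default: return -1;
    }
}

/* Two sample primitives for trying the dispatcher. */
static Lisp_Object
toy_minus (Lisp_Object a, Lisp_Object b)
{
  return a - b;
}

static Lisp_Object
toy_sum (int nargs, Lisp_Object *args)
{
  Lisp_Object total = 0;
  int i;

  for (i = 0; i < nargs; i++)
    total += args[i];
  return total;
}

const struct toy_subr toy_minus_subr = { 1, 2, { .a2 = toy_minus } };
const struct toy_subr toy_sum_subr = { 0, TOY_MANY, { .many = toy_sum } };
```

Supporting one more fixed argument in such a scheme means adding both a
function-pointer type and a clause to the switch.
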
5010 If the function is a Lisp_Compiled object or a lambda, | 5017 If the function is a @code{Lisp_Compiled_Function} object or a lambda, |
5011 @code{apply_lambda()} is called. If the function is a macro, | 5018 @code{apply_lambda()} is called. If the function is a macro, |
5012 [..... fill in] is done. If the function is an autoload, | 5019 [..... fill in] is done. If the function is an autoload, |
5013 @code{do_autoload()} is called to load the definition and then eval | 5020 @code{do_autoload()} is called to load the definition and then eval |
5014 starts over [explain this more]. If the function is a mocklisp, | 5021 starts over [explain this more]. If the function is a mocklisp, |
5015 @code{ml_apply()} is called. | 5022 @code{ml_apply()} is called. |
5025 | 5032 |
5026 @code{funcall_lambda()} goes through the formal arguments to the | 5033 @code{funcall_lambda()} goes through the formal arguments to the |
5027 function and binds them to the actual arguments, checking for | 5034 function and binds them to the actual arguments, checking for |
5028 @code{&rest} and @code{&optional} symbols in the formal arguments and | 5035 @code{&rest} and @code{&optional} symbols in the formal arguments and |
5029 making sure the number of actual arguments is correct. Then either | 5036 making sure the number of actual arguments is correct. Then either |
5030 progn or byte-code is called to actually execute the body and return a | 5037 @code{progn} or @code{byte-code} is called to actually execute the body |
5031 value. | 5038 and return a value. |
5032 | 5039 |
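The argument-count check implied by @code{&optional} and @code{&rest}
can be sketched in isolation. The function @code{toy_arity_ok} is
hypothetical; it models the formal argument list as an array of C
strings and covers only the counting, not the actual binding that
@code{funcall_lambda()} performs.

```c
#include <string.h>

/* Return 1 if NACTUAL arguments are acceptable for the formal
   argument list FORMALS (an array of NFORMALS names), else 0. */
int
toy_arity_ok (const char *const *formals, int nformals, int nactual)
{
  int required = 0, optional = 0, rest = 0;
  int in_optional = 0;
  int i;

  for (i = 0; i < nformals; i++)
    {
      if (strcmp (formals[i], "&optional") == 0)
        in_optional = 1;
      else if (strcmp (formals[i], "&rest") == 0)
        {
          rest = 1;            /* the next formal takes all that remain */
          break;
        }
      else if (in_optional)
        optional++;
      else
        required++;
    }

  if (nactual < required)
    return 0;                  /* too few arguments */
  if (!rest && nactual > required + optional)
    return 0;                  /* too many arguments */
  return 1;
}
```
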
5033 @code{Ffuncall()} implements Lisp @code{funcall}. @code{(funcall fun | 5040 @code{Ffuncall()} implements Lisp @code{funcall}. @code{(funcall fun |
5034 x1 x2 x3 ...)} is equivalent to @code{(eval (list fun (quote x1) (quote | 5041 x1 x2 x3 ...)} is equivalent to @code{(eval (list fun (quote x1) (quote |
5035 x2) (quote x3) ...))}. @code{Ffuncall()} contains its own code to do | 5042 x2) (quote x3) ...))}. @code{Ffuncall()} contains its own code to do |
5036 the evaluation, however, and is almost identical to @code{eval}. | 5043 the evaluation, however, and is almost identical to @code{eval}. |
5074 specpdl array, and @code{specpdl_size} is increased by 1. | 5081 specpdl array, and @code{specpdl_size} is increased by 1. |
5075 | 5082 |
5076 @code{record_unwind_protect()} implements an @dfn{unwind-protect}, | 5083 @code{record_unwind_protect()} implements an @dfn{unwind-protect}, |
5077 which, when placed around a section of code, ensures that some specified | 5084 which, when placed around a section of code, ensures that some specified |
5078 cleanup routine will be executed even if the code exits abnormally | 5085 cleanup routine will be executed even if the code exits abnormally |
5079 (e.g. through a throw or quit). @code{record_unwind_protect()} simply | 5086 (e.g. through a @code{throw} or quit). @code{record_unwind_protect()} |
5080 adds a new specbinding to the specpdl array and stores the appropriate | 5087 simply adds a new specbinding to the specpdl array and stores the |
5081 information in it. The cleanup routine can either be a C function, | 5088 appropriate information in it. The cleanup routine can either be a C |
5082 which is stored in the @code{func} field, or a progn form, which is stored in | 5089 function, which is stored in the @code{func} field, or a @code{progn} |
5083 the @code{old_value} field. | 5090 form, which is stored in the @code{old_value} field. |
5084 | 5091 |
5085 @code{unbind_to()} removes specbindings from the specpdl array until | 5092 @code{unbind_to()} removes specbindings from the specpdl array until |
5086 the specified position is reached. The specbinding can be one of three | 5093 the specified position is reached. Each specbinding can be one of three |
5087 types: | 5094 types: |
5088 | 5095 |
5089 @enumerate | 5096 @enumerate |
5090 @item | 5097 @item |
5091 an unwind-protect with a C cleanup function (@code{func} is not 0 -- | 5098 an unwind-protect with a C cleanup function (@code{func} is not 0, and |
5092 @code{old_value} holds an argument to be passed to the function); | 5099 @code{old_value} holds an argument to be passed to the function); |
5093 @item | 5100 @item |
5094 an unwind-protect with a Lisp form (@code{func} is 0 and @code{symbol} | 5101 an unwind-protect with a Lisp form (@code{func} is 0, @code{symbol} |
5095 is @code{nil} -- @code{old_value} holds the form to be executed with | 5102 is @code{nil}, and @code{old_value} holds the form to be executed with |
5096 @code{Fprogn()}); or | 5103 @code{Fprogn()}); or |
5097 @item | 5104 @item |
5098 a local-variable binding (@code{func} is 0 and @code{symbol} is not | 5105 a local-variable binding (@code{func} is 0, @code{symbol} is not |
5099 @code{nil} -- @code{old_value} holds the old value, which is stored as | 5106 @code{nil}, and @code{old_value} holds the old value, which is stored as |
5100 the symbol's value). | 5107 the symbol's value). |
5101 @end enumerate | 5108 @end enumerate |
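
The three cases above can be sketched in C. This is a toy model, not the code in the XEmacs sources: `Lisp_Object` is a plain `long`, symbols are small array indices, and every `toy_` name is hypothetical. Only the `symbol`, `old_value` and `func` fields and the order of the tests come from the text.

```c
/* Toy stand-ins: the real Lisp_Object and specbinding layouts differ. */
typedef long Lisp_Object;
#define Qnil ((Lisp_Object) 0)

struct specbinding
{
  Lisp_Object symbol;                 /* Qnil unless a variable binding */
  Lisp_Object old_value;              /* argument, form, or saved value */
  Lisp_Object (*func) (Lisp_Object);  /* non-0: C cleanup function */
};

#define TOY_SPECPDL_MAX 16
#define TOY_NSYMBOLS 8

static struct specbinding toy_specpdl[TOY_SPECPDL_MAX];
static int toy_depth;                             /* entries in use */
static Lisp_Object toy_symbol_value[TOY_NSYMBOLS];
static Lisp_Object toy_last_cleanup_arg;
static int toy_forms_run;

/* Example C cleanup routine: just remembers its argument. */
static Lisp_Object
toy_note_cleanup (Lisp_Object arg)
{
  toy_last_cleanup_arg = arg;
  return Qnil;
}

/* Stand-in for Fprogn(): counts calls instead of evaluating FORM. */
static Lisp_Object
toy_progn (Lisp_Object form)
{
  (void) form;
  toy_forms_run++;
  return Qnil;
}

/* Push one entry.  No overflow check, to keep the sketch short. */
static void
toy_push (Lisp_Object symbol, Lisp_Object old_value,
          Lisp_Object (*func) (Lisp_Object))
{
  struct specbinding *b = &toy_specpdl[toy_depth++];
  b->symbol = symbol;
  b->old_value = old_value;
  b->func = func;
}

/* Bind SYMBOL (a nonzero index) to VALUE, saving the old value. */
static void
toy_specbind (Lisp_Object symbol, Lisp_Object value)
{
  toy_push (symbol, toy_symbol_value[symbol], 0);
  toy_symbol_value[symbol] = value;
}

/* Pop entries until only COUNT remain, dispatching on the entry type. */
static void
toy_unbind_to (int count)
{
  while (toy_depth > count)
    {
      struct specbinding *b = &toy_specpdl[--toy_depth];
      if (b->func != 0)
        b->func (b->old_value);                      /* C cleanup */
      else if (b->symbol == Qnil)
        toy_progn (b->old_value);                    /* Lisp form */
      else
        toy_symbol_value[b->symbol] = b->old_value;  /* restore value */
    }
}
```

A caller would record the depth before pushing entries and pass that depth to `toy_unbind_to()` afterwards, so that entries are undone in reverse order of creation.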
5102 | 5109 |
5103 @node Simple Special Forms | 5110 @node Simple Special Forms |
5104 @section Simple Special Forms | 5111 @section Simple Special Forms |
5255 | 5262 |
5256 Usually symbols are created by @code{intern}, but if you really want, | 5263 Usually symbols are created by @code{intern}, but if you really want, |
5257 you can explicitly create a symbol using @code{make-symbol}, giving it | 5264 you can explicitly create a symbol using @code{make-symbol}, giving it |
5258 some name. The resulting symbol is not in any obarray (i.e. it is | 5265 some name. The resulting symbol is not in any obarray (i.e. it is |
5259 @dfn{uninterned}), and you can't add it to any obarray. Therefore its | 5266 @dfn{uninterned}), and you can't add it to any obarray. Therefore its |
5260 primary purpose is as a carrier of information. (Cons cells could | 5267 primary purpose is as a symbol to use in macros to avoid namespace |
5261 probably be used just as well.) | 5268 pollution. It can also be used as a carrier of information, but cons |
5269 cells could probably be used just as well. | |
5262 | 5270 |
5263 You can also use @code{intern-soft} to look up a symbol but not create | 5271 You can also use @code{intern-soft} to look up a symbol but not create |
5264 a new one, and @code{unintern} to remove a symbol from an obarray. This | 5272 a new one, and @code{unintern} to remove a symbol from an obarray. This |
5265 returns the removed symbol. (Remember: You can't put the symbol back | 5273 returns the removed symbol. (Remember: You can't put the symbol back |
5266 into any obarray.) Finally, @code{mapatoms} maps over all of the symbols | 5274 into any obarray.) Finally, @code{mapatoms} maps over all of the symbols |
5328 In this, it is like a string, but a buffer is optimized for | 5336 In this, it is like a string, but a buffer is optimized for |
5329 frequent insertion and deletion, while a string is not. Furthermore: | 5337 frequent insertion and deletion, while a string is not. Furthermore: |
5330 | 5338 |
5331 @enumerate | 5339 @enumerate |
5332 @item | 5340 @item |
5333 Buffers are @dfn{permanent} objects, i.e. one you create them, they | 5341 Buffers are @dfn{permanent} objects, i.e. once you create them, they |
5334 remain around, and need to be explicitly deleted before they go away. | 5342 remain around, and need to be explicitly deleted before they go away. |
5335 @item | 5343 @item |
5336 Each buffer has a unique name, which is a string. Buffers are | 5344 Each buffer has a unique name, which is a string. Buffers are |
5337 normally referred to by name. In this respect, they are like | 5345 normally referred to by name. In this respect, they are like |
5338 symbols. | 5346 symbols. |
5362 can temporarily change the current buffer using @code{set-buffer} (often | 5370 can temporarily change the current buffer using @code{set-buffer} (often |
5363 enclosed in a @code{save-excursion} so that the former current buffer | 5371 enclosed in a @code{save-excursion} so that the former current buffer |
5364 gets restored when the code is finished). However, calling | 5372 gets restored when the code is finished). However, calling |
5365 @code{set-buffer} will NOT cause a permanent change in the current | 5373 @code{set-buffer} will NOT cause a permanent change in the current |
5366 buffer. The reason for this is that the top-level event loop sets | 5374 buffer. The reason for this is that the top-level event loop sets |
5367 current buffer to the buffer of the selected window, each time it | 5375 @code{current_buffer} to the buffer of the selected window, each time |
5368 finishes executing a user command. | 5376 it finishes executing a user command. |
5369 @end enumerate | 5377 @end enumerate |
5370 | 5378 |
5371 Make sure you understand the distinction between @dfn{current buffer} | 5379 Make sure you understand the distinction between @dfn{current buffer} |
5372 and @dfn{buffer of the selected window}, and the distinction between | 5380 and @dfn{buffer of the selected window}, and the distinction between |
5373 @dfn{point} of the current buffer and @dfn{window-point} of the selected | 5381 @dfn{point} of the current buffer and @dfn{window-point} of the selected |
5386 etc.), Cyrillic and Greek letters, etc. The actual number of possible | 5394 etc.), Cyrillic and Greek letters, etc. The actual number of possible |
5387 characters is quite large. | 5395 characters is quite large. |
5388 | 5396 |
5389 For now, we can view a character as some non-negative integer that | 5397 For now, we can view a character as some non-negative integer that |
5390 has some shape that defines how it typically appears (e.g. as an | 5398 has some shape that defines how it typically appears (e.g. as an |
5391 uppercase A). (The exact way in which a character appears depends | 5399 uppercase A). (The exact way in which a character appears depends on the |
5392 on the font of the character.) The internal type of characters in | 5400 font used to display the character.) The internal type of characters in |
5393 the C code is an Emchar; this is just an int, but using a symbolic | 5401 the C code is an @code{Emchar}; this is just an @code{int}, but using a |
5394 type makes the code clearer. | 5402 symbolic type makes the code clearer. |
5395 | 5403 |
5396 Between every character in a buffer is a @dfn{buffer position} or | 5404 Between every character in a buffer is a @dfn{buffer position} or |
5397 @dfn{character position}. We can speak of the character before or after | 5405 @dfn{character position}. We can speak of the character before or after |
5398 a particular buffer position, and when you insert a character at a | 5406 a particular buffer position, and when you insert a character at a |
5399 particular position, all characters after that position end up at new | 5407 particular position, all characters after that position end up at new |
5445 characters back again). Once the buffer is killed, the memory allocated | 5453 characters back again). Once the buffer is killed, the memory allocated |
5446 for the buffer text will be freed, but it will still be sitting on the | 5454 for the buffer text will be freed, but it will still be sitting on the |
5447 heap, taking up virtual memory, and will not be released back to the | 5455 heap, taking up virtual memory, and will not be released back to the |
5448 operating system. (However, if you have compiled XEmacs with rel-alloc, | 5456 operating system. (However, if you have compiled XEmacs with rel-alloc, |
5449 the situation is different. In this case, the space @emph{will} be | 5457 the situation is different. In this case, the space @emph{will} be |
5450 released back to the operating system. However, this tends to effect a | 5458 released back to the operating system. However, this tends to result in a |
5451 noticeable speed penalty.) | 5459 noticeable speed penalty.) |
5452 | 5460 |
5453 Astute readers may notice that the text in a buffer is represented as | 5461 Astute readers may notice that the text in a buffer is represented as |
5454 an array of @emph{bytes}, while (at least in the MULE case) an Emchar is | 5462 an array of @emph{bytes}, while (at least in the MULE case) an Emchar is |
5455 a 19-bit integer, which clearly cannot fit in a byte. This means (of | 5463 a 19-bit integer, which clearly cannot fit in a byte. This means (of |
5498 @dfn{byte indices}, typedef @code{Bytind} | 5506 @dfn{byte indices}, typedef @code{Bytind} |
5499 @item | 5507 @item |
5500 @dfn{memory indices}, typedef @code{Memind} | 5508 @dfn{memory indices}, typedef @code{Memind} |
5501 @end enumerate | 5509 @end enumerate |
5502 | 5510 |
5503 All three typedefs are just ints, but defining them this way makes | 5511 All three typedefs are just @code{int}s, but defining them this way makes |
5504 things a lot clearer. | 5512 things a lot clearer. |
5505 | 5513 |
5506 Most code works with buffer positions. In particular, all Lisp code | 5514 Most code works with buffer positions. In particular, all Lisp code |
5507 that refers to text in a buffer uses buffer positions. Lisp code does | 5515 that refers to text in a buffer uses buffer positions. Lisp code does |
5508 not know that byte indices or memory indices exist. | 5516 not know that byte indices or memory indices exist. |
5509 | 5517 |
5510 Finally, we have a typedef for the bytes in a buffer. This is a | 5518 Finally, we have a typedef for the bytes in a buffer. This is a |
5511 @code{Bufbyte}, which is an unsigned char. Referring to them as | 5519 @code{Bufbyte}, which is an unsigned char. Referring to them as |
5512 Bufbytes underscores the fact that we are working with a string of bytes | 5520 Bufbytes underscores the fact that we are working with a string of bytes |
5513 in the internal Emacs buffer representation rather than in one of a | 5521 in the internal Emacs buffer representation rather than in one of a |
5514 number of possible alternative representations (e.g. EUC-coded text, | 5522 number of possible alternative representations (e.g. EUC-encoded text, |
5515 etc.). | 5523 etc.). |
5516 | 5524 |
5517 @node Buffer Lists | 5525 @node Buffer Lists |
5518 @section Buffer Lists | 5526 @section Buffer Lists |
5519 | 5527 |
5823 @end menu | 5831 @end menu |
5824 | 5832 |
5825 @node Japanese EUC (Extended Unix Code) | 5833 @node Japanese EUC (Extended Unix Code) |
5826 @subsection Japanese EUC (Extended Unix Code) | 5834 @subsection Japanese EUC (Extended Unix Code) |
5827 | 5835 |
5828 This encompasses the character sets Printing-ASCII, Japanese (aka | 5836 This encompasses the character sets Printing-ASCII, Japanese-JISX0208, |
5829 JISX0208), and Japanese-Kana (half-width katakana, the right half of | 5837 and Japanese-JISX0201-Kana (half-width katakana, the right half of |
5830 JISX0201). It uses 8-bit bytes. | 5838 JISX0201). It uses 8-bit bytes. |
5831 | 5839 |
5832 Note that Printing-ASCII and Japanese-Kana are 94-character charsets, | 5840 Note that Printing-ASCII and Japanese-JISX0201-Kana are 94-character |
5833 while Japanese is a 94x94-character charset. | 5841 charsets, while Japanese-JISX0208 is a 94x94-character charset. |
5834 | 5842 |
5835 The encoding is as follows: | 5843 The encoding is as follows: |
5836 | 5844 |
5837 @example | 5845 @example |
5838 Character set Representation (PC=position-code) | 5846 Character set Representation (PC=position-code) |
5839 ------------- -------------- | 5847 ------------- -------------- |
5840 Printing-ASCII PC1 | 5848 Printing-ASCII PC1 |
5841 Japanese PC1 + 0x80 | PC2 + 0x80 | 5849 Japanese-JISX0201-Kana 0x8E | PC1 + 0x80 |
5842 Japanese-Kana 0x8E | PC1 + 0x80 | 5850 Japanese-JISX0208 PC1 + 0x80 | PC2 + 0x80 |
| 5851 Japanese-JISX0212 0x8F | PC1 + 0x80 | PC2 + 0x80 |
5843 @end example | 5852 @end example |
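
As an illustration of the table, the following C sketch encodes one character into Japanese EUC. The enum and the function name `euc_encode_char` are hypothetical, chosen for this example only. Only the Printing-ASCII, Japanese-JISX0201-Kana and Japanese-JISX0208 rows are covered, and the Japanese-JISX0212 row is left out.

```c
#include <stddef.h>

/* Hypothetical charset tags for this sketch. */
enum euc_charset { EUC_ASCII, EUC_JISX0201_KANA, EUC_JISX0208 };

/* Write the EUC bytes for one character into OUT, which must have
   room for 2 bytes, and return the number of bytes written.
   PC1 and PC2 are position codes; PC2 is ignored for the
   94-character charsets.  */
static size_t
euc_encode_char (enum euc_charset cs, unsigned char pc1,
                 unsigned char pc2, unsigned char *out)
{
  switch (cs)
    {
    case EUC_ASCII:
      out[0] = pc1;                           /* PC1 */
      return 1;
    case EUC_JISX0201_KANA:
      out[0] = 0x8E;                          /* 0x8E | PC1 + 0x80 */
      out[1] = (unsigned char) (pc1 + 0x80);
      return 2;
    case EUC_JISX0208:
      out[0] = (unsigned char) (pc1 + 0x80);  /* PC1 + 0x80 | PC2 + 0x80 */
      out[1] = (unsigned char) (pc2 + 0x80);
      return 2;
    }
  return 0;                                   /* unknown charset */
}
```

Because every non-ASCII byte has its high bit set, a decoder can tell the charset of a character from its first byte alone, which is what makes this encoding non-modal.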
5844 | 5853 |
5845 | 5854 |
5846 @node JIS7 | 5855 @node JIS7 |
5847 @subsection JIS7 | 5856 @subsection JIS7 |
5848 | 5857 |
5849 This encompasses the character sets Printing-ASCII, | 5858 This encompasses the character sets Printing-ASCII, |
5850 Japanese-Roman (the left half of JISX0201; this character | 5859 Japanese-JISX0201-Roman (the left half of JISX0201; this character set |
5851 set is very similar to Printing-ASCII and is a 94-character | 5860 is very similar to Printing-ASCII and is a 94-character charset), |
5852 charset), Japanese, and Japanese-Kana. It uses 7-bit bytes. | 5861 Japanese-JISX0208, and Japanese-JISX0201-Kana. It uses 7-bit bytes. |
5853 | 5862 |
5854 Unlike Japanese EUC, this is a @dfn{modal} encoding, which | 5863 Unlike Japanese EUC, this is a @dfn{modal} encoding, which |
5855 means that there are multiple states that the encoding can | 5864 means that there are multiple states that the encoding can |
5856 be in, which affect how the bytes are to be interpreted. | 5865 be in, which affect how the bytes are to be interpreted. |
5857 Special sequences of bytes (called @dfn{escape sequences}) | 5866 Special sequences of bytes (called @dfn{escape sequences}) |
5858 are used to change states. | 5867 are used to change states. |
5859 | 5868 |
5860 The encoding is as follows: | 5869 The encoding is as follows: |
5861 | 5870 |
5862 @example | 5871 @example |
5863 Character set Representation (PC=position-code) | 5872 Character set Representation (PC=position-code) |
5864 ------------- -------------- | 5873 ------------- -------------- |
5865 Printing-ASCII PC1 | 5874 Printing-ASCII PC1 |
5866 Japanese-Roman PC1 | 5875 Japanese-JISX0201-Roman PC1 |
5867 Japanese PC1 PC2 | 5876 Japanese-JISX0201-Kana PC1 |
5868 Japanese-Kana PC1 | 5877 Japanese-JISX0208 PC1 PC2 |
5869 | 5878 |
5870 | 5879 |
5871 Escape sequence ASCII equivalent Meaning | 5880 Escape sequence ASCII equivalent Meaning |
5872 --------------- ---------------- ------- | 5881 --------------- ---------------- ------- |
5873 0x1B 0x28 0x4A ESC ( J invoke Japanese-Roman | 5882 0x1B 0x28 0x4A ESC ( J invoke Japanese-JISX0201-Roman |
5874 0x1B 0x24 0x42 ESC $ B invoke Japanese | 5883 0x1B 0x28 0x49 ESC ( I invoke Japanese-JISX0201-Kana |
5875 0x1B 0x28 0x49 ESC ( I invoke Japanese-Kana | 5884 0x1B 0x24 0x42 ESC $ B invoke Japanese-JISX0208 |
5876 0x1B 0x28 0x42 ESC ( B invoke Printing-ASCII | 5885 0x1B 0x28 0x42 ESC ( B invoke Printing-ASCII |
5877 @end example | 5886 @end example |
5878 | 5887 |
5879 Initially, Printing-ASCII is invoked. | 5888 Initially, Printing-ASCII is invoked. |
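
The modal behavior can be sketched as a small state machine in C. This is an illustrative scanner only: the names `jis7_charset`, `jis7_parse_escape` and `jis7_count_chars` are hypothetical, only the four escape sequences listed above are recognized, and any other byte is simply treated as character data.

```c
#include <stddef.h>

/* Hypothetical charset tags for this sketch. */
enum jis7_charset { JIS7_ASCII, JIS7_ROMAN, JIS7_KANA, JIS7_JISX0208 };

/* If the 3 bytes at S form one of the escape sequences in the table,
   store the invoked charset in *CS and return 1; otherwise return 0. */
static int
jis7_parse_escape (const unsigned char *s, enum jis7_charset *cs)
{
  if (s[0] != 0x1B)
    return 0;
  if (s[1] == 0x28 && s[2] == 0x4A) { *cs = JIS7_ROMAN; return 1; }
  if (s[1] == 0x28 && s[2] == 0x49) { *cs = JIS7_KANA; return 1; }
  if (s[1] == 0x24 && s[2] == 0x42) { *cs = JIS7_JISX0208; return 1; }
  if (s[1] == 0x28 && s[2] == 0x42) { *cs = JIS7_ASCII; return 1; }
  return 0;
}

/* Scan LEN bytes of JIS7 text and return the number of characters.
   *FINAL receives the charset invoked at the end of the text.  */
static size_t
jis7_count_chars (const unsigned char *s, size_t len,
                  enum jis7_charset *final)
{
  enum jis7_charset cs = JIS7_ASCII;   /* initially Printing-ASCII */
  size_t i = 0, count = 0;

  while (i < len)
    {
      if (len - i >= 3 && jis7_parse_escape (s + i, &cs))
        {
          i += 3;                      /* state change, no character */
          continue;
        }
      if (cs == JIS7_JISX0208)
        {
          if (i + 1 >= len)
            break;                     /* truncated 2-byte character */
          i += 2;                      /* PC1 PC2 */
        }
      else
        i += 1;                        /* PC1 */
      count++;
    }
  *final = cs;
  return count;
}
```

The same byte value means different things depending on the current state, so a scanner cannot start in the middle of JIS7 text without first finding the most recent escape sequence.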
5880 | 5889 |
5881 @node Internal Mule Encodings | 5890 @node Internal Mule Encodings |
5882 @section Internal Mule Encodings | 5891 @section Internal Mule Encodings |
5883 | 5892 |
5884 In XEmacs/Mule, each character set is assigned a unique number, | 5893 In XEmacs/Mule, each character set is assigned a unique number, called a |
5885 called a @dfn{leading byte}. This is used in the encodings of a | 5894 @dfn{leading byte}. This is used in the encodings of a character. |
5886 character. Leading bytes are in the range 0x80 - 0xFF | 5895 Leading bytes are in the range 0x80 - 0xFF (except for ASCII, which has |
5887 (except for ASCII, which has a leading byte of 0), although | 5896 a leading byte of 0), although some leading bytes are reserved. |
5888 some leading bytes are reserved. | 5897 |
5889 | 5898 Charsets whose leading byte is in the range 0x80 - 0x9F are called |
5890 Charsets whose leading byte is in the range 0x80 - 0x9F are | 5899 @dfn{official} and are used for built-in charsets. Other charsets are |
5891 called @dfn{official} and are used for built-in charsets. | 5900 called @dfn{private} and have leading bytes in the range 0xA0 - 0xFF; |
5892 Other charsets are called @dfn{private} and have leading bytes | 5901 these are user-defined charsets. |
5893 in the range 0xA0 - 0xFF; these are user-defined charsets. | |
5894 | 5902 |
5895 More specifically: | 5903 More specifically: |
5896 | 5904 |
5897 @example | 5905 @example |
5898 Character set Leading byte | 5906 Character set Leading byte |
5907 0x9E and 0x9F are reserved) | 5915 0x9E and 0x9F are reserved) |
5908 Dimension-1 Private 0xA0 - 0xEF | 5916 Dimension-1 Private 0xA0 - 0xEF |
5909 Dimension-2 Private 0xF0 - 0xFF | 5917 Dimension-2 Private 0xF0 - 0xFF |
5910 @end example | 5918 @end example |
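
The ranges stated in the text can be checked with a small C classifier. This is a sketch under stated assumptions: the enum and the function name `classify_leading_byte` are hypothetical, and the official range is not split by dimension because those sub-ranges are not all shown here.

```c
/* Hypothetical classification of a leading byte value. */
enum lb_class
{
  LB_ASCII,        /* leading byte 0 */
  LB_OFFICIAL,     /* 0x80 - 0x9D, built-in charsets */
  LB_RESERVED,     /* 0x9E and 0x9F, never assigned to a charset */
  LB_PRIVATE_1,    /* 0xA0 - 0xEF, dimension-1 private */
  LB_PRIVATE_2,    /* 0xF0 - 0xFF, dimension-2 private */
  LB_INVALID       /* 0x01 - 0x7F, not a leading byte */
};

static enum lb_class
classify_leading_byte (unsigned char lb)
{
  if (lb == 0x00)
    return LB_ASCII;
  if (lb < 0x80)
    return LB_INVALID;
  if (lb == 0x9E || lb == 0x9F)
    return LB_RESERVED;
  if (lb <= 0x9F)
    return LB_OFFICIAL;
  if (lb <= 0xEF)
    return LB_PRIVATE_1;
  return LB_PRIVATE_2;
}
```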
5911 | 5919 |
5912 There are two internal encodings for characters in XEmacs/Mule. One | 5920 There are two internal encodings for characters in XEmacs/Mule. One is |
5913 is called @dfn{string encoding} and is an 8-bit encoding that is used | 5921 called @dfn{string encoding} and is an 8-bit encoding that is used for |
5914 for representing characters in a buffer or string. It uses 1 to 4 bytes | 5922 representing characters in a buffer or string. It uses 1 to 4 bytes per |
5915 per character. The other is called @dfn{character encoding} and is a | 5923 character. The other is called @dfn{character encoding} and is a 19-bit |
5916 19-bit encoding that is used for representing characters individually in | 5924 encoding that is used for representing characters individually in a |
5917 a variable. | 5925 variable. |
5918 | 5926 |
5919 (In the following descriptions, we'll ignore composite | 5927 (In the following descriptions, we'll ignore composite characters for |
5920 characters for the moment. We also give a general (structural) | 5928 the moment. We also give a general (structural) overview first, |
5921 overview first, followed later by the exact details.) | 5929 followed later by the exact details.) |
5922 | 5930 |
5923 @menu | 5931 @menu |
5924 * Internal String Encoding:: | 5932 * Internal String Encoding:: |
5925 * Internal Character Encoding:: | 5933 * Internal Character Encoding:: |
5926 @end menu | 5934 @end menu |
5927 | 5935 |
5928 @node Internal String Encoding | 5936 @node Internal String Encoding |
5929 @subsection Internal String Encoding | 5937 @subsection Internal String Encoding |
5930 | 5938 |
5931 ASCII characters are encoded using their position code directly. | 5939 ASCII characters are encoded using their position code directly. Other |
5932 Other characters are encoded using their leading byte followed | 5940 characters are encoded using their leading byte followed by their |
5933 by their position code(s) with the high bit set. Characters | 5941 position code(s) with the high bit set. Characters in private character |
5934 in private character sets have their leading byte prefixed with | 5942 sets have their leading byte prefixed with a @dfn{leading byte prefix}, |
5935 a @dfn{leading byte prefix}, which is either 0x9E or 0x9F. (No | 5943 which is either 0x9E or 0x9F. (No character sets are ever assigned these |
5936 character sets are ever assigned these leading bytes.) Specifically: | 5944 leading bytes.) Specifically: |
5937 | 5945 |
5938 @example | 5946 @example |
5939 Character set Encoding (PC=position-code, LB=leading-byte) | 5947 Character set Encoding (PC=position-code, LB=leading-byte) |
5940 ------------- -------- | 5948 ------------- -------- |
5941 ASCII PC-1 | | 5949 ASCII PC-1 | |