+ − 1 @c -*-texinfo-*-
+ − 2 @c This is part of the XEmacs Lisp Reference Manual.
+ − 3 @c Copyright (C) 1990, 1991, 1992, 1993, 1994 Free Software Foundation, Inc.
+ − 4 @c See the file lispref.texi for copying conditions.
+ − 5 @setfilename ../../info/syntax.info
+ − 6 @node Syntax Tables, Abbrevs, Searching and Matching, Top
+ − 7 @chapter Syntax Tables
+ − 8 @cindex parsing
+ − 9 @cindex syntax table
+ − 10 @cindex text parsing
+ − 11
+ − 12 A @dfn{syntax table} specifies the syntactic textual function of each
+ − 13 character. This information is used by the parsing commands, the
+ − 14 complex movement commands, and others to determine where words, symbols,
+ − 15 and other syntactic constructs begin and end. The current syntax table
+ − 16 controls the meaning of the word motion functions (@pxref{Word Motion})
+ − 17 and the list motion functions (@pxref{List Motion}) as well as the
+ − 18 functions in this chapter.
+ − 19
+ − 20 @menu
+ − 21 * Basics: Syntax Basics. Basic concepts of syntax tables.
+ − 22 * Desc: Syntax Descriptors. How characters are classified.
+ − 23 * Syntax Table Functions:: How to create, examine and alter syntax tables.
+ − 24 * Motion and Syntax:: Moving over characters with certain syntaxes.
+ − 25 * Parsing Expressions:: Parsing balanced expressions
+ − 26 using the syntax table.
+ − 27 * Standard Syntax Tables:: Syntax tables used by various major modes.
+ − 28 * Syntax Table Internals:: How syntax table information is stored.
+ − 29 @end menu
+ − 30
+ − 31 @node Syntax Basics
+ − 32 @section Syntax Table Concepts
+ − 33
+ − 34 @ifinfo
+ − 35 A @dfn{syntax table} provides Emacs with the information that
+ − 36 determines the syntactic use of each character in a buffer. This
+ − 37 information is used by the parsing commands, the complex movement
+ − 38 commands, and others to determine where words, symbols, and other
+ − 39 syntactic constructs begin and end. The current syntax table controls
+ − 40 the meaning of the word motion functions (@pxref{Word Motion}) and the
+ − 41 list motion functions (@pxref{List Motion}) as well as the functions in
+ − 42 this chapter.
+ − 43 @end ifinfo
+ − 44
+ − 45 Under XEmacs 20 and later, a syntax table is a particular subtype of the
+ − 46 primitive char table type (@pxref{Char Tables}), and each element of the
+ − 47 char table is an integer that encodes the syntax of the character in
+ − 48 question, or a cons of such an integer and a matching character (for
+ − 49 characters with parenthesis syntax).
+ − 50
+ − 51 Under XEmacs 19, a syntax table is a vector of 256 elements; it
+ − 52 contains one entry for each of the 256 possible characters in an 8-bit
+ − 53 byte. Each element is an integer that encodes the syntax of the
+ − 54 character in question. (The matching character, if any, is embedded
+ − 55 in the bits of this integer.)
+ − 56
+ − 57 Syntax tables are used only for moving across text, not for the Emacs
+ − 58 Lisp reader. XEmacs Lisp uses built-in syntactic rules when reading Lisp
+ − 59 expressions, and these rules cannot be changed.
+ − 60
+ − 61 Each buffer has its own major mode, and each major mode has its own
+ − 62 idea of the syntactic class of various characters. For example, in Lisp
+ − 63 mode, the character @samp{;} begins a comment, but in C mode, it
+ − 64 terminates a statement. To support these variations, XEmacs makes the
+ − 65 choice of syntax table local to each buffer. Typically, each major
+ − 66 mode has its own syntax table and installs that table in each buffer
+ − 67 that uses that mode. Changing this table alters the syntax in all
+ − 68 those buffers as well as in any buffers subsequently put in that mode.
+ − 69 Occasionally several similar modes share one syntax table.
+ − 70 @xref{Example Major Modes}, for an example of how to set up a syntax
+ − 71 table.
+ − 72
+ − 73 A syntax table can inherit the data for some characters from the
+ − 74 standard syntax table, while specifying other characters itself. The
+ − 75 ``inherit'' syntax class means ``inherit this character's syntax from
+ − 76 the standard syntax table.'' Most major modes' syntax tables inherit
+ − 77 the syntax of character codes 0 through 31 and 128 through 255. This is
+ − 78 useful with character sets such as ISO Latin-1 that have additional
+ − 79 alphabetic characters in the range 128 to 255. Just changing the
+ − 80 standard syntax for these characters affects all major modes.
+ − 81
+ − 82 @defun syntax-table-p object
+ − 83 This function returns @code{t} if @var{object} is a vector of 256
+ − 84 elements, the XEmacs 19 representation of a syntax table. However,
+ − 85 according to this test, any vector of length 256 is considered to be a
+ − 86 syntax table, no matter what its contents.
+ − 87 @end defun
+ − 88
+ − 89 @node Syntax Descriptors
+ − 90 @section Syntax Descriptors
+ − 91 @cindex syntax classes
+ − 92
+ − 93 This section describes the syntax classes and flags that denote the
+ − 94 syntax of a character, and how they are represented as a @dfn{syntax
+ − 95 descriptor}, which is a Lisp string that you pass to
+ − 96 @code{modify-syntax-entry} to specify the desired syntax.
+ − 97
+ − 98 XEmacs defines a number of @dfn{syntax classes}. Each syntax table
+ − 99 puts each character into one class. There is no necessary relationship
+ − 100 between the class of a character in one syntax table and its class in
+ − 101 any other table.
+ − 102
+ − 103 Each class is designated by a mnemonic character, which serves as the
+ − 104 name of the class when you need to specify a class. Usually the
+ − 105 designator character is one that is frequently in that class; however,
+ − 106 its meaning as a designator is unvarying and independent of what syntax
+ − 107 that character currently has.
+ − 108
+ − 109 @cindex syntax descriptor
+ − 110 A syntax descriptor is a Lisp string that specifies a syntax class, a
+ − 111 matching character (used only for the parenthesis classes) and flags.
+ − 112 The first character is the designator for a syntax class. The second
+ − 113 character is the character to match; if it is unused, put a space there.
+ − 114 Then come the characters for any desired flags. If no matching
+ − 115 character or flags are needed, one character is sufficient.
+ − 116
+ − 117 For example, the descriptor for the character @samp{*} in C mode is
+ − 118 @samp{@w{. 23}} (i.e., punctuation, matching character slot unused,
+ − 119 second character of a comment-starter, first character of a
+ − 120 comment-ender), and the entry for @samp{/} is @samp{@w{. 14}} (i.e.,
+ − 121 punctuation, matching character slot unused, first character of a
+ − 122 comment-starter, second character of a comment-ender).
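
As a rough sketch of how these pieces combine, the following forms
install a few descriptors in a scratch table instead of the current
buffer's table. The name @code{my-scratch-table} is purely illustrative
and is not defined by XEmacs.

@example
@group
(defvar my-scratch-table (make-syntax-table)
  "Scratch syntax table used only for illustration.")

;; Class only: one character is sufficient.
(modify-syntax-entry ?_ "w" my-scratch-table)
@end group

@group
;; Class plus matching character.
(modify-syntax-entry ?< "(>" my-scratch-table)
(modify-syntax-entry ?> ")<" my-scratch-table)
@end group

@group
;; Class, unused matching-character slot (a space), then flags.
(modify-syntax-entry ?* ". 23" my-scratch-table)
@end group
@end example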
+ − 123
+ − 124 @menu
+ − 125 * Syntax Class Table:: Table of syntax classes.
+ − 126 * Syntax Flags:: Additional flags each character can have.
+ − 127 @end menu
+ − 128
+ − 129 @node Syntax Class Table
+ − 130 @subsection Table of Syntax Classes
+ − 131
+ − 132 Here is a table of syntax classes, the characters that stand for them,
+ − 133 their meanings, and examples of their use.
+ − 134
+ − 135 @deffn {Syntax class} @w{whitespace character}
+ − 136 @dfn{Whitespace characters} (designated with @samp{-})
+ − 137 separate symbols and words from each other. Typically, whitespace
+ − 138 characters have no other syntactic significance, and multiple whitespace
+ − 139 characters are syntactically equivalent to a single one. Space, tab,
+ − 140 newline and formfeed are almost always classified as whitespace. (The
+ − 141 designator @w{@samp{@ }} is accepted for backwards compatibility with
+ − 142 older versions of XEmacs, but is deprecated. It is invalid in GNU Emacs.)
+ − 143 @end deffn
+ − 144
+ − 145 @deffn {Syntax class} @w{word constituent}
+ − 146 @dfn{Word constituents} (designated with @samp{w}) are parts of normal
+ − 147 English words and are typically used in variable and command names in
+ − 148 programs. All upper- and lower-case letters, and the digits, are typically
+ − 149 word constituents.
+ − 150 @end deffn
+ − 151
+ − 152 @deffn {Syntax class} @w{symbol constituent}
+ − 153 @dfn{Symbol constituents} (designated with @samp{_}) are the extra
+ − 154 characters that are used in variable and command names along with word
+ − 155 constituents. For example, the symbol constituents class is used in
+ − 156 Lisp mode to indicate that certain characters may be part of symbol
+ − 157 names even though they are not part of English words. These characters
+ − 158 are @samp{$&*+-_<>}. In standard C, the only non-word-constituent
+ − 159 character that is valid in symbols is underscore (@samp{_}).
+ − 160 @end deffn
+ − 161
+ − 162 @deffn {Syntax class} @w{punctuation character}
+ − 163 @dfn{Punctuation characters} (designated with @samp{.}) are those characters that are
+ − 164 used as punctuation in English, or are used in some way in a programming
+ − 165 language to separate symbols from one another. Most programming
+ − 166 language modes, including Emacs Lisp mode, have no characters in this
+ − 167 class since the few characters that are not symbol or word constituents
+ − 168 all have other uses.
+ − 169 @end deffn
+ − 170
+ − 171 @deffn {Syntax class} @w{open parenthesis character}
+ − 172 @deffnx {Syntax class} @w{close parenthesis character}
+ − 173 @cindex parenthesis syntax
+ − 174 Open and close @dfn{parenthesis characters} are characters used in
+ − 175 dissimilar pairs to surround sentences or expressions. Such a grouping
+ − 176 is begun with an open parenthesis character and terminated with a close.
+ − 177 Each open parenthesis character matches a particular close parenthesis
+ − 178 character, and vice versa. Normally, XEmacs indicates momentarily the
+ − 179 matching open parenthesis when you insert a close parenthesis.
+ − 180 @xref{Blinking}.
+ − 181
+ − 182 The class of open parentheses is designated with @samp{(}, and that of
+ − 183 close parentheses with @samp{)}.
+ − 184
+ − 185 In English text, and in C code, the parenthesis pairs are @samp{()},
+ − 186 @samp{[]}, and @samp{@{@}}. In XEmacs Lisp, the delimiters for lists and
+ − 187 vectors (@samp{()} and @samp{[]}) are classified as parenthesis
+ − 188 characters.
+ − 189 @end deffn
+ − 190
+ − 191 @deffn {Syntax class} @w{string quote}
+ − 192 @dfn{String quote characters} (designated with @samp{"}) are used in
+ − 193 many languages, including Lisp and C, to delimit string constants. The
+ − 194 same string quote character appears at the beginning and the end of a
+ − 195 string. Such quoted strings do not nest.
+ − 196
+ − 197 The parsing facilities of XEmacs consider a string as a single token.
+ − 198 The usual syntactic meanings of the characters in the string are
+ − 199 suppressed.
+ − 200
+ − 201 The Lisp modes have two string quote characters: double-quote (@samp{"})
+ − 202 and vertical bar (@samp{|}). @samp{|} is not used in XEmacs Lisp, but it
+ − 203 is used in Common Lisp. C also has two string quote characters:
+ − 204 double-quote for strings, and single-quote (@samp{'}) for character
+ − 205 constants.
+ − 206
+ − 207 English text has no string quote characters because English is not a
+ − 208 programming language. Although quotation marks are used in English,
+ − 209 we do not want them to turn off the usual syntactic properties of
+ − 210 other characters in the quotation.
+ − 211 @end deffn
+ − 212
+ − 213 @deffn {Syntax class} @w{escape}
+ − 214 An @dfn{escape character} (designated with @samp{\}) starts an escape
+ − 215 sequence such as is used in C string and character constants. The
+ − 216 character @samp{\} belongs to this class in both C and Lisp. (In C, it
+ − 217 is used thus only inside strings, but it turns out to cause no trouble
+ − 218 to treat it this way throughout C code.)
+ − 219
+ − 220 Characters in this class count as part of words if
+ − 221 @code{words-include-escapes} is non-@code{nil}. @xref{Word Motion}.
+ − 222 @end deffn
+ − 223
+ − 224 @deffn {Syntax class} @w{character quote}
+ − 225 A @dfn{character quote character} (designated with @samp{/}) quotes the
+ − 226 following character so that it loses its normal syntactic meaning. This
+ − 227 differs from an escape character in that only the character immediately
+ − 228 following is ever affected.
+ − 229
+ − 230 Characters in this class count as part of words if
+ − 231 @code{words-include-escapes} is non-@code{nil}. @xref{Word Motion}.
+ − 232
+ − 233 This class is used for backslash in @TeX{} mode.
+ − 234 @end deffn
+ − 235
+ − 236 @deffn {Syntax class} @w{paired delimiter}
+ − 237 @dfn{Paired delimiter characters} (designated with @samp{$}) are like
+ − 238 string quote characters except that the syntactic properties of the
+ − 239 characters between the delimiters are not suppressed. Only @TeX{} mode
+ − 240 uses a paired delimiter presently---the @samp{$} that both enters and
+ − 241 leaves math mode.
+ − 242 @end deffn
+ − 243
+ − 244 @deffn {Syntax class} @w{expression prefix}
+ − 245 An @dfn{expression prefix operator} (designated with @samp{'}) is used
+ − 246 for syntactic operators that are part of an expression if they appear
+ − 247 next to one. These characters in Lisp include the apostrophe, @samp{'}
+ − 248 (used for quoting), the comma, @samp{,} (used in macros), and @samp{#}
+ − 249 (used in the read syntax for certain data types).
+ − 250 @end deffn
+ − 251
+ − 252 @deffn {Syntax class} @w{comment starter}
+ − 253 @deffnx {Syntax class} @w{comment ender}
+ − 254 @cindex comment syntax
+ − 255 The @dfn{comment starter} and @dfn{comment ender} characters are used in
+ − 256 various languages to delimit comments. These classes are designated
+ − 257 with @samp{<} and @samp{>}, respectively.
+ − 258
+ − 259 English text has no comment characters. In Lisp, the semicolon
+ − 260 (@samp{;}) starts a comment and a newline or formfeed ends one.
+ − 261 @end deffn
+ − 262
+ − 263 @deffn {Syntax class} @w{inherit}
+ − 264 This syntax class does not specify a syntax. It says to look in the
+ − 265 standard syntax table to find the syntax of this character. The
+ − 266 designator for this syntax code is @samp{@@}.
+ − 267 @end deffn
+ − 268
+ − 269 @node Syntax Flags
+ − 270 @subsection Syntax Flags
+ − 271 @cindex syntax flags
+ − 272
+ − 273 @c This is a bit inaccurate, the ``a'' and ``b'' flags actually don't
+ − 274 @c exist in the internal implementation. AFAICT it doesn't affect the
+ − 275 @c semantics as perceived by the LISP programmer.
+ − 276 In addition to the classes, entries for characters in a syntax table
+ − 277 can include flags. There are eleven possible flags, represented by the
+ − 278 digits @samp{1}--@samp{8}, and the lowercase letters @samp{a}, @samp{b},
+ − 279 and @samp{p}.
+ − 280
+ − 281 All the flags except @samp{p} are used to describe comment delimiters.
+ − 282 The digit flags indicate that a character can @emph{also} be part of a
+ − 283 multi-character comment sequence, in addition to the syntactic
+ − 284 properties associated with its character class. The flags must be
+ − 285 independent of the class and each other for the sake of characters such
+ − 286 as @samp{*} in C mode, which is a punctuation character, @emph{and} the
+ − 287 second character of a start-of-comment sequence (@samp{/*}), @emph{and}
+ − 288 the first character of an end-of-comment sequence (@samp{*/}).
+ − 289
+ − 290 Emacs supports two comment styles simultaneously in any one syntax
+ − 291 table. This is for the sake of C++. Each style of comment syntax has
+ − 292 its own comment-start sequence and its own comment-end sequence. Each
+ − 293 comment must stick to one style or the other; thus, if it starts with
+ − 294 the comment-start sequence of style ``b'', it must also end with the
+ − 295 comment-end sequence of style ``b''.
+ − 296
+ − 297 @c #### Compatibility note; index here.
+ − 298 As an extension to GNU Emacs 19 and 20, XEmacs supports two arbitrary
+ − 299 comment-start sequences and two arbitrary comment-end sequences. (Thus
+ − 300 the need for 8 flags.) GNU Emacs restricts the comment-start sequences
+ − 301 to start with the same character; XEmacs does not. This means that for
+ − 302 two-character sequences, where GNU Emacs uses the @samp{b} flag, XEmacs
+ − 303 uses the digit flags @samp{5}--@samp{8}.
+ − 304
+ − 305 A one-character comment-end sequence applies to the ``b'' style if its
+ − 306 first character has the @samp{b} flag set; otherwise, it applies to the
+ − 307 ``a'' style. The @samp{a} flag is optional. These flags have no effect
+ − 308 on non-comment characters; two-character styles are determined by the
+ − 309 digit flags.
+ − 310
+ − 311 The flags for a character @var{c} are:
+ − 312
+ − 313 @itemize @bullet
+ − 314 @item
+ − 315 @samp{1} means @var{c} is the start of a two-character comment-start
+ − 316 sequence of style ``a''.
+ − 317
+ − 318 @item
+ − 319 @samp{2} means @var{c} is the second character of such a sequence.
+ − 320
+ − 321 @item
+ − 322 @samp{3} means @var{c} is the start of a two-character comment-end
+ − 323 sequence of style ``a''.
+ − 324
+ − 325 @item
+ − 326 @samp{4} means @var{c} is the second character of such a sequence.
+ − 327
+ − 328 @item
+ − 329 @samp{5} means @var{c} is the start of a two-character comment-start
+ − 330 sequence of style ``b''.
+ − 331
+ − 332 @item
+ − 333 @samp{6} means @var{c} is the second character of such a sequence.
+ − 334
+ − 335 @item
+ − 336 @samp{7} means @var{c} is the start of a two-character comment-end
+ − 337 sequence of style ``b''.
+ − 338
+ − 339 @item
+ − 340 @samp{8} means @var{c} is the second character of such a sequence.
+ − 341
+ − 342 @item
+ − 343 @samp{a} means that @var{c} as a comment delimiter belongs to the
+ − 344 default ``a'' comment style. (This flag is optional.)
+ − 345
+ − 346 @item
+ − 347 @c Emacs 19 feature
+ − 348 @samp{b} means that @var{c} as a comment delimiter belongs to the
+ − 349 alternate ``b'' comment style.
+ − 350
+ − 351 @item
+ − 352 @c Emacs 19 feature
+ − 353 @samp{p} identifies an additional ``prefix character'' for Lisp syntax.
+ − 354 These characters are treated as whitespace when they appear between
+ − 355 expressions. When they appear within an expression, they are handled
+ − 356 according to their usual syntax codes.
+ − 357
+ − 358 The function @code{backward-prefix-chars} moves back over these
+ − 359 characters, as well as over characters whose primary syntax class is
+ − 360 prefix (@samp{'}). @xref{Motion and Syntax}.
+ − 361 @end itemize
+ − 362
+ − 363 Lisp (as you would expect) has a simple comment syntax.
+ − 364
+ − 365 @table @asis
+ − 366 @item @samp{;}
+ − 367 @samp{<}
+ − 368 @item newline
+ − 369 @samp{>}
+ − 370 @end table
+ − 371
+ − 372 Note that no flags are used.
+ − 373 This defines two comment-delimiting sequences:
+ − 374
+ − 375 @table @asis
+ − 376 @item @samp{;}
+ − 377 This is a single-character comment-start sequence because the syntax
+ − 378 class is @samp{<}.
+ − 379
+ − 380 @item newline
+ − 381 This is a single-character comment-end sequence because the syntax class
+ − 382 is @samp{>} and the @samp{b} flag is not set.
+ − 383 @end table
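
In code, a table with this comment syntax might be set up roughly as
follows. This is only a sketch; @code{my-lisp-like-table} is a
hypothetical name chosen for the example.

@example
@group
(defvar my-lisp-like-table (make-syntax-table)
  "Illustrative syntax table with Lisp-style comments.")

;; Semicolon starts a comment; no flags are needed.
(modify-syntax-entry ?\; "<" my-lisp-like-table)
;; Newline ends it.  With no `b' flag this is an "a" style ender.
(modify-syntax-entry ?\n ">" my-lisp-like-table)
@end group
@end example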
+ − 384
+ − 385 C++ (again, as you would expect) has a baroque, overrich, and
+ − 386 excessively complex comment syntax.
+ − 387
+ − 388 @table @asis
+ − 389 @item @samp{/}
+ − 390 @samp{1456}
+ − 391 @item @samp{*}
+ − 392 @samp{23}
+ − 393 @item newline
+ − 394 @samp{>b}
+ − 395 @end table
+ − 396
+ − 397 Note that the ``b'' style mixes one-character and two-character
+ − 398 sequences. The table above defines four comment-delimiting sequences:
+ − 399
+ − 400 @table @asis
+ − 401 @item @samp{/*}
+ − 402 This is a comment-start sequence for ``a'' style because the @samp{1}
+ − 403 flag is set on @samp{/} and the @samp{2} flag is set on @samp{*}.
+ − 404
+ − 405 @item @samp{//}
+ − 406 This is a comment-start sequence for ``b'' style because both the @samp{5}
+ − 407 and the @samp{6} flags are set on @samp{/}.
+ − 408
+ − 409 @item @samp{*/}
+ − 410 This is a comment-end sequence for ``a'' style because the @samp{3}
+ − 411 flag is set on @samp{*} and the @samp{4} flag is set on @samp{/}.
+ − 412
+ − 413 @item newline
+ − 414 This is a comment-end sequence for ``b'' style, because the newline
+ − 415 character has the @samp{b} flag.
+ − 416 @end table
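
The C++ table above could be written as calls to
@code{modify-syntax-entry} along the following lines. This is a sketch
that assumes punctuation as the class of @samp{/} and @samp{*}, as in
the C mode descriptors shown earlier; @code{my-c++-like-table} is a
hypothetical name. Note that in a full descriptor the flags follow the
matching-character slot, so the newline entry is written @samp{@w{> b}}.

@example
@group
(defvar my-c++-like-table (make-syntax-table)
  "Illustrative syntax table with C++-style comments.")

;; `/' starts `/*' (1), ends `*/' (4), and is both
;; characters of `//' (5 and 6).
(modify-syntax-entry ?/ ". 1456" my-c++-like-table)
;; `*' is the second character of `/*' (2) and the
;; first character of `*/' (3).
(modify-syntax-entry ?* ". 23" my-c++-like-table)
;; Newline is a one-character comment-end of style "b".
(modify-syntax-entry ?\n "> b" my-c++-like-table)
@end group
@end example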
+ − 417
+ − 418
+ − 419 @node Syntax Table Functions
+ − 420 @section Syntax Table Functions
+ − 421
+ − 422 In this section we describe functions for creating, accessing and
+ − 423 altering syntax tables.
+ − 424
+ − 425 @defun make-syntax-table &optional oldtable
+ − 426 This function creates a new syntax table. Character codes 0 through
+ − 427 31 and 128 through 255 are set up to inherit from the standard syntax
+ − 428 table. The other character codes are set up by copying what the
+ − 429 standard syntax table says about them.
+ − 430
+ − 431 Most major mode syntax tables are created in this way.
+ − 432 @end defun
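
A major mode typically creates its table once and installs it in each
buffer that enters the mode. The following sketch shows one
conventional arrangement; the mode @code{my-mode} and the variable
@code{my-mode-syntax-table} are hypothetical, and the particular
entries are arbitrary.

@example
@group
(defvar my-mode-syntax-table nil
  "Syntax table used while in the hypothetical `my-mode'.")

(if my-mode-syntax-table
    ()
  (setq my-mode-syntax-table (make-syntax-table))
  ;; Underscore may appear in symbol names.
  (modify-syntax-entry ?_ "_" my-mode-syntax-table)
  ;; Double-quote delimits strings.
  (modify-syntax-entry ?\" "\"" my-mode-syntax-table))
@end group

@group
(defun my-mode ()
  "Hypothetical major mode that installs `my-mode-syntax-table'."
  (interactive)
  (kill-all-local-variables)
  (set-syntax-table my-mode-syntax-table)
  (setq major-mode 'my-mode)
  (setq mode-name "My"))
@end group
@end example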
+ − 433
+ − 434 @defun copy-syntax-table &optional syntax-table
+ − 435 This function constructs a copy of @var{syntax-table} and returns it.
+ − 436 If @var{syntax-table} is not supplied (or is @code{nil}), it returns a
+ − 437 copy of the current syntax table. Otherwise, an error is signaled if
+ − 438 @var{syntax-table} is not a syntax table.
+ − 439 @end defun
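
Copying is useful when a table should differ from an existing one in
only a few entries. As a small sketch, the hypothetical function below
returns a variant of the standard syntax table without altering the
standard table itself.

@example
@group
(defun my-table-with-word-underscore ()
  "Return a copy of the standard table where `_' is a word constituent.
The standard syntax table itself is not modified."
  (let ((table (copy-syntax-table (standard-syntax-table))))
    (modify-syntax-entry ?_ "w" table)
    table))
@end group
@end example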
+ − 440
+ − 441 @deffn Command modify-syntax-entry char-range syntax-descriptor &optional syntax-table
+ − 442 This function sets the syntax entry for @var{char-range} according to
+ − 443 @var{syntax-descriptor}. @var{char-range} is either a single character
+ − 444 or a range of characters, as used with @code{put-char-table}. The syntax
+ − 445 is changed only for @var{syntax-table}, which defaults to the current
+ − 446 buffer's syntax table, and not in any other syntax table. The argument
+ − 447 @var{syntax-descriptor} specifies the desired syntax; this is a string
+ − 448 beginning with a class designator character, and optionally containing a
+ − 449 matching character and flags as well. @xref{Syntax Descriptors}.
+ − 450
+ − 451 This function always returns @code{nil}. The old syntax information in
+ − 452 the table for @var{char-range} is discarded.
+ − 453
+ − 454 An error is signaled if the first character of the syntax descriptor is not
+ − 455 one of the syntax class designator characters.
+ − 456
+ − 457 @example
+ − 458 @group
+ − 459 @exdent @r{Examples:}
+ − 460
+ − 461 ;; @r{Put the space character in class whitespace.}
+ − 462 (modify-syntax-entry ?\  " ")
+ − 463 @result{} nil
+ − 464 @end group
+ − 465
+ − 466 @group
+ − 467 ;; @r{Make @samp{$} an open parenthesis character,}
+ − 468 ;; @r{with @samp{^} as its matching close.}
+ − 469 (modify-syntax-entry ?$ "(^")
+ − 470 @result{} nil
+ − 471 @end group
+ − 472
+ − 473 @group
+ − 474 ;; @r{Make @samp{^} a close parenthesis character,}
+ − 475 ;; @r{with @samp{$} as its matching open.}
+ − 476 (modify-syntax-entry ?^ ")$")
+ − 477 @result{} nil
+ − 478 @end group
+ − 479
+ − 480 @group
+ − 481 ;; @r{Make @samp{/} a punctuation character,}
+ − 482 ;; @r{the first character of a start-comment sequence,}
+ − 483 ;; @r{and the second character of an end-comment sequence.}
+ − 484 ;; @r{This is used in C mode.}
+ − 485 (modify-syntax-entry ?/ ". 14")
+ − 486 @result{} nil
+ − 487 @end group
+ − 488 @end example
+ − 489 @end deffn
+ − 490
+ − 491 @defun char-syntax character &optional syntax-table
+ − 492 This function returns the syntax class of @var{character}, represented
+ − 493 by its mnemonic designator character. This @emph{only} returns the
+ − 494 class, not any matching parenthesis or flags.
+ − 495
+ − 496 An error is signaled if @var{character} is not a character.
+ − 497
+ − 498 The characters that correspond to various syntax codes
+ − 499 are listed in the documentation of @code{modify-syntax-entry}.
+ − 500
+ − 501 Optional second argument @var{syntax-table} is the syntax table to be
+ − 502 used, and defaults to the current buffer's syntax table.
+ − 503
+ − 504 The following examples apply to C mode. The first example shows that
+ − 505 the syntax class of space is whitespace (represented by a space). The
+ − 506 second example shows that the syntax of @samp{/} is punctuation. This
+ − 507 does not show the fact that it is also part of comment-start and -end
+ − 508 sequences. The third example shows that open parenthesis is in the class
+ − 509 of open parentheses. This does not show the fact that it has a matching
+ − 510 character, @samp{)}.
+ − 511
+ − 512 @example
+ − 513 @group
+ − 514 (char-to-string (char-syntax ?\ ))
+ − 515 @result{} " "
+ − 516 @end group
+ − 517
+ − 518 @group
+ − 519 (char-to-string (char-syntax ?/))
+ − 520 @result{} "."
+ − 521 @end group
+ − 522
+ − 523 @group
+ − 524 (char-to-string (char-syntax ?\())
+ − 525 @result{} "("
+ − 526 @end group
+ − 527 @end example
+ − 528 @end defun
+ − 529
+ − 530 @defun set-syntax-table syntax-table &optional buffer
+ − 531 This function makes @var{syntax-table} the syntax table for @var{buffer}, which
+ − 532 defaults to the current buffer if omitted. It returns @var{syntax-table}.
+ − 533 @end defun
+ − 534
+ − 535 @defun syntax-table &optional buffer
+ − 536 This function returns the syntax table for @var{buffer}, which defaults
+ − 537 to the current buffer if omitted.
+ − 538 @end defun
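
These two functions are often used together to switch tables
temporarily. The following is a minimal sketch of that idiom; the
function name @code{my-call-with-syntax-table} is hypothetical.

@example
@group
(defun my-call-with-syntax-table (table function)
  "Call FUNCTION with TABLE as the current buffer's syntax table.
The previous table is restored afterwards, even after an error."
  (let ((old (syntax-table)))
    (unwind-protect
        (progn
          (set-syntax-table table)
          (funcall function))
      (set-syntax-table old))))
@end group
@end example

The @code{unwind-protect} form ensures that a non-local exit from
@var{function} does not leave the buffer with the wrong table.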
+ − 539
+ − 540 @node Motion and Syntax
+ − 541 @section Motion and Syntax
+ − 542
+ − 543 This section describes functions for moving across characters in
+ − 544 certain syntax classes. None of these functions exists in Emacs
+ − 545 version 18 or earlier.
+ − 546
+ − 547 @defun skip-syntax-forward syntaxes &optional limit buffer
+ − 548 This function moves point forward across characters having syntax classes
+ − 549 mentioned in @var{syntaxes}. It stops when it encounters the end of
+ − 550 the buffer, or position @var{limit} (if specified), or a character it is
+ − 551 not supposed to skip. Optional argument @var{buffer} defaults to the
+ − 552 current buffer if omitted.
+ − 553 @ignore @c may want to change this.
+ − 554 The return value is the distance traveled, which is a nonnegative
+ − 555 integer.
+ − 556 @end ignore
+ − 557 @end defun
+ − 558
+ − 559 @defun skip-syntax-backward syntaxes &optional limit buffer
+ − 560 This function moves point backward across characters whose syntax
+ − 561 classes are mentioned in @var{syntaxes}. It stops when it encounters
+ − 562 the beginning of the buffer, or position @var{limit} (if specified), or a
+ − 563 character it is not supposed to skip. Optional argument @var{buffer}
+ − 564 defaults to the current buffer if omitted.
+ − 565
+ − 566 @ignore @c may want to change this.
+ − 567 The return value indicates the distance traveled. It is an integer that
+ − 568 is zero or less.
+ − 569 @end ignore
+ − 570 @end defun
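
As an illustration, the two skip functions can be combined to find the
extent of the run of word and symbol constituents around point. This is
only a sketch, and @code{my-symbol-bounds-at-point} is a hypothetical
name.

@example
@group
(defun my-symbol-bounds-at-point ()
  "Return (START . END) for the word-or-symbol run around point."
  (save-excursion
    ;; Move back over word (w) and symbol (_) constituents.
    (skip-syntax-backward "w_")
    (let ((start (point)))
      ;; Then move forward over the same classes.
      (skip-syntax-forward "w_")
      (cons start (point)))))
@end group
@end example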
+ − 571
+ − 572 @defun backward-prefix-chars &optional buffer
+ − 573 This function moves point backward over any number of characters with
+ − 574 expression prefix syntax. This includes both characters in the
+ − 575 expression prefix syntax class, and characters with the @samp{p} flag.
+ − 576 Optional argument @var{buffer} defaults to the current buffer if
+ − 577 omitted.
+ − 578 @end defun
+ − 579
+ − 580 @node Parsing Expressions
+ − 581 @section Parsing Balanced Expressions
+ − 582
+ − 583 Here are several functions for parsing and scanning balanced
+ − 584 expressions, also known as @dfn{sexps}, in which parentheses match in
+ − 585 pairs. The syntax table controls the interpretation of characters, so
+ − 586 these functions can be used for Lisp expressions when in Lisp mode and
+ − 587 for C expressions when in C mode. @xref{List Motion}, for convenient
+ − 588 higher-level functions for moving over balanced expressions.
+ − 589
+ − 590 @defun parse-partial-sexp start limit &optional target-depth stop-before state stop-comment buffer
+ − 591 This function parses a sexp in the current buffer starting at
+ − 592 @var{start}, not scanning past @var{limit}. It stops at position
+ − 593 @var{limit} or when certain criteria described below are met, and sets
+ − 594 point to the location where parsing stops. It returns a value
+ − 595 describing the status of the parse at the point where it stops.
+ − 596
+ − 597 If @var{state} is @code{nil}, @var{start} is assumed to be at the top
+ − 598 level of parenthesis structure, such as the beginning of a function
+ − 599 definition. Alternatively, you might wish to resume parsing in the
+ − 600 middle of the structure. To do this, you must provide a @var{state}
+ − 601 argument that describes the initial status of parsing.
+ − 602
+ − 603 @cindex parenthesis depth
+ − 604 If the third argument @var{target-depth} is non-@code{nil}, parsing
+ − 605 stops if the depth in parentheses becomes equal to @var{target-depth}.
+ − 606 The depth starts at 0, or at whatever is given in @var{state}.
+ − 607
+ − 608 If the fourth argument @var{stop-before} is non-@code{nil}, parsing
+ − 609 stops when it comes to any character that starts a sexp. If
+ − 610 @var{stop-comment} is non-@code{nil}, parsing stops when it comes to the
+ − 611 start of a comment.
+ − 612
+ − 613 @cindex parse state
+ − 614 The fifth argument @var{state} is an eight-element list of the same
+ − 615 form as the value of this function, described below. The return value
+ − 616 of one call may be used to initialize the state of the parse on another
+ − 617 call to @code{parse-partial-sexp}.
+ − 618
+ − 619 The result is a list of eight elements describing the final state of
+ − 620 the parse:
+ − 621
+ − 622 @enumerate 0
+ − 623 @item
+ − 624 The depth in parentheses, counting from 0.
+ − 625
+ − 626 @item
+ − 627 @cindex innermost containing parentheses
+ − 628 The character position of the start of the innermost parenthetical
+ − 629 grouping containing the stopping point; @code{nil} if none.
+ − 630
+ − 631 @item
+ − 632 @cindex previous complete subexpression
+ − 633 The character position of the start of the last complete subexpression
+ − 634 terminated; @code{nil} if none.
+ − 635
+ − 636 @item
+ − 637 @cindex inside string
+ − 638 Non-@code{nil} if inside a string. More precisely, this is the
+ − 639 character that will terminate the string.
+ − 640
+ − 641 @item
+ − 642 @cindex inside comment
+ − 643 @code{t} if inside a comment (of either style).
+ − 644
+ − 645 @item
+ − 646 @cindex quote character
+ − 647 @code{t} if point is just after a quote character.
+ − 648
+ − 649 @item
+ − 650 The minimum parenthesis depth encountered during this scan.
+ − 651
+ − 652 @item
+ − 653 @code{t} if inside a comment of style ``b''.
+ − 654 @end enumerate
+ − 655
+ − 656 Elements 0, 3, 4, 5 and 7 are significant in the argument @var{state}.
+ − 657
+ − 658 @cindex indenting with parentheses
+ − 659 This function is most often used to compute indentation for languages
+ − 660 that have nested parentheses.
+ − 661 @end defun
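
As a sketch of how the returned state might be examined, the
hypothetical helpers below parse from the beginning of the accessible
portion of the buffer, which they assume to be at top level, and then
inspect elements 0, 3 and 4 of the result.

@example
@group
(defun my-in-string-or-comment-p (pos)
  "Return non-nil if POS is inside a string or a comment."
  (save-excursion
    (let ((state (parse-partial-sexp (point-min) pos)))
      ;; Element 3: inside a string.  Element 4: inside a comment.
      (or (nth 3 state) (nth 4 state)))))
@end group

@group
(defun my-paren-depth-at (pos)
  "Return the depth in parentheses at POS."
  (save-excursion
    ;; Element 0 of the state is the depth.
    (car (parse-partial-sexp (point-min) pos))))
@end group
@end example

Both helpers use @code{save-excursion} because
@code{parse-partial-sexp} moves point to the place where parsing stops.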
+ − 662
+ − 663 @defun scan-lists from count depth &optional buffer noerror
+ − 664 This function scans forward @var{count} balanced parenthetical groupings
+ − 665 from character number @var{from}. It returns the character position
+ − 666 where the scan stops.
+ − 667
+ − 668 If @var{depth} is nonzero, parenthesis depth counting begins from that
+ − 669 value. The only candidates for stopping are places where the depth in
+ − 670 parentheses becomes zero; @code{scan-lists} counts @var{count} such
+ − 671 places and then stops. Thus, a positive value for @var{depth} means go
+ − 672 out @var{depth} levels of parenthesis.
+ − 673
+ − 674 Scanning ignores comments if @code{parse-sexp-ignore-comments} is
+ − 675 non-@code{nil}.
+ − 676
+ − 677 If the scan reaches the beginning or end of the buffer (or its
+ − 678 accessible portion), and the depth is not zero, an error is signaled.
+ − 679 If the depth is zero but the count is not used up, @code{nil} is
+ − 680 returned.
+ − 681
+ − 682 If optional arg @var{buffer} is non-@code{nil}, scanning occurs in that
+ − 683 buffer instead of in the current buffer.
+ − 684
+ − 685 If optional arg @var{noerror} is non-@code{nil}, @code{scan-lists}
+ − 686 returns @code{nil} instead of signaling an error.
+ − 687 @end defun
+ − 688
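The effect of a positive @var{depth} can be illustrated with a small
helper that finds the end of the list containing point.  This is only
a sketch; @code{my-end-of-enclosing-list} is a hypothetical name.

@example
(defun my-end-of-enclosing-list ()
  "Return the position just after the list containing point, or nil."
  ;; A DEPTH of 1 makes the scan climb out one level of parentheses.
  ;; A non-nil NOERROR gives nil at top level instead of an error.
  (scan-lists (point) 1 1 nil t))
@end example
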
+ − 689 @defun scan-sexps from count &optional buffer noerror
+ − 690 This function scans forward @var{count} sexps from character position
+ − 691 @var{from}. It returns the character position where the scan stops.
+ − 692
+ − 693 Scanning ignores comments if @code{parse-sexp-ignore-comments} is
+ − 694 non-@code{nil}.
+ − 695
+ − 696 If the scan reaches the beginning or end of (the accessible part of) the
+ − 697 buffer in the middle of a parenthetical grouping, an error is signaled.
+ − 698 If it reaches the beginning or end between groupings but before @var{count} is
+ − 699 used up, @code{nil} is returned.
+ − 700
+ − 701 If optional arg @var{buffer} is non-@code{nil}, scanning occurs in
+ − 702 that buffer instead of in the current buffer.
+ − 703
+ − 704 If optional arg @var{noerror} is non-@code{nil}, @code{scan-sexps}
+ − 705 returns @code{nil} instead of signaling an error.
+ − 706 @end defun
+ − 707
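For instance, the following sketch returns the text of the expression
after point without moving point.  The function name
@code{my-next-sexp-string} is hypothetical.

@example
(defun my-next-sexp-string ()
  "Return the text from point to the end of the next sexp, or nil."
  ;; With a non-nil NOERROR, an unbalanced or missing sexp
  ;; produces nil rather than an error.
  (let ((end (scan-sexps (point) 1 nil t)))
    (and end (buffer-substring (point) end))))
@end example

The returned string includes any whitespace between point and the
start of the expression, since the scan begins at point.
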
+ − 708 @defvar parse-sexp-ignore-comments
+ − 709 @cindex skipping comments
+ − 710 If the value is non-@code{nil}, then comments are treated as
+ − 711 whitespace by the functions in this section and by @code{forward-sexp}.
+ − 712
+ − 713 In older Emacs versions, this feature worked only when the comment
+ − 714 terminator was something like @samp{*/}, which appears only at the end of a
+ − 715 comment. In languages where newlines terminate comments, it was
+ − 716 necessary to make this variable @code{nil}, since not every newline is the
+ − 717 end of a comment. This limitation no longer exists.
+ − 718 @end defvar
+ − 719
+ − 720 You can use @code{forward-comment} to move forward or backward over
+ − 721 one comment or several comments.
+ − 722
+ − 723 @defun forward-comment &optional count buffer
+ − 724 This function moves point forward across @var{count} comments (backward,
+ − 725 if @var{count} is negative). If it finds anything other than a comment
+ − 726 or whitespace, it stops, leaving point at the place where it stopped.
+ − 727 It also stops after satisfying @var{count}. @var{count} defaults to @code{1}.
+ − 728
+ − 729 Optional argument @var{buffer} defaults to the current buffer.
+ − 730 @end defun
+ − 731
+ − 732 To move forward over all comments and whitespace following point, use
+ − 733 @code{(forward-comment (buffer-size))}. @code{(buffer-size)} is a good
+ − 734 argument to use, because the number of comments in the buffer cannot
+ − 735 exceed that many.
+ − 736
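Wrapped in a function, the idiom above might look like this (the name
@code{my-skip-comments-forward} is hypothetical):

@example
(defun my-skip-comments-forward ()
  "Move point past all comments and whitespace following it.
Return the new value of point."
  ;; (buffer-size) is an upper bound on the number of comments.
  (forward-comment (buffer-size))
  (point))
@end example
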
+ − 737 @node Standard Syntax Tables
+ − 738 @section Some Standard Syntax Tables
+ − 739
+ − 740 Most of the major modes in XEmacs have their own syntax tables. Here
+ − 741 are several of them:
+ − 742
+ − 743 @defun standard-syntax-table
+ − 744 This function returns the standard syntax table, which is the syntax
+ − 745 table used in Fundamental mode.
+ − 746 @end defun
+ − 747
+ − 748 @defvar text-mode-syntax-table
+ − 749 The value of this variable is the syntax table used in Text mode.
+ − 750 @end defvar
+ − 751
+ − 752 @defvar c-mode-syntax-table
+ − 753 The value of this variable is the syntax table for C-mode buffers.
+ − 754 @end defvar
+ − 755
+ − 756 @defvar emacs-lisp-mode-syntax-table
+ − 757 The value of this variable is the syntax table used in Emacs Lisp mode
+ − 758 by editing commands. (It has no effect on the Lisp @code{read}
+ − 759 function.)
+ − 760 @end defvar
+ − 761
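One plausible use of these variables is to parse text under a
particular mode's syntax rules without changing the major mode.  The
sketch below installs the Emacs Lisp mode table temporarily and
restores the previous one afterward; @code{my-call-with-lisp-syntax}
is a hypothetical helper.

@example
(defun my-call-with-lisp-syntax (function)
  "Call FUNCTION with the Emacs Lisp mode syntax table in effect."
  (let ((old (syntax-table)))
    (unwind-protect
        (progn
          (set-syntax-table emacs-lisp-mode-syntax-table)
          (funcall function))
      ;; Restore the original table even if FUNCTION signals an error.
      (set-syntax-table old))))
@end example
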
+ − 762 @node Syntax Table Internals
+ − 763 @section Syntax Table Internals
+ − 764 @cindex syntax table internals
+ − 765
+ − 766 Each element of a syntax table is an integer that encodes the syntax
+ − 767 of one character: the syntax class, possible matching character, and
+ − 768 flags. Lisp programs don't usually work with the elements directly; the
+ − 769 Lisp-level syntax table functions usually work with syntax descriptors
+ − 770 (@pxref{Syntax Descriptors}).
+ − 771
+ − 772 The low 8 bits of each element of a syntax table indicate the
+ − 773 syntax class.
+ − 774
+ − 775 @table @asis
+ − 776 @item @i{Integer}
+ − 777 @i{Class}
+ − 778 @item 0
+ − 779 whitespace
+ − 780 @item 1
+ − 781 punctuation
+ − 782 @item 2
+ − 783 word
+ − 784 @item 3
+ − 785 symbol
+ − 786 @item 4
+ − 787 open parenthesis
+ − 788 @item 5
+ − 789 close parenthesis
+ − 790 @item 6
+ − 791 expression prefix
+ − 792 @item 7
+ − 793 string quote
+ − 794 @item 8
+ − 795 paired delimiter
+ − 796 @item 9
+ − 797 escape
+ − 798 @item 10
+ − 799 character quote
+ − 800 @item 11
+ − 801 comment-start
+ − 802 @item 12
+ − 803 comment-end
+ − 804 @item 13
+ − 805 inherit
+ − 806 @end table
+ − 807
+ − 808 The next 8 bits are the matching opposite parenthesis (if the
+ − 809 character has parenthesis syntax); otherwise, they are not meaningful.
+ − 810 The next 6 bits are the flags.
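
The bit layout just described can be illustrated with a little
arithmetic.  The following is only a sketch: both names are
hypothetical, and the function takes a plain integer rather than
fetching an element from a syntax table.

@example
(defconst my-syntax-class-names
  ["whitespace" "punctuation" "word" "symbol"
   "open parenthesis" "close parenthesis" "expression prefix"
   "string quote" "paired delimiter" "escape" "character quote"
   "comment-start" "comment-end" "inherit"]
  "Syntax class names, indexed by class code.")

(defun my-decode-syntax-element (element)
  "Split the integer ELEMENT into a list (CLASS-NAME MATCH FLAGS)."
  (let ((class (logand element 255))             ; low 8 bits
        (match (logand (ash element -8) 255))    ; next 8 bits
        (flags (logand (ash element -16) 63)))   ; next 6 bits
    (list (and (< class (length my-syntax-class-names))
               (aref my-syntax-class-names class))
          match
          flags)))

;; Class 4 (open parenthesis) with matching character code 41
;; and no flags is 4 + (41 * 256) = 10500.
(my-decode-syntax-element 10500)
     @result{} ("open parenthesis" 41 0)
@end example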