1 @c -*-texinfo-*-
|
|
2 @c This is part of the XEmacs Lisp Reference Manual.
|
|
3 @c Copyright (C) 1990, 1991, 1992, 1993, 1994 Free Software Foundation, Inc.
|
|
4 @c See the file lispref.texi for copying conditions.
|
|
5 @setfilename ../../info/syntax.info
|
|
6 @node Syntax Tables, Abbrevs, Searching and Matching, Top
|
|
7 @chapter Syntax Tables
|
|
8 @cindex parsing
|
|
9 @cindex syntax table
|
|
10 @cindex text parsing
|
|
11
|
|
12 A @dfn{syntax table} specifies the syntactic textual function of each
|
|
13 character. This information is used by the parsing commands, the
|
|
14 complex movement commands, and others to determine where words, symbols,
|
|
15 and other syntactic constructs begin and end. The current syntax table
|
|
16 controls the meaning of the word motion functions (@pxref{Word Motion})
|
|
17 and the list motion functions (@pxref{List Motion}) as well as the
|
|
18 functions in this chapter.
|
|
19
|
|
20 @menu
|
|
21 * Basics: Syntax Basics. Basic concepts of syntax tables.
|
|
22 * Desc: Syntax Descriptors. How characters are classified.
|
|
23 * Syntax Table Functions:: How to create, examine and alter syntax tables.
|
|
24 * Motion and Syntax:: Moving over characters with certain syntaxes.
|
|
25 * Parsing Expressions:: Parsing balanced expressions
|
|
26 using the syntax table.
|
|
27 * Standard Syntax Tables:: Syntax tables used by various major modes.
|
|
28 * Syntax Table Internals:: How syntax table information is stored.
|
|
29 @end menu
|
|
30
|
|
31 @node Syntax Basics
|
|
32 @section Syntax Table Concepts
|
|
33
|
|
34 @ifinfo
|
|
35 A @dfn{syntax table} provides Emacs with the information that
|
|
36 determines the syntactic use of each character in a buffer. This
|
|
37 information is used by the parsing commands, the complex movement
|
|
38 commands, and others to determine where words, symbols, and other
|
|
39 syntactic constructs begin and end. The current syntax table controls
|
|
40 the meaning of the word motion functions (@pxref{Word Motion}) and the
|
|
41 list motion functions (@pxref{List Motion}) as well as the functions in
|
|
42 this chapter.
|
|
43 @end ifinfo
|
|
44
|
|
45 Under XEmacs 20 and later, a syntax table is a particular subtype of the
|
|
46 primitive char table type (@pxref{Char Tables}), and each element of the
|
|
47 char table is an integer that encodes the syntax of the character in
|
|
48 question, or a cons of such an integer and a matching character (for
|
|
49 characters with parenthesis syntax).
|
|
50
|
|
51 Under XEmacs 19, a syntax table is a vector of 256 elements; it
|
|
52 contains one entry for each of the 256 possible characters in an 8-bit
|
|
53 byte. Each element is an integer that encodes the syntax of the
|
|
54 character in question. (The matching character, if any, is embedded
|
|
55 in the bits of this integer.)
|
|
56
|
|
57 Syntax tables are used only for moving across text, not for the Emacs
|
|
58 Lisp reader. XEmacs Lisp uses built-in syntactic rules when reading Lisp
|
|
59 expressions, and these rules cannot be changed.
|
|
60
|
|
61 Each buffer has its own major mode, and each major mode has its own
|
|
62 idea of the syntactic class of various characters. For example, in Lisp
|
|
63 mode, the character @samp{;} begins a comment, but in C mode, it
|
|
64 terminates a statement. To support these variations, XEmacs makes the
|
|
65 choice of syntax table local to each buffer. Typically, each major
|
|
66 mode has its own syntax table and installs that table in each buffer
|
|
67 that uses that mode. Changing this table alters the syntax in all
|
|
68 those buffers as well as in any buffers subsequently put in that mode.
|
|
69 Occasionally several similar modes share one syntax table.
|
|
70 @xref{Example Major Modes}, for an example of how to set up a syntax
|
|
71 table.
|
|
72
|
|
73 A syntax table can inherit the data for some characters from the
|
|
74 standard syntax table, while specifying other characters itself. The
|
|
75 ``inherit'' syntax class means ``inherit this character's syntax from
|
|
76 the standard syntax table.'' Most major modes' syntax tables inherit
|
|
77 the syntax of character codes 0 through 31 and 128 through 255. This is
|
|
78 useful with character sets such as ISO Latin-1 that have additional
|
|
79 alphabetic characters in the range 128 to 255. Just changing the
|
|
80 standard syntax for these characters affects all major modes.
|
|
81
|
|
82 @defun syntax-table-p object
|
|
This function returns @code{t} if @var{object} is a syntax table.
Under XEmacs 20 and later, this means that @var{object} is a char table
of the syntax subtype. Under XEmacs 19, it means that @var{object} is a
vector of 256 elements; according to this test, any such vector is
considered to be a syntax table, no matter what its contents.
|
|
87 @end defun
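
As a brief illustration (a sketch, not an exhaustive test), the
following calls apply the predicate to the current buffer's syntax
table and to an object that is not a syntax table in any XEmacs
version:

@example
@group
(syntax-table-p (syntax-table))
     @result{} t
@end group

@group
(syntax-table-p "not a table")
     @result{} nil
@end group
@end example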
|
|
88
|
|
89 @node Syntax Descriptors
|
|
90 @section Syntax Descriptors
|
|
91 @cindex syntax classes
|
|
92
|
|
93 This section describes the syntax classes and flags that denote the
|
|
94 syntax of a character, and how they are represented as a @dfn{syntax
|
|
95 descriptor}, which is a Lisp string that you pass to
|
|
96 @code{modify-syntax-entry} to specify the desired syntax.
|
|
97
|
|
98 XEmacs defines a number of @dfn{syntax classes}. Each syntax table
|
|
99 puts each character into one class. There is no necessary relationship
|
|
100 between the class of a character in one syntax table and its class in
|
|
101 any other table.
|
|
102
|
|
103 Each class is designated by a mnemonic character, which serves as the
|
|
104 name of the class when you need to specify a class. Usually the
|
|
105 designator character is one that is frequently in that class; however,
|
|
106 its meaning as a designator is unvarying and independent of what syntax
|
|
107 that character currently has.
|
|
108
|
|
109 @cindex syntax descriptor
|
|
110 A syntax descriptor is a Lisp string that specifies a syntax class, a
|
|
111 matching character (used only for the parenthesis classes) and flags.
|
|
112 The first character is the designator for a syntax class. The second
|
|
113 character is the character to match; if it is unused, put a space there.
|
|
114 Then come the characters for any desired flags. If no matching
|
|
115 character or flags are needed, one character is sufficient.
|
|
116
|
|
For example, the descriptor for the character @samp{*} in C mode is
@samp{@w{. 23}} (i.e., punctuation, matching character slot unused,
second character of a comment-starter, first character of a
comment-ender), and the entry for @samp{/} is @samp{@w{. 14}} (i.e.,
punctuation, matching character slot unused, first character of a
comment-starter, second character of a comment-ender).
|
|
123
|
|
124 @menu
|
|
125 * Syntax Class Table:: Table of syntax classes.
|
|
126 * Syntax Flags:: Additional flags each character can have.
|
|
127 @end menu
|
|
128
|
|
129 @node Syntax Class Table
|
|
130 @subsection Table of Syntax Classes
|
|
131
|
|
132 Here is a table of syntax classes, the characters that stand for them,
|
|
133 their meanings, and examples of their use.
|
|
134
|
|
135 @deffn {Syntax class} @w{whitespace character}
|
|
136 @dfn{Whitespace characters} (designated with @samp{-})
|
|
137 separate symbols and words from each other. Typically, whitespace
|
|
138 characters have no other syntactic significance, and multiple whitespace
|
|
139 characters are syntactically equivalent to a single one. Space, tab,
|
|
140 newline and formfeed are almost always classified as whitespace. (The
|
|
141 designator @w{@samp{@ }} is accepted for backwards compatibility with
|
|
142 older versions of XEmacs, but is deprecated. It is invalid in GNU Emacs.)
|
|
143 @end deffn
|
|
144
|
|
145 @deffn {Syntax class} @w{word constituent}
|
|
146 @dfn{Word constituents} (designated with @samp{w}) are parts of normal
|
|
147 English words and are typically used in variable and command names in
|
|
148 programs. All upper- and lower-case letters, and the digits, are typically
|
|
149 word constituents.
|
|
150 @end deffn
|
|
151
|
|
152 @deffn {Syntax class} @w{symbol constituent}
|
|
153 @dfn{Symbol constituents} (designated with @samp{_}) are the extra
|
|
154 characters that are used in variable and command names along with word
|
|
155 constituents. For example, the symbol constituents class is used in
|
|
156 Lisp mode to indicate that certain characters may be part of symbol
|
|
157 names even though they are not part of English words. These characters
|
|
158 are @samp{$&*+-_<>}. In standard C, the only non-word-constituent
|
|
159 character that is valid in symbols is underscore (@samp{_}).
|
|
160 @end deffn
|
|
161
|
|
162 @deffn {Syntax class} @w{punctuation character}
|
|
@dfn{Punctuation characters} (designated with @samp{.}) are those
characters that are used as punctuation in English, or are used in some
way in a programming language to separate symbols from one another.
Most programming language modes, including Emacs Lisp mode, have no
characters in this class since the few characters that are not symbol
or word constituents all have other uses.
|
|
169 @end deffn
|
|
170
|
|
171 @deffn {Syntax class} @w{open parenthesis character}
|
|
172 @deffnx {Syntax class} @w{close parenthesis character}
|
|
173 @cindex parenthesis syntax
|
|
174 Open and close @dfn{parenthesis characters} are characters used in
|
|
175 dissimilar pairs to surround sentences or expressions. Such a grouping
|
|
176 is begun with an open parenthesis character and terminated with a close.
|
|
177 Each open parenthesis character matches a particular close parenthesis
|
|
178 character, and vice versa. Normally, XEmacs indicates momentarily the
|
|
179 matching open parenthesis when you insert a close parenthesis.
|
|
180 @xref{Blinking}.
|
|
181
|
|
182 The class of open parentheses is designated with @samp{(}, and that of
|
|
183 close parentheses with @samp{)}.
|
|
184
|
|
185 In English text, and in C code, the parenthesis pairs are @samp{()},
|
|
186 @samp{[]}, and @samp{@{@}}. In XEmacs Lisp, the delimiters for lists and
|
|
187 vectors (@samp{()} and @samp{[]}) are classified as parenthesis
|
|
188 characters.
|
|
189 @end deffn
|
|
190
|
|
191 @deffn {Syntax class} @w{string quote}
|
|
192 @dfn{String quote characters} (designated with @samp{"}) are used in
|
|
193 many languages, including Lisp and C, to delimit string constants. The
|
|
194 same string quote character appears at the beginning and the end of a
|
|
195 string. Such quoted strings do not nest.
|
|
196
|
|
197 The parsing facilities of XEmacs consider a string as a single token.
|
|
198 The usual syntactic meanings of the characters in the string are
|
|
199 suppressed.
|
|
200
|
|
201 The Lisp modes have two string quote characters: double-quote (@samp{"})
|
|
202 and vertical bar (@samp{|}). @samp{|} is not used in XEmacs Lisp, but it
|
|
203 is used in Common Lisp. C also has two string quote characters:
|
|
204 double-quote for strings, and single-quote (@samp{'}) for character
|
|
205 constants.
|
|
206
|
|
207 English text has no string quote characters because English is not a
|
|
208 programming language. Although quotation marks are used in English,
|
|
209 we do not want them to turn off the usual syntactic properties of
|
|
210 other characters in the quotation.
|
|
211 @end deffn
|
|
212
|
|
213 @deffn {Syntax class} @w{escape}
|
|
214 An @dfn{escape character} (designated with @samp{\}) starts an escape
|
|
215 sequence such as is used in C string and character constants. The
|
|
216 character @samp{\} belongs to this class in both C and Lisp. (In C, it
|
|
217 is used thus only inside strings, but it turns out to cause no trouble
|
|
218 to treat it this way throughout C code.)
|
|
219
|
|
220 Characters in this class count as part of words if
|
|
221 @code{words-include-escapes} is non-@code{nil}. @xref{Word Motion}.
|
|
222 @end deffn
|
|
223
|
|
224 @deffn {Syntax class} @w{character quote}
|
|
225 A @dfn{character quote character} (designated with @samp{/}) quotes the
|
|
226 following character so that it loses its normal syntactic meaning. This
|
|
227 differs from an escape character in that only the character immediately
|
|
228 following is ever affected.
|
|
229
|
|
230 Characters in this class count as part of words if
|
|
231 @code{words-include-escapes} is non-@code{nil}. @xref{Word Motion}.
|
|
232
|
|
233 This class is used for backslash in @TeX{} mode.
|
|
234 @end deffn
|
|
235
|
|
236 @deffn {Syntax class} @w{paired delimiter}
|
|
237 @dfn{Paired delimiter characters} (designated with @samp{$}) are like
|
|
238 string quote characters except that the syntactic properties of the
|
|
239 characters between the delimiters are not suppressed. Only @TeX{} mode
|
|
240 uses a paired delimiter presently---the @samp{$} that both enters and
|
|
241 leaves math mode.
|
|
242 @end deffn
|
|
243
|
|
244 @deffn {Syntax class} @w{expression prefix}
|
|
245 An @dfn{expression prefix operator} (designated with @samp{'}) is used
|
|
246 for syntactic operators that are part of an expression if they appear
|
|
247 next to one. These characters in Lisp include the apostrophe, @samp{'}
|
|
248 (used for quoting), the comma, @samp{,} (used in macros), and @samp{#}
|
|
249 (used in the read syntax for certain data types).
|
|
250 @end deffn
|
|
251
|
|
252 @deffn {Syntax class} @w{comment starter}
|
|
253 @deffnx {Syntax class} @w{comment ender}
|
|
254 @cindex comment syntax
|
|
255 The @dfn{comment starter} and @dfn{comment ender} characters are used in
|
|
256 various languages to delimit comments. These classes are designated
|
|
257 with @samp{<} and @samp{>}, respectively.
|
|
258
|
|
259 English text has no comment characters. In Lisp, the semicolon
|
|
260 (@samp{;}) starts a comment and a newline or formfeed ends one.
|
|
261 @end deffn
|
|
262
|
|
263 @deffn {Syntax class} @w{inherit}
|
|
264 This syntax class does not specify a syntax. It says to look in the
|
|
265 standard syntax table to find the syntax of this character. The
|
|
266 designator for this syntax code is @samp{@@}.
|
|
267 @end deffn
|
|
268
|
|
269 @node Syntax Flags
|
|
270 @subsection Syntax Flags
|
|
271 @cindex syntax flags
|
|
272
|
|
273 @c This is a bit inaccurate, the ``a'' and ``b'' flags actually don't
|
|
274 @c exist in the internal implementation. AFAICT it doesn't affect the
|
|
275 @c semantics as perceived by the LISP programmer.
|
|
276 In addition to the classes, entries for characters in a syntax table
|
|
277 can include flags. There are eleven possible flags, represented by the
|
|
278 digits @samp{1}--@samp{8}, and the lowercase letters @samp{a}, @samp{b},
|
|
279 and @samp{p}.
|
|
280
|
|
281 All the flags except @samp{p} are used to describe comment delimiters.
|
|
282 The digit flags indicate that a character can @emph{also} be part of a
|
|
283 multi-character comment sequence, in addition to the syntactic
|
|
284 properties associated with its character class. The flags must be
|
|
285 independent of the class and each other for the sake of characters such
|
|
286 as @samp{*} in C mode, which is a punctuation character, @emph{and} the
|
|
287 second character of a start-of-comment sequence (@samp{/*}), @emph{and}
|
|
288 the first character of an end-of-comment sequence (@samp{*/}).
|
|
289
|
|
290 Emacs supports two comment styles simultaneously in any one syntax
|
|
291 table. This is for the sake of C++. Each style of comment syntax has
|
|
292 its own comment-start sequence and its own comment-end sequence. Each
|
|
293 comment must stick to one style or the other; thus, if it starts with
|
|
294 the comment-start sequence of style ``b'', it must also end with the
|
|
295 comment-end sequence of style ``b''.
|
|
296
|
|
297 @c #### Compatibility note; index here.
|
|
As an extension to GNU Emacs 19 and 20, XEmacs supports two arbitrary
comment-start sequences and two arbitrary comment-end sequences. (Thus
the need for 8 flags.) GNU Emacs restricts the comment-start sequences
to start with the same character; XEmacs does not. This means that for
two-character sequences, where GNU Emacs uses the @samp{b} flag, XEmacs
uses the digit flags @samp{5}--@samp{8}.
|
|
304
|
|
A one-character comment-end sequence applies to the ``b'' style if
that character has the @samp{b} flag set; otherwise, it applies to the
``a'' style. The @samp{a} flag is optional. These flags have no effect
on non-comment characters; two-character styles are determined by the
digit flags.
|
|
310
|
|
311 The flags for a character @var{c} are:
|
|
312
|
|
313 @itemize @bullet
|
|
314 @item
|
|
315 @samp{1} means @var{c} is the start of a two-character comment-start
|
|
316 sequence of style ``a''.
|
|
317
|
|
318 @item
|
|
319 @samp{2} means @var{c} is the second character of such a sequence.
|
|
320
|
|
321 @item
|
|
322 @samp{3} means @var{c} is the start of a two-character comment-end
|
|
323 sequence of style ``a''.
|
|
324
|
|
325 @item
|
|
326 @samp{4} means @var{c} is the second character of such a sequence.
|
|
327
|
|
328 @item
|
|
329 @samp{5} means @var{c} is the start of a two-character comment-start
|
|
330 sequence of style ``b''.
|
|
331
|
|
332 @item
|
|
333 @samp{6} means @var{c} is the second character of such a sequence.
|
|
334
|
|
335 @item
|
|
336 @samp{7} means @var{c} is the start of a two-character comment-end
|
|
337 sequence of style ``b''.
|
|
338
|
|
339 @item
|
|
340 @samp{8} means @var{c} is the second character of such a sequence.
|
|
341
|
|
342 @item
|
|
343 @samp{a} means that @var{c} as a comment delimiter belongs to the
|
|
344 default ``a'' comment style. (This flag is optional.)
|
|
345
|
|
346 @item
|
|
347 @c Emacs 19 feature
|
|
348 @samp{b} means that @var{c} as a comment delimiter belongs to the
|
|
349 alternate ``b'' comment style.
|
|
350
|
|
351 @item
|
|
352 @c Emacs 19 feature
|
|
353 @samp{p} identifies an additional ``prefix character'' for Lisp syntax.
|
|
354 These characters are treated as whitespace when they appear between
|
|
355 expressions. When they appear within an expression, they are handled
|
|
356 according to their usual syntax codes.
|
|
357
|
|
358 The function @code{backward-prefix-chars} moves back over these
|
|
359 characters, as well as over characters whose primary syntax class is
|
|
360 prefix (@samp{'}). @xref{Motion and Syntax}.
|
|
361 @end itemize
|
|
362
|
|
363 Lisp (as you would expect) has a simple comment syntax.
|
|
364
|
|
365 @table @asis
|
|
366 @item @samp{;}
|
|
367 @samp{<}
|
|
368 @item newline
|
|
369 @samp{>}
|
|
370 @end table
|
|
371
|
|
372 Note that no flags are used.
|
|
373 This defines two comment-delimiting sequences:
|
|
374
|
|
375 @table @asis
|
|
376 @item @samp{;}
|
|
377 This is a single-character comment-start sequence because the syntax
|
|
378 class is @samp{<}.
|
|
379
|
|
380 @item newline
|
|
This is a single-character comment-end sequence because the syntax class
is @samp{>} and the @samp{b} flag is not set.
|
|
383 @end table
|
|
384
|
|
385 C++ (again, as you would expect) has a baroque, overrich, and
|
|
386 excessively complex comment syntax.
|
|
387
|
|
388 @table @asis
|
|
389 @item @samp{/}
|
|
390 @samp{1456}
|
|
391 @item @samp{*}
|
|
392 @samp{23}
|
|
393 @item newline
|
|
394 @samp{>b}
|
|
395 @end table
|
|
396
|
|
397 Note that the ``b'' style mixes one-character and two-character
|
|
398 sequences. The table above defines four comment-delimiting sequences:
|
|
399
|
|
400 @table @asis
|
|
401 @item @samp{/*}
|
|
402 This is a comment-start sequence for ``a'' style because the @samp{1}
|
|
403 flag is set on @samp{/} and the @samp{2} flag is set on @samp{*}.
|
|
404
|
|
405 @item @samp{//}
|
|
406 This is a comment-start sequence for ``b'' style because both the @samp{5}
|
|
407 and the @samp{6} flags are set on @samp{/}.
|
|
408
|
|
409 @item @samp{*/}
|
|
410 This is a comment-end sequence for ``a'' style because the @samp{3}
|
|
411 flag is set on @samp{*} and the @samp{4} flag is set on @samp{/}.
|
|
412
|
|
413 @item newline
|
|
414 This is a comment-end sequence for ``b'' style, because the newline
|
|
415 character has the @samp{b} flag.
|
|
416 @end table
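
The C++ entries in the table above might be installed as follows.
This is a minimal sketch: the name @code{my-c++-syntax-table} is
hypothetical, and a real C++ mode defines many other entries as well.

@example
@group
(defvar my-c++-syntax-table
  (let ((table (make-syntax-table)))
    ;; @r{@samp{/} takes part in @samp{/*}, @samp{*/}, and @samp{//}.}
    (modify-syntax-entry ?/ ". 1456" table)
    ;; @r{@samp{*} takes part in @samp{/*} and @samp{*/}.}
    (modify-syntax-entry ?* ". 23" table)
    ;; @r{Newline ends a ``b'' style comment.}
    (modify-syntax-entry ?\n "> b" table)
    table)
  "Hypothetical syntax table illustrating the C++ comment flags.")
@end group
@end example

The matching character slot is unused in all three descriptors, so
each contains a space in the second position.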
|
|
417
|
|
418
|
|
419 @node Syntax Table Functions
|
|
420 @section Syntax Table Functions
|
|
421
|
|
422 In this section we describe functions for creating, accessing and
|
|
423 altering syntax tables.
|
|
424
|
|
425 @defun make-syntax-table &optional oldtable
|
|
426 This function creates a new syntax table. Character codes 0 through
|
|
427 31 and 128 through 255 are set up to inherit from the standard syntax
|
|
428 table. The other character codes are set up by copying what the
|
|
429 standard syntax table says about them.
|
|
430
|
|
431 Most major mode syntax tables are created in this way.
|
|
432 @end defun
|
|
433
|
|
434 @defun copy-syntax-table &optional syntax-table
|
|
435 This function constructs a copy of @var{syntax-table} and returns it.
|
|
436 If @var{syntax-table} is not supplied (or is @code{nil}), it returns a
|
|
437 copy of the current syntax table. Otherwise, an error is signaled if
|
|
438 @var{syntax-table} is not a syntax table.
|
|
439 @end defun
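
A copy can be modified without affecting the table it was made from.
The following sketch (the variable name @code{copy} is illustrative)
makes underscore a word constituent in a copy of the current syntax
table only; the current buffer's own table is left unchanged.

@example
@group
(let ((copy (copy-syntax-table)))
  ;; @r{Change the entry in the copy, not in the current table.}
  (modify-syntax-entry ?_ "w" copy)
  (char-to-string (char-syntax ?_ copy)))
     @result{} "w"
@end group
@end example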
|
|
440
|
|
441 @deffn Command modify-syntax-entry char-range syntax-descriptor &optional syntax-table
|
|
442 This function sets the syntax entry for @var{char-range} according to
|
|
443 @var{syntax-descriptor}. @var{char-range} is either a single character
|
|
444 or a range of characters, as used with @code{put-char-table}. The syntax
|
|
445 is changed only for @var{syntax-table}, which defaults to the current
|
|
446 buffer's syntax table, and not in any other syntax table. The argument
|
|
447 @var{syntax-descriptor} specifies the desired syntax; this is a string
|
|
448 beginning with a class designator character, and optionally containing a
|
|
449 matching character and flags as well. @xref{Syntax Descriptors}.
|
|
450
|
|
451 This function always returns @code{nil}. The old syntax information in
|
|
452 the table for @var{char-range} is discarded.
|
|
453
|
|
An error is signaled if the first character of the syntax descriptor is
not one of the syntax class designator characters (@pxref{Syntax Class
Table}).
|
|
456
|
|
457 @example
|
|
458 @group
|
|
459 @exdent @r{Examples:}
|
|
460
|
|
;; @r{Put the space character in class whitespace.}
(modify-syntax-entry ?\  "-")
|
|
463 @result{} nil
|
|
464 @end group
|
|
465
|
|
466 @group
|
|
467 ;; @r{Make @samp{$} an open parenthesis character,}
|
|
468 ;; @r{with @samp{^} as its matching close.}
|
|
469 (modify-syntax-entry ?$ "(^")
|
|
470 @result{} nil
|
|
471 @end group
|
|
472
|
|
473 @group
|
|
474 ;; @r{Make @samp{^} a close parenthesis character,}
|
|
475 ;; @r{with @samp{$} as its matching open.}
|
|
476 (modify-syntax-entry ?^ ")$")
|
|
477 @result{} nil
|
|
478 @end group
|
|
479
|
|
480 @group
|
|
481 ;; @r{Make @samp{/} a punctuation character,}
|
|
482 ;; @r{the first character of a start-comment sequence,}
|
|
483 ;; @r{and the second character of an end-comment sequence.}
|
|
484 ;; @r{This is used in C mode.}
|
|
485 (modify-syntax-entry ?/ ". 14")
|
|
486 @result{} nil
|
|
487 @end group
|
|
488 @end example
|
|
489 @end deffn
|
|
490
|
|
491 @defun char-syntax character &optional syntax-table
|
|
492 This function returns the syntax class of @var{character}, represented
|
|
493 by its mnemonic designator character. This @emph{only} returns the
|
|
494 class, not any matching parenthesis or flags.
|
|
495
|
|
496 An error is signaled if @var{character} is not a character.
|
|
497
|
|
498 The characters that correspond to various syntax codes
|
|
499 are listed in the documentation of @code{modify-syntax-entry}.
|
|
500
|
|
501 Optional second argument @var{syntax-table} is the syntax table to be
|
|
502 used, and defaults to the current buffer's syntax table.
|
|
503
|
|
504 The following examples apply to C mode. The first example shows that
|
|
505 the syntax class of space is whitespace (represented by a space). The
|
|
506 second example shows that the syntax of @samp{/} is punctuation. This
|
|
507 does not show the fact that it is also part of comment-start and -end
|
|
508 sequences. The third example shows that open parenthesis is in the class
|
|
509 of open parentheses. This does not show the fact that it has a matching
|
|
510 character, @samp{)}.
|
|
511
|
|
512 @example
|
|
513 @group
|
|
514 (char-to-string (char-syntax ?\ ))
|
|
515 @result{} " "
|
|
516 @end group
|
|
517
|
|
518 @group
|
|
519 (char-to-string (char-syntax ?/))
|
|
520 @result{} "."
|
|
521 @end group
|
|
522
|
|
523 @group
|
|
524 (char-to-string (char-syntax ?\())
|
|
525 @result{} "("
|
|
526 @end group
|
|
527 @end example
|
|
528 @end defun
|
|
529
|
|
530 @defun set-syntax-table syntax-table &optional buffer
|
|
531 This function makes @var{syntax-table} the syntax table for @var{buffer}, which
|
|
532 defaults to the current buffer if omitted. It returns @var{syntax-table}.
|
|
533 @end defun
|
|
534
|
|
535 @defun syntax-table &optional buffer
|
|
536 This function returns the syntax table for @var{buffer}, which defaults
|
|
537 to the current buffer if omitted.
|
|
538 @end defun
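
These two functions are often used together to switch syntax tables
temporarily. The following is a sketch under stated assumptions: the
function name @code{my-forward-lisp-word} is hypothetical, and the
example assumes that the variable @code{emacs-lisp-mode-syntax-table}
is defined.

@example
@group
(defun my-forward-lisp-word ()
  "Move forward one word using Emacs Lisp syntax (hypothetical helper)."
  (let ((old-table (syntax-table)))
    (unwind-protect
        (progn
          (set-syntax-table emacs-lisp-mode-syntax-table)
          (forward-word 1))
      ;; @r{Restore the buffer's own table even if an error occurs.}
      (set-syntax-table old-table))))
@end group
@end example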
|
|
539
|
|
540 @node Motion and Syntax
|
|
541 @section Motion and Syntax
|
|
542
|
|
543 This section describes functions for moving across characters in
|
|
544 certain syntax classes. None of these functions exists in Emacs
|
|
545 version 18 or earlier.
|
|
546
|
|
547 @defun skip-syntax-forward syntaxes &optional limit buffer
|
|
548 This function moves point forward across characters having syntax classes
|
|
549 mentioned in @var{syntaxes}. It stops when it encounters the end of
|
|
550 the buffer, or position @var{limit} (if specified), or a character it is
|
|
551 not supposed to skip. Optional argument @var{buffer} defaults to the
|
|
552 current buffer if omitted.
|
|
553 @ignore @c may want to change this.
|
|
554 The return value is the distance traveled, which is a nonnegative
|
|
555 integer.
|
|
556 @end ignore
|
|
557 @end defun
|
|
558
|
|
559 @defun skip-syntax-backward syntaxes &optional limit buffer
|
|
560 This function moves point backward across characters whose syntax
|
|
561 classes are mentioned in @var{syntaxes}. It stops when it encounters
|
|
562 the beginning of the buffer, or position @var{limit} (if specified), or a
|
|
563 character it is not supposed to skip. Optional argument @var{buffer}
|
|
564 defaults to the current buffer if omitted.
|
|
565
|
|
566 @ignore @c may want to change this.
|
|
567 The return value indicates the distance traveled. It is an integer that
|
|
568 is zero or less.
|
|
569 @end ignore
|
|
570 @end defun
|
|
571
|
|
572 @defun backward-prefix-chars &optional buffer
|
|
573 This function moves point backward over any number of characters with
|
|
574 expression prefix syntax. This includes both characters in the
|
|
575 expression prefix syntax class, and characters with the @samp{p} flag.
|
|
576 Optional argument @var{buffer} defaults to the current buffer if
|
|
577 omitted.
|
|
578 @end defun
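
As an illustration, the following sketch returns the run of word and
symbol constituents that follows point, without moving point. The
function name @code{my-symbol-text-after-point} is hypothetical.

@example
@group
(defun my-symbol-text-after-point ()
  "Return the word/symbol constituents after point (hypothetical helper)."
  (save-excursion
    (let ((start (point)))
      ;; @r{Skip characters in the word or symbol constituent classes.}
      (skip-syntax-forward "w_")
      (buffer-substring start (point)))))
@end group
@end example

If point is not followed by such a character, point does not move and
the result is the empty string.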
|
|
579
|
|
580 @node Parsing Expressions
|
|
581 @section Parsing Balanced Expressions
|
|
582
|
|
583 Here are several functions for parsing and scanning balanced
|
|
584 expressions, also known as @dfn{sexps}, in which parentheses match in
|
|
585 pairs. The syntax table controls the interpretation of characters, so
|
|
586 these functions can be used for Lisp expressions when in Lisp mode and
|
|
587 for C expressions when in C mode. @xref{List Motion}, for convenient
|
|
588 higher-level functions for moving over balanced expressions.
|
|
589
|
|
590 @defun parse-partial-sexp start limit &optional target-depth stop-before state stop-comment buffer
|
|
591 This function parses a sexp in the current buffer starting at
|
|
592 @var{start}, not scanning past @var{limit}. It stops at position
|
|
593 @var{limit} or when certain criteria described below are met, and sets
|
|
594 point to the location where parsing stops. It returns a value
|
|
595 describing the status of the parse at the point where it stops.
|
|
596
|
|
597 If @var{state} is @code{nil}, @var{start} is assumed to be at the top
|
|
598 level of parenthesis structure, such as the beginning of a function
|
|
599 definition. Alternatively, you might wish to resume parsing in the
|
|
600 middle of the structure. To do this, you must provide a @var{state}
|
|
601 argument that describes the initial status of parsing.
|
|
602
|
|
603 @cindex parenthesis depth
|
|
604 If the third argument @var{target-depth} is non-@code{nil}, parsing
|
|
605 stops if the depth in parentheses becomes equal to @var{target-depth}.
|
|
606 The depth starts at 0, or at whatever is given in @var{state}.
|
|
607
|
|
608 If the fourth argument @var{stop-before} is non-@code{nil}, parsing
|
|
609 stops when it comes to any character that starts a sexp. If
|
|
610 @var{stop-comment} is non-@code{nil}, parsing stops when it comes to the
|
|
611 start of a comment.
|
|
612
|
|
613 @cindex parse state
|
|
614 The fifth argument @var{state} is an eight-element list of the same
|
|
615 form as the value of this function, described below. The return value
|
|
616 of one call may be used to initialize the state of the parse on another
|
|
617 call to @code{parse-partial-sexp}.
|
|
618
|
|
619 The result is a list of eight elements describing the final state of
|
|
620 the parse:
|
|
621
|
|
622 @enumerate 0
|
|
623 @item
|
|
624 The depth in parentheses, counting from 0.
|
|
625
|
|
626 @item
|
|
627 @cindex innermost containing parentheses
|
|
628 The character position of the start of the innermost parenthetical
|
|
629 grouping containing the stopping point; @code{nil} if none.
|
|
630
|
|
631 @item
|
|
632 @cindex previous complete subexpression
|
|
633 The character position of the start of the last complete subexpression
|
|
634 terminated; @code{nil} if none.
|
|
635
|
|
636 @item
|
|
637 @cindex inside string
|
|
638 Non-@code{nil} if inside a string. More precisely, this is the
|
|
639 character that will terminate the string.
|
|
640
|
|
641 @item
|
|
642 @cindex inside comment
|
|
643 @code{t} if inside a comment (of either style).
|
|
644
|
|
645 @item
|
|
646 @cindex quote character
|
|
647 @code{t} if point is just after a quote character.
|
|
648
|
|
649 @item
|
|
650 The minimum parenthesis depth encountered during this scan.
|
|
651
|
|
652 @item
|
|
653 @code{t} if inside a comment of style ``b''.
|
|
654 @end enumerate
|
|
655
|
|
656 Elements 0, 3, 4, 5 and 7 are significant in the argument @var{state}.
|
|
657
|
|
658 @cindex indenting with parentheses
|
|
659 This function is most often used to compute indentation for languages
|
|
660 that have nested parentheses.
|
|
661 @end defun
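
A common use of the returned state is to test whether point lies inside
a string or a comment. The following is a sketch only: the function
name @code{my-in-string-or-comment-p} is hypothetical, and it assumes
that the beginning of the accessible portion of the buffer is at the
top level of parenthesis structure.

@example
@group
(defun my-in-string-or-comment-p ()
  "Return non-nil if point is inside a string or a comment.
Hypothetical helper; parses from the start of the accessible portion."
  (let ((state (save-excursion
                 ;; @r{Parsing moves point, hence the @code{save-excursion}.}
                 (parse-partial-sexp (point-min) (point)))))
    (or (nth 3 state)      ; @r{element 3: inside a string}
        (nth 4 state))))   ; @r{element 4: inside a comment}
@end group
@end example

Parsing from the beginning of the buffer on every call can be slow in
large buffers; passing a saved @var{state} lets a caller resume from a
later position instead.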

@defun scan-lists from count depth &optional buffer noerror
This function scans forward @var{count} balanced parenthetical groupings
from character position @var{from}. If @var{count} is negative, the
scan moves backward. It returns the character position where the scan
stops.

If @var{depth} is nonzero, parenthesis depth counting begins from that
value. The only candidates for stopping are places where the depth in
parentheses becomes zero; @code{scan-lists} counts @var{count} such
places and then stops. Thus, a positive value for @var{depth} means go
out @var{depth} levels of parenthesis.

Scanning ignores comments if @code{parse-sexp-ignore-comments} is
non-@code{nil}.

If the scan reaches the beginning or end of the buffer (or its
accessible portion), and the depth is not zero, an error is signaled.
If the depth is zero but the count is not used up, @code{nil} is
returned.

If optional arg @var{buffer} is non-@code{nil}, scanning occurs in that
buffer instead of in the current buffer.

If optional arg @var{noerror} is non-@code{nil}, @code{scan-lists}
returns @code{nil} instead of signaling an error.
@end defun
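
As an illustration of the @var{depth} argument, here is a minimal
sketch that finds the end of the list containing a given position. The
helper name @code{my-end-of-enclosing-list} is hypothetical, and the
sketch passes only the three required arguments, so it scans the
current buffer and lets errors propagate.

@example
(defun my-end-of-enclosing-list (pos)
  "Return the position just after the list that contains POS.
Hypothetical helper.  Signals an error if the scan reaches the end
of the buffer before the depth returns to zero."
  ;; Starting at depth 1, the depth first becomes zero just after
  ;; the close parenthesis of the enclosing list.
  (scan-lists pos 1 1))
@end example

With a @var{depth} of 0 instead, the same call would stop after the
next complete list at the current level.
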

@defun scan-sexps from count &optional buffer noerror
This function scans forward @var{count} sexps from character position
@var{from}. If @var{count} is negative, the scan moves backward. It
returns the character position where the scan stops.

Scanning ignores comments if @code{parse-sexp-ignore-comments} is
non-@code{nil}.

If the scan reaches the beginning or end of (the accessible part of) the
buffer in the middle of a parenthetical grouping, an error is signaled.
If it reaches the beginning or end between groupings but before
@var{count} is used up, @code{nil} is returned.

If optional arg @var{buffer} is non-@code{nil}, scanning occurs in
that buffer instead of in the current buffer.

If optional arg @var{noerror} is non-@code{nil}, @code{scan-sexps}
returns @code{nil} instead of signaling an error.
@end defun
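
The error and @code{nil} cases described above can be folded into a
single ``no sexp here'' result. The sketch below is one way to do so;
the helper name @code{my-sexp-end-or-nil} is hypothetical, and it uses
@code{condition-case} rather than the @var{noerror} argument so that it
relies only on the two required arguments.

@example
(defun my-sexp-end-or-nil (pos)
  "Return the end of the first sexp at or after POS, or nil.
Hypothetical helper; nil means the scan failed or ran out of text."
  (condition-case nil
      (scan-sexps pos 1)
    (error nil)))
@end example
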

@defvar parse-sexp-ignore-comments
@cindex skipping comments
If the value is non-@code{nil}, then comments are treated as
whitespace by the functions in this section and by @code{forward-sexp}.

In older Emacs versions, this feature worked only when the comment
terminator was something like @samp{*/}, and appeared only to end a
comment. In languages where newlines terminate comments, it was
necessary to make this variable @code{nil}, since not every newline is
the end of a comment. This limitation no longer exists.
@end defvar

You can use @code{forward-comment} to move forward or backward over
one comment or several comments.

@defun forward-comment &optional count buffer
This function moves point forward across @var{count} comments (backward,
if @var{count} is negative). If it finds anything other than a comment
or whitespace, it stops, leaving point at the place where it stopped.
It also stops after satisfying @var{count}. @var{count} defaults to @code{1}.

Optional argument @var{buffer} defaults to the current buffer.
@end defun

To move forward over all comments and whitespace following point, use
@code{(forward-comment (buffer-size))}. @code{(buffer-size)} is a good
argument to use, because the number of comments in the buffer cannot
exceed that many.
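
Wrapped in a function, that idiom might look like the following
sketch. The helper name @code{my-skip-comments-forward} is
hypothetical; it operates on the current buffer and relies on the
current syntax table to recognize comments.

@example
(defun my-skip-comments-forward ()
  "Move point past all comments and whitespace after point.
Hypothetical helper; returns the new value of point."
  ;; The buffer cannot contain more comments than characters, so
  ;; this count is always large enough.
  (forward-comment (buffer-size))
  (point))
@end example
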

@node Standard Syntax Tables
@section Some Standard Syntax Tables

Most of the major modes in XEmacs have their own syntax tables. Here
are several of them:

@defun standard-syntax-table
This function returns the standard syntax table, which is the syntax
table used in Fundamental mode.
@end defun

@defvar text-mode-syntax-table
The value of this variable is the syntax table used in Text mode.
@end defvar

@defvar c-mode-syntax-table
The value of this variable is the syntax table for C-mode buffers.
@end defvar

@defvar emacs-lisp-mode-syntax-table
The value of this variable is the syntax table used in Emacs Lisp mode
by editing commands. (It has no effect on the Lisp @code{read}
function.)
@end defvar
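
To see how one of these tables classifies a character, you can install
it temporarily and call @code{char-syntax}. The following is a minimal
sketch; the helper name @code{my-char-syntax-in-table} is hypothetical,
and it restores the previous syntax table even if an error occurs.

@example
(defun my-char-syntax-in-table (char table)
  "Return the syntax class designator of CHAR under TABLE.
Hypothetical helper; the current syntax table is restored afterward."
  (let ((old (syntax-table)))
    (unwind-protect
        (progn
          (set-syntax-table table)
          (char-syntax char))
      (set-syntax-table old))))

;; In the standard syntax table, `(' has open parenthesis syntax,
;; whose designator is `('.
(my-char-syntax-in-table ?\( (standard-syntax-table))
@end example
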

@node Syntax Table Internals
@section Syntax Table Internals
@cindex syntax table internals

Each element of a syntax table is an integer that encodes the syntax
of one character: the syntax class, possible matching character, and
flags. Lisp programs don't usually work with the elements directly; the
Lisp-level syntax table functions usually work with syntax descriptors
(@pxref{Syntax Descriptors}).

The low 8 bits of each element of a syntax table indicate the
syntax class.

@table @asis
@item @i{Integer}
@i{Class}
@item 0
whitespace
@item 1
punctuation
@item 2
word
@item 3
symbol
@item 4
open parenthesis
@item 5
close parenthesis
@item 6
expression prefix
@item 7
string quote
@item 8
paired delimiter
@item 9
escape
@item 10
character quote
@item 11
comment-start
@item 12
comment-end
@item 13
inherit
@end table

The next 8 bits are the matching opposite parenthesis (if the
character has parenthesis syntax); otherwise, they are not meaningful.
The next 6 bits are the flags.
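
The bit layout just described can be unpacked with ordinary integer
arithmetic. The sketch below only illustrates that layout: the helper
name @code{my-decode-syntax-code} is hypothetical, and it works on a
plain integer that you supply rather than reading an element out of a
real syntax table.

@example
(defun my-decode-syntax-code (code)
  "Split integer CODE into a list (CLASS MATCH FLAGS).
Hypothetical helper that follows the bit layout described above."
  (list (logand code 255)             ; low 8 bits: syntax class
        (logand (ash code -8) 255)    ; next 8 bits: matching character
        (logand (ash code -16) 63)))  ; next 6 bits: flags

;; Class 4 (open parenthesis) with matching character code 41,
;; which is `)', and no flags.
(my-decode-syntax-code (+ 4 (* 256 41)))
     @result{} (4 41 0)
@end example
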