|
1 @c -*-texinfo-*-
|
|
2 @c This is part of the XEmacs Lisp Reference Manual.
|
|
3 @c Copyright (C) 1990, 1991, 1992, 1993, 1994 Free Software Foundation, Inc.
|
|
4 @c See the file lispref.texi for copying conditions.
|
|
5 @setfilename ../../info/syntax.info
|
|
6 @node Syntax Tables, Abbrevs, Searching and Matching, Top
|
|
7 @chapter Syntax Tables
|
|
8 @cindex parsing
|
|
9 @cindex syntax table
|
|
10 @cindex text parsing
|
|
11
|
|
12 A @dfn{syntax table} specifies the syntactic textual function of each
|
|
13 character. This information is used by the parsing commands, the
|
|
14 complex movement commands, and others to determine where words, symbols,
|
|
15 and other syntactic constructs begin and end. The current syntax table
|
|
16 controls the meaning of the word motion functions (@pxref{Word Motion})
|
|
17 and the list motion functions (@pxref{List Motion}) as well as the
|
|
18 functions in this chapter.
|
|
19
|
|
20 @menu
|
|
21 * Basics: Syntax Basics. Basic concepts of syntax tables.
|
|
22 * Desc: Syntax Descriptors. How characters are classified.
|
|
23 * Syntax Table Functions:: How to create, examine and alter syntax tables.
|
|
24 * Motion and Syntax:: Moving over characters with certain syntaxes.
|
|
25 * Parsing Expressions:: Parsing balanced expressions
|
|
26 using the syntax table.
|
|
27 * Standard Syntax Tables:: Syntax tables used by various major modes.
|
|
28 * Syntax Table Internals:: How syntax table information is stored.
|
|
29 @end menu
|
|
30
|
|
31 @node Syntax Basics
|
|
32 @section Syntax Table Concepts
|
|
33
|
|
34 @ifinfo
|
|
35 A @dfn{syntax table} provides Emacs with the information that
|
|
36 determines the syntactic use of each character in a buffer. This
|
|
37 information is used by the parsing commands, the complex movement
|
|
38 commands, and others to determine where words, symbols, and other
|
|
39 syntactic constructs begin and end. The current syntax table controls
|
|
40 the meaning of the word motion functions (@pxref{Word Motion}) and the
|
|
41 list motion functions (@pxref{List Motion}) as well as the functions in
|
|
42 this chapter.
|
|
43 @end ifinfo
|
|
44
|
|
45 Under XEmacs 20, a syntax table is a particular subtype of the
|
|
46 primitive char table type (@pxref{Char Tables}), and each element of the
|
|
47 char table is an integer that encodes the syntax of the character in
|
|
48 question, or a cons of such an integer and a matching character (for
|
|
49 characters with parenthesis syntax).
|
|
50
|
|
51 Under XEmacs 19, a syntax table is a vector of 256 elements; it
|
|
52 contains one entry for each of the 256 possible characters in an 8-bit
|
|
53 byte. Each element is an integer that encodes the syntax of the
|
|
54 character in question. (The matching character, if any, is embedded
|
|
55 in the bits of this integer.)
|
|
56
|
|
57 Syntax tables are used only for moving across text, not for the Emacs
|
|
58 Lisp reader. XEmacs Lisp uses built-in syntactic rules when reading Lisp
|
|
59 expressions, and these rules cannot be changed.
|
|
60
|
|
61 Each buffer has its own major mode, and each major mode has its own
|
|
62 idea of the syntactic class of various characters. For example, in Lisp
|
|
63 mode, the character @samp{;} begins a comment, but in C mode, it
|
|
64 terminates a statement. To support these variations, XEmacs makes the
|
|
65 choice of syntax table local to each buffer. Typically, each major
|
|
66 mode has its own syntax table and installs that table in each buffer
|
|
67 that uses that mode. Changing this table alters the syntax in all
|
|
68 those buffers as well as in any buffers subsequently put in that mode.
|
|
69 Occasionally several similar modes share one syntax table.
|
|
70 @xref{Example Major Modes}, for an example of how to set up a syntax
|
|
71 table.
|
|
72
|
|
73 A syntax table can inherit the data for some characters from the
|
|
74 standard syntax table, while specifying other characters itself. The
|
|
75 ``inherit'' syntax class means ``inherit this character's syntax from
|
|
76 the standard syntax table.'' Most major modes' syntax tables inherit
|
|
77 the syntax of character codes 0 through 31 and 128 through 255. This is
|
|
78 useful with character sets such as ISO Latin-1 that have additional
|
|
79 alphabetic characters in the range 128 to 255. Just changing the
|
|
80 standard syntax for these characters affects all major modes.
|
|
81
|
|
82 @defun syntax-table-p object
|
|
83 This function returns @code{t} if @var{object} is a vector of 256
|
|
84 elements. This means that the vector may be a syntax table. However,
|
|
85 according to this test, any vector of length 256 is considered to be a
|
|
86 syntax table, no matter what its contents.
|
|
87 @end defun
|
|
88
|
|
89 @node Syntax Descriptors
|
|
90 @section Syntax Descriptors
|
|
91 @cindex syntax classes
|
|
92
|
|
93 This section describes the syntax classes and flags that denote the
|
|
94 syntax of a character, and how they are represented as a @dfn{syntax
|
|
95 descriptor}, which is a Lisp string that you pass to
|
|
96 @code{modify-syntax-entry} to specify the desired syntax.
|
|
97
|
|
98 XEmacs defines a number of @dfn{syntax classes}. Each syntax table
|
|
99 puts each character into one class. There is no necessary relationship
|
|
100 between the class of a character in one syntax table and its class in
|
|
101 any other table.
|
|
102
|
|
103 Each class is designated by a mnemonic character, which serves as the
|
|
104 name of the class when you need to specify a class. Usually the
|
|
105 designator character is one that is frequently in that class; however,
|
|
106 its meaning as a designator is unvarying and independent of what syntax
|
|
107 that character currently has.
|
|
108
|
|
109 @cindex syntax descriptor
|
|
110 A syntax descriptor is a Lisp string that specifies a syntax class, a
|
|
111 matching character (used only for the parenthesis classes) and flags.
|
|
112 The first character is the designator for a syntax class. The second
|
|
113 character is the character to match; if it is unused, put a space there.
|
|
114 Then come the characters for any desired flags. If no matching
|
|
115 character or flags are needed, one character is sufficient.
|
|
116
|
|
117 For example, the descriptor for the character @samp{*} in C mode is
|
|
118 @samp{@w{. 23}} (i.e., punctuation, matching character slot unused,
|
|
119 second character of a comment-starter, first character of a
|
|
120 comment-ender), and the entry for @samp{/} is @samp{@w{. 14}} (i.e.,
|
|
121 punctuation, matching character slot unused, first character of a
|
|
122 comment-starter, second character of a comment-ender).
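As a rough illustration of this layout, the following sketch splits a
descriptor string into its three parts.  The helper
@code{my-split-syntax-descriptor} is hypothetical (it is not part of
XEmacs) and assumes that the descriptor is a non-empty string.

@example
@group
;; Hypothetical helper, for illustration only.
;; Returns a list (CLASS MATCH FLAGS) for a non-empty DESCRIPTOR.
(defun my-split-syntax-descriptor (descriptor)
  (let ((len (length descriptor)))
    (list
     ;; First character: the class designator.
     (aref descriptor 0)
     ;; Second character: the matching character, or nil if the
     ;; slot is absent or holds a space.
     (and (> len 1)
          (not (eq (aref descriptor 1) ?\ ))
          (aref descriptor 1))
     ;; Remaining characters: the flags, possibly empty.
     (if (> len 2) (substring descriptor 2) ""))))
@end group
@end example

Applied to @samp{@w{. 23}}, this helper yields the punctuation
designator, no matching character, and the flag string @samp{23}.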
|
|
123
|
|
124 @menu
|
|
125 * Syntax Class Table:: Table of syntax classes.
|
|
126 * Syntax Flags:: Additional flags each character can have.
|
|
127 @end menu
|
|
128
|
|
129 @node Syntax Class Table
|
|
130 @subsection Table of Syntax Classes
|
|
131
|
|
132 Here is a table of syntax classes, the characters that stand for them,
|
|
133 their meanings, and examples of their use.
|
|
134
|
|
135 @deffn {Syntax class} @w{whitespace character}
|
|
136 @dfn{Whitespace characters} (designated with @w{@samp{@ }} or @samp{-})
|
|
137 separate symbols and words from each other. Typically, whitespace
|
|
138 characters have no other syntactic significance, and multiple whitespace
|
|
139 characters are syntactically equivalent to a single one. Space, tab,
|
|
140 newline and formfeed are almost always classified as whitespace.
|
|
141 @end deffn
|
|
142
|
|
143 @deffn {Syntax class} @w{word constituent}
|
|
144 @dfn{Word constituents} (designated with @samp{w}) are parts of normal
|
|
145 English words and are typically used in variable and command names in
|
|
146 programs. All upper- and lower-case letters, and the digits, are typically
|
|
147 word constituents.
|
|
148 @end deffn
|
|
149
|
|
150 @deffn {Syntax class} @w{symbol constituent}
|
|
151 @dfn{Symbol constituents} (designated with @samp{_}) are the extra
|
|
152 characters that are used in variable and command names along with word
|
|
153 constituents. For example, the symbol constituents class is used in
|
|
154 Lisp mode to indicate that certain characters may be part of symbol
|
|
155 names even though they are not part of English words. These characters
|
|
156 are @samp{$&*+-_<>}. In standard C, the only non-word-constituent
|
|
157 character that is valid in symbols is underscore (@samp{_}).
|
|
158 @end deffn
|
|
159
|
|
160 @deffn {Syntax class} @w{punctuation character}
|
|
161 @dfn{Punctuation characters} (@samp{.}) are those characters that are
|
|
162 used as punctuation in English, or are used in some way in a programming
|
|
163 language to separate symbols from one another. Most programming
|
|
164 language modes, including Emacs Lisp mode, have no characters in this
|
|
165 class since the few characters that are not symbol or word constituents
|
|
166 all have other uses.
|
|
167 @end deffn
|
|
168
|
|
169 @deffn {Syntax class} @w{open parenthesis character}
|
|
170 @deffnx {Syntax class} @w{close parenthesis character}
|
|
171 @cindex parenthesis syntax
|
|
172 Open and close @dfn{parenthesis characters} are characters used in
|
|
173 dissimilar pairs to surround sentences or expressions. Such a grouping
|
|
174 is begun with an open parenthesis character and terminated with a close.
|
|
175 Each open parenthesis character matches a particular close parenthesis
|
|
176 character, and vice versa. Normally, XEmacs indicates momentarily the
|
|
177 matching open parenthesis when you insert a close parenthesis.
|
|
178 @xref{Blinking}.
|
|
179
|
|
180 The class of open parentheses is designated with @samp{(}, and that of
|
|
181 close parentheses with @samp{)}.
|
|
182
|
|
183 In English text, and in C code, the parenthesis pairs are @samp{()},
|
|
184 @samp{[]}, and @samp{@{@}}. In XEmacs Lisp, the delimiters for lists and
|
|
185 vectors (@samp{()} and @samp{[]}) are classified as parenthesis
|
|
186 characters.
|
|
187 @end deffn
|
|
188
|
|
189 @deffn {Syntax class} @w{string quote}
|
|
190 @dfn{String quote characters} (designated with @samp{"}) are used in
|
|
191 many languages, including Lisp and C, to delimit string constants. The
|
|
192 same string quote character appears at the beginning and the end of a
|
|
193 string. Such quoted strings do not nest.
|
|
194
|
|
195 The parsing facilities of XEmacs consider a string as a single token.
|
|
196 The usual syntactic meanings of the characters in the string are
|
|
197 suppressed.
|
|
198
|
|
199 The Lisp modes have two string quote characters: double-quote (@samp{"})
|
|
200 and vertical bar (@samp{|}). @samp{|} is not used in XEmacs Lisp, but it
|
|
201 is used in Common Lisp. C also has two string quote characters:
|
|
202 double-quote for strings, and single-quote (@samp{'}) for character
|
|
203 constants.
|
|
204
|
|
205 English text has no string quote characters because English is not a
|
|
206 programming language. Although quotation marks are used in English,
|
|
207 we do not want them to turn off the usual syntactic properties of
|
|
208 other characters in the quotation.
|
|
209 @end deffn
|
|
210
|
|
211 @deffn {Syntax class} @w{escape}
|
|
212 An @dfn{escape character} (designated with @samp{\}) starts an escape
|
|
213 sequence such as is used in C string and character constants. The
|
|
214 character @samp{\} belongs to this class in both C and Lisp. (In C, it
|
|
215 is used thus only inside strings, but it turns out to cause no trouble
|
|
216 to treat it this way throughout C code.)
|
|
217
|
|
218 Characters in this class count as part of words if
|
|
219 @code{words-include-escapes} is non-@code{nil}. @xref{Word Motion}.
|
|
220 @end deffn
|
|
221
|
|
222 @deffn {Syntax class} @w{character quote}
|
|
223 A @dfn{character quote character} (designated with @samp{/}) quotes the
|
|
224 following character so that it loses its normal syntactic meaning. This
|
|
225 differs from an escape character in that only the character immediately
|
|
226 following is ever affected.
|
|
227
|
|
228 Characters in this class count as part of words if
|
|
229 @code{words-include-escapes} is non-@code{nil}. @xref{Word Motion}.
|
|
230
|
|
231 This class is used for backslash in @TeX{} mode.
|
|
232 @end deffn
|
|
233
|
|
234 @deffn {Syntax class} @w{paired delimiter}
|
|
235 @dfn{Paired delimiter characters} (designated with @samp{$}) are like
|
|
236 string quote characters except that the syntactic properties of the
|
|
237 characters between the delimiters are not suppressed. Only @TeX{} mode
|
|
238 uses a paired delimiter presently---the @samp{$} that both enters and
|
|
239 leaves math mode.
|
|
240 @end deffn
|
|
241
|
|
242 @deffn {Syntax class} @w{expression prefix}
|
|
243 An @dfn{expression prefix operator} (designated with @samp{'}) is used
|
|
244 for syntactic operators that are part of an expression if they appear
|
|
245 next to one. These characters in Lisp include the apostrophe, @samp{'}
|
|
246 (used for quoting), the comma, @samp{,} (used in macros), and @samp{#}
|
|
247 (used in the read syntax for certain data types).
|
|
248 @end deffn
|
|
249
|
|
250 @deffn {Syntax class} @w{comment starter}
|
|
251 @deffnx {Syntax class} @w{comment ender}
|
|
252 @cindex comment syntax
|
|
253 The @dfn{comment starter} and @dfn{comment ender} characters are used in
|
|
254 various languages to delimit comments. These classes are designated
|
|
255 with @samp{<} and @samp{>}, respectively.
|
|
256
|
|
257 English text has no comment characters. In Lisp, the semicolon
|
|
258 (@samp{;}) starts a comment and a newline or formfeed ends one.
|
|
259 @end deffn
|
|
260
|
|
261 @deffn {Syntax class} @w{inherit}
|
|
262 This syntax class does not specify a syntax. It says to look in the
|
|
263 standard syntax table to find the syntax of this character. The
|
|
264 designator for this syntax code is @samp{@@}.
|
|
265 @end deffn
|
|
266
|
|
267 @node Syntax Flags
|
|
268 @subsection Syntax Flags
|
|
269 @cindex syntax flags
|
|
270
|
|
271 In addition to the classes, entries for characters in a syntax table
|
|
272 can include flags. There are six possible flags, represented by the
|
|
273 characters @samp{1}, @samp{2}, @samp{3}, @samp{4}, @samp{b} and
|
|
274 @samp{p}.
|
|
275
|
|
276 All the flags except @samp{p} are used to describe multi-character
|
|
277 comment delimiters. The digit flags indicate that a character can
|
|
278 @emph{also} be part of a comment sequence, in addition to the syntactic
|
|
279 properties associated with its character class. The flags are
|
|
280 independent of the class and each other for the sake of characters such
|
|
281 as @samp{*} in C mode, which is a punctuation character, @emph{and} the
|
|
282 second character of a start-of-comment sequence (@samp{/*}), @emph{and}
|
|
283 the first character of an end-of-comment sequence (@samp{*/}).
|
|
284
|
|
285 The flags for a character @var{c} are:
|
|
286
|
|
287 @itemize @bullet
|
|
288 @item
|
|
289 @samp{1} means @var{c} is the start of a two-character comment-start
|
|
290 sequence.
|
|
291
|
|
292 @item
|
|
293 @samp{2} means @var{c} is the second character of such a sequence.
|
|
294
|
|
295 @item
|
|
296 @samp{3} means @var{c} is the start of a two-character comment-end
|
|
297 sequence.
|
|
298
|
|
299 @item
|
|
300 @samp{4} means @var{c} is the second character of such a sequence.
|
|
301
|
|
302 @item
|
|
303 @c Emacs 19 feature
|
|
304 @samp{b} means that @var{c} as a comment delimiter belongs to the
|
|
305 alternative ``b'' comment style.
|
|
306
|
|
307 Emacs supports two comment styles simultaneously in any one syntax
|
|
308 table. This is for the sake of C++. Each style of comment syntax has
|
|
309 its own comment-start sequence and its own comment-end sequence. Each
|
|
310 comment must stick to one style or the other; thus, if it starts with
|
|
311 the comment-start sequence of style ``b'', it must also end with the
|
|
312 comment-end sequence of style ``b''.
|
|
313
|
|
314 The two comment-start sequences must begin with the same character; only
|
|
315 the second character may differ. Mark the second character of the
|
|
316 ``b''-style comment-start sequence with the @samp{b} flag.
|
|
317
|
|
318 A comment-end sequence (one or two characters) applies to the ``b''
|
|
319 style if its first character has the @samp{b} flag set; otherwise, it
|
|
320 applies to the ``a'' style.
|
|
321
|
|
322 The appropriate comment syntax settings for C++ are as follows:
|
|
323
|
|
324 @table @asis
|
|
325 @item @samp{/}
|
|
326 @samp{124b}
|
|
327 @item @samp{*}
|
|
328 @samp{23}
|
|
329 @item newline
|
|
330 @samp{>b}
|
|
331 @end table
|
|
332
|
|
333 This defines four comment-delimiting sequences:
|
|
334
|
|
335 @table @asis
|
|
336 @item @samp{/*}
|
|
337 This is a comment-start sequence for ``a'' style because the
|
|
338 second character, @samp{*}, does not have the @samp{b} flag.
|
|
339
|
|
340 @item @samp{//}
|
|
341 This is a comment-start sequence for ``b'' style because the second
|
|
342 character, @samp{/}, does have the @samp{b} flag.
|
|
343
|
|
344 @item @samp{*/}
|
|
345 This is a comment-end sequence for ``a'' style because the first
|
|
346 character, @samp{*}, does not have the @samp{b} flag.
|
|
347
|
|
348 @item newline
|
|
349 This is a comment-end sequence for ``b'' style, because the newline
|
|
350 character has the @samp{b} flag.
|
|
351 @end table
|
|
352
|
|
353 @item
|
|
354 @c Emacs 19 feature
|
|
355 @samp{p} identifies an additional ``prefix character'' for Lisp syntax.
|
|
356 These characters are treated as whitespace when they appear between
|
|
357 expressions. When they appear within an expression, they are handled
|
|
358 according to their usual syntax codes.
|
|
359
|
|
360 The function @code{backward-prefix-chars} moves back over these
|
|
361 characters, as well as over characters whose primary syntax class is
|
|
362 prefix (@samp{'}). @xref{Motion and Syntax}.
|
|
363 @end itemize
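The C++ settings listed above might be installed along the following
lines.  This is only a sketch that transcribes those descriptors.  The
variable @code{my-c++-like-syntax-table} is hypothetical, and the
punctuation class used for @samp{/} and @samp{*} is an assumption
carried over from the C mode example earlier in this section.  The
exact handling of comment-style flags may differ between Emacs
versions.

@example
@group
;; Hypothetical table, for illustration only.
(defvar my-c++-like-syntax-table
  (let ((table (make-syntax-table)))
    ;; `/' is punctuation; it starts both comment styles (flag 1),
    ;; is the second character of `//' (flags 2 and b),
    ;; and is the second character of `*/' (flag 4).
    (modify-syntax-entry ?/ ". 124b" table)
    ;; `*' is the second character of `/*' and the first of `*/'.
    (modify-syntax-entry ?* ". 23" table)
    ;; Newline ends a comment of style b.
    (modify-syntax-entry ?\n "> b" table)
    table)
  "Syntax table sketching C++-style comment syntax.")
@end group
@end example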
|
|
364
|
|
365 @node Syntax Table Functions
|
|
366 @section Syntax Table Functions
|
|
367
|
|
368 In this section we describe functions for creating, accessing and
|
|
369 altering syntax tables.
|
|
370
|
|
371 @defun make-syntax-table &optional oldtable
|
|
372 This function creates a new syntax table. Character codes 0 through
|
|
373 31 and 128 through 255 are set up to inherit from the standard syntax
|
|
374 table. The other character codes are set up by copying what the
|
|
375 standard syntax table says about them.
|
|
376
|
|
377 Most major mode syntax tables are created in this way.
|
|
378 @end defun
|
|
379
|
|
380 @defun copy-syntax-table &optional syntax-table
|
|
381 This function constructs a copy of @var{syntax-table} and returns it.
|
|
382 If @var{syntax-table} is not supplied (or is @code{nil}), it returns a
|
|
383 copy of the current syntax table. Otherwise, an error is signaled if
|
|
384 @var{syntax-table} is not a syntax table.
|
|
385 @end defun
|
|
386
|
|
387 @deffn Command modify-syntax-entry char-range syntax-descriptor &optional syntax-table
|
|
388 This function sets the syntax entry for @var{char-range} according to
|
|
389 @var{syntax-descriptor}. @var{char-range} is either a single character
|
|
390 or a range of characters, as used with @code{put-char-table}. The syntax
|
|
391 is changed only for @var{syntax-table}, which defaults to the current
|
|
392 buffer's syntax table, and not in any other syntax table. The argument
|
|
393 @var{syntax-descriptor} specifies the desired syntax; this is a string
|
|
394 beginning with a class designator character, and optionally containing a
|
|
395 matching character and flags as well. @xref{Syntax Descriptors}.
|
|
396
|
|
397 This function always returns @code{nil}. The old syntax information in
|
|
398 the table for @var{char-range} is discarded.
|
|
399
|
|
400 An error is signaled if the first character of the syntax descriptor is not
|
|
401 one of the syntax class designator characters (@pxref{Syntax Class Table}).
|
|
402
|
|
@example
@group
@exdent @r{Examples:}

;; @r{Put the space character in class whitespace.}
(modify-syntax-entry ?\  " ")
     @result{} nil
@end group

@group
;; @r{Make @samp{$} an open parenthesis character,}
;;   @r{with @samp{^} as its matching close.}
(modify-syntax-entry ?$ "(^")
     @result{} nil
@end group

@group
;; @r{Make @samp{^} a close parenthesis character,}
;;   @r{with @samp{$} as its matching open.}
(modify-syntax-entry ?^ ")$")
     @result{} nil
@end group

@group
;; @r{Make @samp{/} a punctuation character,}
;;   @r{the first character of a start-comment sequence,}
;;   @r{and the second character of an end-comment sequence.}
;;   @r{This is used in C mode.}
(modify-syntax-entry ?/ ". 14")
     @result{} nil
@end group
@end example
|
|
435 @end deffn
|
|
436
|
|
437 @defun char-syntax character &optional syntax-table
|
|
438 This function returns the syntax class of @var{character}, represented
|
|
439 by its mnemonic designator character. This @emph{only} returns the
|
|
440 class, not any matching parenthesis or flags.
|
|
441
|
|
442 An error is signaled if @var{character} is not a character.
|
|
443
|
|
444 The characters that correspond to various syntax codes
|
|
445 are listed in the documentation of @code{modify-syntax-entry}.
|
|
446
|
|
447 Optional second argument @var{syntax-table} is the syntax table to be
|
|
448 used, and defaults to the current buffer's syntax table.
|
|
449
|
|
450 The following examples apply to C mode. The first example shows that
|
|
451 the syntax class of space is whitespace (represented by a space). The
|
|
452 second example shows that the syntax of @samp{/} is punctuation. This
|
|
453 does not show the fact that it is also part of comment-start and -end
|
|
454 sequences. The third example shows that open parenthesis is in the class
|
|
455 of open parentheses. This does not show the fact that it has a matching
|
|
456 character, @samp{)}.
|
|
457
|
|
@example
@group
(char-to-string (char-syntax ?\ ))
     @result{} " "
@end group

@group
(char-to-string (char-syntax ?/))
     @result{} "."
@end group

@group
(char-to-string (char-syntax ?\())
     @result{} "("
@end group
@end example
|
|
474 @end defun
|
|
475
|
|
476 @defun set-syntax-table syntax-table &optional buffer
|
|
477 This function makes @var{syntax-table} the syntax table for @var{buffer}, which
|
|
478 defaults to the current buffer if omitted. It returns @var{syntax-table}.
|
|
479 @end defun
|
|
480
|
|
481 @defun syntax-table &optional buffer
|
|
482 This function returns the syntax table for @var{buffer}, which defaults
|
|
483 to the current buffer if omitted.
|
|
484 @end defun
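A major mode typically combines these functions along the following
lines.  The names @code{my-mode-syntax-table} and
@code{my-mode-install-syntax} are hypothetical, as is the choice of
@samp{#} as a comment starter; this is a sketch, not the definition of
any existing mode.

@example
@group
;; Hypothetical names, for illustration only.
(defvar my-mode-syntax-table
  (let ((table (make-syntax-table)))
    ;; Underscore may appear in symbol names.
    (modify-syntax-entry ?_ "_" table)
    ;; `#' starts a comment and newline ends it.
    (modify-syntax-entry ?# "<" table)
    (modify-syntax-entry ?\n ">" table)
    table)
  "Syntax table for the hypothetical My mode.")
@end group

@group
(defun my-mode-install-syntax ()
  "Make `my-mode-syntax-table' the current buffer's syntax table."
  (set-syntax-table my-mode-syntax-table))
@end group
@end example

After @code{my-mode-install-syntax} runs in a buffer,
@code{(syntax-table)} called in that buffer should return
@code{my-mode-syntax-table}.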
|
|
485
|
|
486 @node Motion and Syntax
|
|
487 @section Motion and Syntax
|
|
488
|
|
489 This section describes functions for moving across characters in
|
|
490 certain syntax classes. None of these functions exists in Emacs
|
|
491 version 18 or earlier.
|
|
492
|
|
493 @defun skip-syntax-forward syntaxes &optional limit buffer
|
|
494 This function moves point forward across characters having syntax classes
|
|
495 mentioned in @var{syntaxes}. It stops when it encounters the end of
|
|
496 the buffer, or position @var{limit} (if specified), or a character it is
|
|
497 not supposed to skip. Optional argument @var{buffer} defaults to the
|
|
498 current buffer if omitted.
|
|
499 @ignore @c may want to change this.
|
|
500 The return value is the distance traveled, which is a nonnegative
|
|
501 integer.
|
|
502 @end ignore
|
|
503 @end defun
|
|
504
|
|
505 @defun skip-syntax-backward syntaxes &optional limit buffer
|
|
506 This function moves point backward across characters whose syntax
|
|
507 classes are mentioned in @var{syntaxes}. It stops when it encounters
|
|
508 the beginning of the buffer, or position @var{limit} (if specified), or a
|
|
509 character it is not supposed to skip. Optional argument @var{buffer}
|
|
510 defaults to the current buffer if omitted.
|
|
511
|
|
512 @ignore @c may want to change this.
|
|
513 The return value indicates the distance traveled. It is an integer that
|
|
514 is zero or less.
|
|
515 @end ignore
|
|
516 @end defun
|
|
517
|
|
518 @defun backward-prefix-chars &optional buffer
|
|
519 This function moves point backward over any number of characters with
|
|
520 expression prefix syntax. This includes both characters in the
|
|
521 expression prefix syntax class, and characters with the @samp{p} flag.
|
|
522 Optional argument @var{buffer} defaults to the current buffer if
|
|
523 omitted.
|
|
524 @end defun
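For instance, @code{skip-syntax-forward} can be called twice to find
the extent of the next word or symbol.  The function name
@code{my-next-symbol-bounds} is hypothetical, and the sketch assumes
that the current syntax table classifies the characters of interest as
word or symbol constituents.

@example
@group
;; Hypothetical helper, for illustration only.
(defun my-next-symbol-bounds ()
  "Return (START . END) for the next run of word or symbol characters.
Whitespace after point is skipped first.  Point is not moved."
  (save-excursion
    ;; Skip over characters in the whitespace class.
    (skip-syntax-forward " ")
    (let ((start (point)))
      ;; Skip over word and symbol constituents.
      (skip-syntax-forward "w_")
      (cons start (point)))))
@end group
@end example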
|
|
525
|
|
526 @node Parsing Expressions
|
|
527 @section Parsing Balanced Expressions
|
|
528
|
|
529 Here are several functions for parsing and scanning balanced
|
|
530 expressions, also known as @dfn{sexps}, in which parentheses match in
|
|
531 pairs. The syntax table controls the interpretation of characters, so
|
|
532 these functions can be used for Lisp expressions when in Lisp mode and
|
|
533 for C expressions when in C mode. @xref{List Motion}, for convenient
|
|
534 higher-level functions for moving over balanced expressions.
|
|
535
|
|
536 @defun parse-partial-sexp start limit &optional target-depth stop-before state stop-comment buffer
|
|
537 This function parses a sexp in the current buffer starting at
|
|
538 @var{start}, not scanning past @var{limit}. It stops at position
|
|
539 @var{limit} or when certain criteria described below are met, and sets
|
|
540 point to the location where parsing stops. It returns a value
|
|
541 describing the status of the parse at the point where it stops.
|
|
542
|
|
543 If @var{state} is @code{nil}, @var{start} is assumed to be at the top
|
|
544 level of parenthesis structure, such as the beginning of a function
|
|
545 definition. Alternatively, you might wish to resume parsing in the
|
|
546 middle of the structure. To do this, you must provide a @var{state}
|
|
547 argument that describes the initial status of parsing.
|
|
548
|
|
549 @cindex parenthesis depth
|
|
550 If the third argument @var{target-depth} is non-@code{nil}, parsing
|
|
551 stops if the depth in parentheses becomes equal to @var{target-depth}.
|
|
552 The depth starts at 0, or at whatever is given in @var{state}.
|
|
553
|
|
554 If the fourth argument @var{stop-before} is non-@code{nil}, parsing
|
|
555 stops when it comes to any character that starts a sexp. If
|
|
556 @var{stop-comment} is non-@code{nil}, parsing stops when it comes to the
|
|
557 start of a comment.
|
|
558
|
|
559 @cindex parse state
|
|
560 The fifth argument @var{state} is an eight-element list of the same
|
|
561 form as the value of this function, described below. The return value
|
|
562 of one call may be used to initialize the state of the parse on another
|
|
563 call to @code{parse-partial-sexp}.
|
|
564
|
|
565 The result is a list of eight elements describing the final state of
|
|
566 the parse:
|
|
567
|
|
568 @enumerate 0
|
|
569 @item
|
|
570 The depth in parentheses, counting from 0.
|
|
571
|
|
572 @item
|
|
573 @cindex innermost containing parentheses
|
|
574 The character position of the start of the innermost parenthetical
|
|
575 grouping containing the stopping point; @code{nil} if none.
|
|
576
|
|
577 @item
|
|
578 @cindex previous complete subexpression
|
|
579 The character position of the start of the last complete subexpression
|
|
580 terminated; @code{nil} if none.
|
|
581
|
|
582 @item
|
|
583 @cindex inside string
|
|
584 Non-@code{nil} if inside a string. More precisely, this is the
|
|
585 character that will terminate the string.
|
|
586
|
|
587 @item
|
|
588 @cindex inside comment
|
|
589 @code{t} if inside a comment (of either style).
|
|
590
|
|
591 @item
|
|
592 @cindex quote character
|
|
593 @code{t} if point is just after a quote character.
|
|
594
|
|
595 @item
|
|
596 The minimum parenthesis depth encountered during this scan.
|
|
597
|
|
598 @item
|
|
599 @code{t} if inside a comment of style ``b''.
|
|
600 @end enumerate
|
|
601
|
|
602 Elements 0, 3, 4, 5 and 7 are significant in the argument @var{state}.
|
|
603
|
|
604 @cindex indenting with parentheses
|
|
605 This function is most often used to compute indentation for languages
|
|
606 that have nested parentheses.
|
|
607 @end defun
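As a small sketch of how the returned state can be used, the following
function tests elements 3 and 4 of the result to decide whether a
position lies inside a string or a comment.  The name
@code{my-in-string-or-comment-p} is hypothetical, and the sketch
assumes that the beginning of the accessible portion of the buffer is
at the top level of parenthesis structure.

@example
@group
;; Hypothetical helper, for illustration only.
(defun my-in-string-or-comment-p (pos)
  "Return non-nil if POS is inside a string or a comment.
Parsing starts at the beginning of the accessible portion."
  ;; `parse-partial-sexp' moves point, so preserve it.
  (save-excursion
    (let ((state (parse-partial-sexp (point-min) pos)))
      ;; Element 3: inside a string.  Element 4: inside a comment.
      (or (nth 3 state) (nth 4 state)))))
@end group
@end example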
|
|
608
|
|
609 @defun scan-lists from count depth &optional buffer noerror
|
|
610 This function scans forward @var{count} balanced parenthetical groupings
|
|
611 from character number @var{from}. It returns the character position
|
|
612 where the scan stops.
|
|
613
|
|
614 If @var{depth} is nonzero, parenthesis depth counting begins from that
|
|
615 value. The only candidates for stopping are places where the depth in
|
|
616 parentheses becomes zero; @code{scan-lists} counts @var{count} such
|
|
617 places and then stops. Thus, a positive value for @var{depth} means go
|
|
618 out @var{depth} levels of parenthesis.
|
|
619
|
|
620 Scanning ignores comments if @code{parse-sexp-ignore-comments} is
|
|
621 non-@code{nil}.
|
|
622
|
|
623 If the scan reaches the beginning or end of the buffer (or its
|
|
624 accessible portion), and the depth is not zero, an error is signaled.
|
|
625 If the depth is zero but the count is not used up, @code{nil} is
|
|
626 returned.
|
|
627
|
|
628 If optional arg @var{buffer} is non-@code{nil}, scanning occurs in that
|
|
629 buffer instead of in the current buffer.
|
|
630
|
|
631 If optional arg @var{noerror} is non-@code{nil}, @code{scan-lists}
|
|
632 will return @code{nil} instead of signaling an error.
|
|
633 @end defun
|
|
634
|
|
635 @defun scan-sexps from count &optional buffer noerror
|
|
636 This function scans forward @var{count} sexps from character position
|
|
637 @var{from}. It returns the character position where the scan stops.
|
|
638
|
|
639 Scanning ignores comments if @code{parse-sexp-ignore-comments} is
|
|
640 non-@code{nil}.
|
|
641
|
|
642 If the scan reaches the beginning or end of (the accessible part of) the
|
|
643 buffer in the middle of a parenthetical grouping, an error is signaled.
|
|
644 If it reaches the beginning or end between groupings but before count is
|
|
645 used up, @code{nil} is returned.
|
|
646
|
|
647 If optional arg @var{buffer} is non-@code{nil}, scanning occurs in
|
|
648 that buffer instead of in the current buffer.
|
|
649
|
|
650 If optional arg @var{noerror} is non-@code{nil}, @code{scan-sexps}
|
|
651 will return @code{nil} instead of signaling an error.
|
|
652 @end defun
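The two scanning functions can be wrapped as follows.  Both function
names are hypothetical; the sketch relies only on the argument
conventions described above and, since @var{noerror} is omitted, lets
any error from an unbalanced scan propagate to the caller.

@example
@group
;; Hypothetical helpers, for illustration only.
(defun my-end-of-sexp-at (pos)
  "Return the position just after the sexp that starts at POS."
  (scan-sexps pos 1))
@end group

@group
(defun my-end-of-enclosing-list (pos)
  "Return the position just after the list that contains POS."
  ;; A DEPTH of 1 makes the scan stop after moving out one level.
  (scan-lists pos 1 1))
@end group
@end example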
|
|
653
|
|
654 @defvar parse-sexp-ignore-comments
|
|
655 @cindex skipping comments
|
|
656 If the value is non-@code{nil}, then comments are treated as
|
|
657 whitespace by the functions in this section and by @code{forward-sexp}.
|
|
658
|
|
659 In older Emacs versions, this feature worked only when the comment
|
|
660 terminator is something like @samp{*/}, and appears only to end a
|
|
661 comment. In languages where newlines terminate comments, it was
|
|
662 necessary to make this variable @code{nil}, since not every newline is the
|
|
663 end of a comment. This limitation no longer exists.
|
|
664 @end defvar
|
|
665
|
|
666 You can use @code{forward-comment} to move forward or backward over
|
|
667 one comment or several comments.
|
|
668
|
|
669 @defun forward-comment &optional count buffer
|
|
670 This function moves point forward across @var{count} comments (backward,
|
|
671 if @var{count} is negative). If it finds anything other than a comment
|
|
672 or whitespace, it stops, leaving point at the place where it stopped.
|
|
673 It also stops after satisfying @var{count}. @var{count} defaults to @code{1}.
|
|
674
|
|
675 Optional argument @var{buffer} defaults to the current buffer.
|
|
676 @end defun
|
|
677
|
|
678 To move forward over all comments and whitespace following point, use
|
|
679 @code{(forward-comment (buffer-size))}. @code{(buffer-size)} is a good
|
|
680 argument to use, because the number of comments in the buffer cannot
|
|
681 exceed that many.
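This idiom can be packaged as a small command-like helper.  The
function name @code{my-skip-comments-and-whitespace} is hypothetical;
the sketch assumes the current syntax table defines the comment
delimiters of the language being edited.

@example
@group
;; Hypothetical helper, for illustration only.
(defun my-skip-comments-and-whitespace ()
  "Move point past any comments and whitespace that follow it.
Return the new value of point."
  ;; The buffer cannot hold more comments than it has characters.
  (forward-comment (buffer-size))
  (point))
@end group
@end example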
|
|
682
|
|
683 @node Standard Syntax Tables
|
|
684 @section Some Standard Syntax Tables
|
|
685
|
|
686 Most of the major modes in XEmacs have their own syntax tables. Here
|
|
687 are several of them:
|
|
688
|
|
689 @defun standard-syntax-table
|
|
690 This function returns the standard syntax table, which is the syntax
|
|
691 table used in Fundamental mode.
|
|
692 @end defun
|
|
693
|
|
694 @defvar text-mode-syntax-table
|
|
695 The value of this variable is the syntax table used in Text mode.
|
|
696 @end defvar
|
|
697
|
|
698 @defvar c-mode-syntax-table
|
|
699 The value of this variable is the syntax table for C-mode buffers.
|
|
700 @end defvar
|
|
701
|
|
702 @defvar emacs-lisp-mode-syntax-table
|
|
703 The value of this variable is the syntax table used in Emacs Lisp mode
|
|
704 by editing commands. (It has no effect on the Lisp @code{read}
|
|
705 function.)
|
|
706 @end defvar
|
|
707
|
|
708 @node Syntax Table Internals
|
|
709 @section Syntax Table Internals
|
|
710 @cindex syntax table internals
|
|
711
|
|
712 Each element of a syntax table is an integer that encodes the syntax
|
|
713 of one character: the syntax class, possible matching character, and
|
|
714 flags. Lisp programs don't usually work with the elements directly; the
|
|
715 Lisp-level syntax table functions usually work with syntax descriptors
|
|
716 (@pxref{Syntax Descriptors}).
|
|
717
|
|
718 The low 8 bits of each element of a syntax table indicate the
|
|
719 syntax class.
|
|
720
|
|
721 @table @asis
|
|
722 @item @i{Integer}
|
|
723 @i{Class}
|
|
724 @item 0
|
|
725 whitespace
|
|
726 @item 1
|
|
727 punctuation
|
|
728 @item 2
|
|
729 word
|
|
730 @item 3
|
|
731 symbol
|
|
732 @item 4
|
|
733 open parenthesis
|
|
734 @item 5
|
|
735 close parenthesis
|
|
736 @item 6
|
|
737 expression prefix
|
|
738 @item 7
|
|
739 string quote
|
|
740 @item 8
|
|
741 paired delimiter
|
|
742 @item 9
|
|
743 escape
|
|
744 @item 10
|
|
745 character quote
|
|
746 @item 11
|
|
747 comment-start
|
|
748 @item 12
|
|
749 comment-end
|
|
750 @item 13
|
|
751 inherit
|
|
752 @end table
|
|
753
|
|
754 The next 8 bits are the matching opposite parenthesis (if the
|
|
755 character has parenthesis syntax); otherwise, they are not meaningful.
|
|
756 The next 6 bits are the flags.
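The layout just described can be modeled with a few bit operations.
The helper @code{my-decode-syntax-entry} below is hypothetical and is
not an XEmacs function; it only follows the field widths given above
and applies to entries stored as plain integers, not to the cons form
used for parenthesis characters under XEmacs 20.

@example
@group
;; Hypothetical helper, for illustration only.
(defun my-decode-syntax-entry (entry)
  "Split the integer ENTRY into a list (CLASS MATCH FLAGS)."
  (list
   ;; Low 8 bits: the syntax class code.
   (logand entry 255)
   ;; Next 8 bits: the matching character code, if any.
   (logand (ash entry -8) 255)
   ;; Next 6 bits: the flags.
   (logand (ash entry -16) 63)))
@end group
@end example

For example, an entry with class code 4 (open parenthesis), matching
character code 41 (the character @samp{)}) and no flags would be the
integer 4 + 41 * 256 = 10500, which this helper splits back into 4, 41
and 0.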
|