comparison man/lispref/searching.texi @ 316:512e409c26a2 r21-0b56

Import from CVS: tag r21-0b56
author cvs
date Mon, 13 Aug 2007 10:44:46 +0200
parents 341dac730539
children a4f53d9b3154
315:5e87bc5b1ee4 316:512e409c26a2
275 @c Did perl get this from somewhere? What's the real history of *? ? 275 @c Did perl get this from somewhere? What's the real history of *? ?
276 276
277 This construct is very useful for when you want to match the text inside 277 This construct is very useful for when you want to match the text inside
278 a pair of delimiters. For instance, @samp{/\*.*?\*/} will match C 278 a pair of delimiters. For instance, @samp{/\*.*?\*/} will match C
279 comments in a string. This could not be so elegantly achieved without 279 comments in a string. This could not be so elegantly achieved without
280 the use of a nongreedy quantifier. 280 the use of a non-greedy quantifier.
281 281
282 This construct has not been available prior to XEmacs 20.4. It is not 282 This construct has not been available prior to XEmacs 20.4. It is not
283 available in FSF Emacs. 283 available in FSF Emacs.
284 284
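As a minimal sketch of the difference, the following compares the
non-greedy and greedy forms on a string that holds two comments. The
function name @code{my-first-c-comment} is hypothetical and exists only
for this illustration. It assumes a regexp engine that supports
@samp{*?}, and since @samp{.} does not match a newline it only handles
comments that fit on one line.

@example
@group
;; Hypothetical helper, for illustration only.
(defun my-first-c-comment (text greedy)
  "Return the first C comment matched in TEXT, or nil.
If GREEDY is non-nil, use `.*' instead of the non-greedy `.*?'."
  (let ((regexp (if greedy "/\\*.*\\*/" "/\\*.*?\\*/")))
    (when (string-match regexp text)
      (match-string 0 text))))

;; Non-greedy: stops at the first closing delimiter.
(my-first-c-comment "a /* one */ b /* two */ c" nil)
;; returns "/* one */"

;; Greedy: runs on to the last closing delimiter.
(my-first-c-comment "a /* one */ b /* two */ c" t)
;; returns "/* one */ b /* two */"
@end group
@end example
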
285 @item +? 285 @item +?
463 substring to be recorded for future reference. 463 substring to be recorded for future reference.
464 464
465 This is useful when you need to use a lot of nested grouping @samp{\( 465 This is useful when you need to use a lot of nested grouping @samp{\(
466 @dots{} \)} constructs to express complex alternation, but only want to 466 @dots{} \)} constructs to express complex alternation, but only want to
467 memoize, or capture, one or two of the subexpression matches. Since 467 memoize, or capture, one or two of the subexpression matches. Since
468 @samp{\(?: @dots{} \)} doesn't capture a submatch, it also doesn't need 468 @samp{\(?: @dots{} \)} doesn't capture a sub-match, it also doesn't need
469 to be counted when you count @samp{\( @dots{} \)} groups to figure the 469 to be counted when you count @samp{\( @dots{} \)} groups to figure the
470 @samp{match-string} index. That turns out to be a very convenient 470 @samp{match-string} index. That turns out to be a very convenient
471 characteristic. 471 characteristic.
472 472
473 This situtation occurs where parts of a regular expression have been 473 This situation occurs where parts of a regular expression have been
474 automatically generated by a program that builds them from lists of 474 automatically generated by a program that builds them from lists of
475 strings, and the static code following the matching operation must 475 strings, and the static code following the matching operation must
476 access a specific match number. Here's an example that shows this: 476 access a specific match number. Here's an example that shows this.
477 477
478 @example 478 We will assume that @code{(require 'regexp-opt)} has been executed
479 @group 479 already, to ensure that @file{regexp-opt.el}, which is part of the
480 ;; Assume that: 480 @code{xemacs-devel} package, is loaded.
481 (require 'regexp-opt) ;; gets executed at toplevel 481 @ifinfo
482 ;;; `regexp-opt.el' is part of the "xemacs-devel" package. 482 Please evaluate that @code{require} expression now, using @kbd{C-x C-e},
483 ;; ... and that VARNAMES is a list of strings holding the name of some 483 if you intend to try the following example.
484 ;; variables extracted from the program source you are editting and 484 @end ifinfo
485 ;; running this function on. For this example, it will just be bound 485 In a real program, let's pretend that @var{varnames} would be a list of
486 ;; in the let* expression. 486 strings holding the names of some variables extracted somehow from the
487 text of a program source you are editing and running this function on.
488 For the purposes of this illustration, we can just bind it in the
489 @code{let*} expression.
490
491 @example
492 @group
487 (let* ((varnames '("k" "n" "i" "j" "varname")) 493 (let* ((varnames '("k" "n" "i" "j" "varname"))
488 (keys-regexp (regexp-opt 494 (keys-regexp (regexp-opt
489 (mapcar #'symbol-name 495 (mapcar #'symbol-name
490 '(if then else elif 496 '(if then else elif
491 case in of do while 497 case in of do while
492 with for next unless 498 with for next unless
493 cond begin end)))) 499 cond begin end))))
494 (varname-regexp (regexp-opt varnames)) 500 (varname-regexp (regexp-opt varnames))
495 (contrived-regexp (concat "\\(" keys-regexp "\\)" 501 (contrived-regexp (concat "\\(" keys-regexp "\\)"
496 "\\s-(\\s-\\(" 502 "\\s-(\\s-\\("
497 varname-regexp 503 varname-regexp
498 "\\)\\s-)")) 504 "\\)\\s-)"))
499 (keyname "") 505 (keyname "")
500 (varname "")) 506 (varname ""))
501 ;; In the body of this particular defun, we: 507 ;; @r{In the body of this particular defun, we:}
502 (re-search-forward contrived-regexp nil t) 508 (re-search-forward contrived-regexp nil t)
503 ;; ... and it finds a match. Now we want to extract the text that 509 ;; @r{@dots{} and it finds a match. Now we want to extract the}
504 ;; it matched on, and save it into KEYNAME and VARNAME. 510 ;; @r{text that it matched on, and save it into @code{keyname}}
511 ;; @r{and @code{varname}.}
505 (setq keyname (match-string 1) 512 (setq keyname (match-string 1)
506 varname (match-string 2)) 513 varname (match-string 2))
507 ;; ... and then do something with those values. 514 ;; @r{@dots{} and then do something with those values.}
508 (list keyname varname)) 515 (list keyname varname))
509 516
510 ;; Here's something for it to match, so you can try it with `C-x C-e'. 517 ;; @r{Here's something for it to match, so you can try it with}
518 ;; @kbd{C-x C-e}
511 ;; while ( j ) do ... 519 ;; while ( j ) do ...
512 @end group 520 @end group
513 @end example 521 @end example
514 522
515 Here you can see that if the regular expression returned by 523 Here you should see that if the regular expression returned by
516 @samp{regexp-opt} did not use @samp{\(?: @dots{} \)} for grouping, and 524 @code{regexp-opt} did not use @samp{\(?: @dots{} \)} for grouping, and
517 instead used @samp{\( @dots{} \)}, it would be necessary to count the 525 instead used @samp{\( @dots{} \)}, it would be necessary to count the
518 number of opening parentheses in the @samp{keys-regexp} and to use that 526 number of opening parentheses in the @code{keys-regexp} and to use that
519 figure to calculate which match number is matched by the 527 figure to calculate which match number is matched by the
520 @code{varname-regexp}. It is much more convienient to be able to just 528 @code{varname-regexp}. It is much more convenient to be able to just
521 ask for the second match string. 529 ask for the second match string.
522 530
523 @c This is used to good advantage by the font-locking code.... 531 @c This is used to good advantage by the font-locking code....
524 @c ... It will be. It's not yet, but will be. 532 @c ... It will be. It's not yet, but will be.
525 533
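The effect on match numbering can be seen in a smaller sketch that does
not depend on @code{regexp-opt}. The helper @code{my-keyword-and-var} is
hypothetical, and the two keyword regexps are written by hand here to
stand in for generated ones; the only difference between the two calls
is whether the inner group is @samp{\(?: @dots{} \)} or
@samp{\( @dots{} \)}.

@example
@group
;; Hypothetical helper, for illustration only.
(defun my-keyword-and-var (keys-regexp text)
  "Match KEYS-REGEXP followed by a parenthesized name in TEXT.
Return a list of match strings 1 and 2, or nil if there is no match."
  (when (string-match (concat "\\(" keys-regexp "\\)"
                              "\\s-(\\s-\\([a-z]+\\)\\s-)")
                      text)
    (list (match-string 1 text) (match-string 2 text))))

;; Non-capturing inner group: the name is match number 2.
(my-keyword-and-var "\\(?:if\\|while\\)" "while ( j ) do")
;; returns ("while" "j")

;; Capturing inner group: match number 2 is now the inner keyword
;; group, and the name has moved to match number 3.
(my-keyword-and-var "\\(if\\|while\\)" "while ( j ) do")
;; returns ("while" "while")
@end group
@end example
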
1143 This function returns the position of the start of text matched by the 1151 This function returns the position of the start of text matched by the
1144 last regular expression searched for, or a subexpression of it. 1152 last regular expression searched for, or a subexpression of it.
1145 1153
1146 If @var{count} is zero, then the value is the position of the start of 1154 If @var{count} is zero, then the value is the position of the start of
1147 the entire match. Otherwise, @var{count} specifies a subexpression in 1155 the entire match. Otherwise, @var{count} specifies a subexpression in
1148 the regular expresion, and the value of the function is the starting 1156 the regular expression, and the value of the function is the starting
1149 position of the match for that subexpression. 1157 position of the match for that subexpression.
1150 1158
1151 The value is @code{nil} for a subexpression inside a @samp{\|} 1159 The value is @code{nil} for a subexpression inside a @samp{\|}
1152 alternative that wasn't used in the match. 1160 alternative that wasn't used in the match.
1153 @end defun 1161 @end defun
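
A short sketch of the cases described above, assuming the match is made
against a string with @code{string-match}, so that positions are
zero-based string indices. The helper name @code{my-match-starts} is
hypothetical.

@example
@group
;; Hypothetical helper, for illustration only.
(defun my-match-starts (text)
  "Return the start positions of the whole match and of groups 1 and 2.
Return nil if TEXT contains neither alternative."
  (when (string-match "\\(foo\\)\\|\\(baz\\)" text)
    (list (match-beginning 0)
          (match-beginning 1)
          (match-beginning 2))))

;; The first alternative matched, so group 2 has no position.
(my-match-starts "foo")
;; returns (0 0 nil)

;; The second alternative matched, so group 1 has no position.
(my-match-starts "xx baz")
;; returns (3 nil 3)
@end group
@end example
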