*pattern.txt*   For Vim version 8.2.  Last change: 2021 Jul 16


                  VIM REFERENCE MANUAL    by Bram Moolenaar


Patterns and search commands                            *pattern-searches*

The very basics can be found in section |03.9| of the user manual.  A few more
explanations are in chapter 27 |usr_27.txt|.

1. Search commands              |search-commands|
2. The definition of a pattern  |search-pattern|
3. Magic                        |/magic|
4. Overview of pattern items    |pattern-overview|
5. Multi items                  |pattern-multi-items|
6. Ordinary atoms               |pattern-atoms|
7. Ignoring case in a pattern   |/ignorecase|
8. Composing characters         |patterns-composing|
9. Compare with Perl patterns   |perl-patterns|
10. Highlighting matches        |match-highlight|
11. Fuzzy matching              |fuzzy-match|

==============================================================================
1. Search commands                              *search-commands*

                                                        */*
/{pattern}[/]<CR>       Search forward for the [count]'th occurrence of
                        {pattern} |exclusive|.

/{pattern}/{offset}<CR> Search forward for the [count]'th occurrence of
                        {pattern} and go |{offset}| lines up or down.
                        |linewise|.

                                                        */<CR>*
/<CR>                   Search forward for the [count]'th occurrence of the
                        latest used pattern |last-pattern| with latest used
                        |{offset}|.

//{offset}<CR>          Search forward for the [count]'th occurrence of the
                        latest used pattern |last-pattern| with new
                        |{offset}|.  If {offset} is empty no offset is used.

                                                        *?*
?{pattern}[?]<CR>       Search backward for the [count]'th previous
                        occurrence of {pattern} |exclusive|.

?{pattern}?{offset}<CR> Search backward for the [count]'th previous
                        occurrence of {pattern} and go |{offset}| lines up or
                        down |linewise|.

                                                        *?<CR>*
?<CR>                   Search backward for the [count]'th occurrence of the
                        latest used pattern |last-pattern| with latest used
                        |{offset}|.

??{offset}<CR>          Search backward for the [count]'th occurrence of the
                        latest used pattern |last-pattern| with new
                        |{offset}|.  If {offset} is empty no offset is used.

                                                        *n*
n                       Repeat the latest "/" or "?" [count] times.
                        If the cursor doesn't move the search is repeated with
                        count + 1.
                        |last-pattern|

                                                        *N*
N                       Repeat the latest "/" or "?" [count] times in
                        opposite direction.
                        |last-pattern|

                                                *star* *E348* *E349*
*                       Search forward for the [count]'th occurrence of the
                        word nearest to the cursor.  The word used for the
                        search is the first of:
                                1. the keyword under the cursor |'iskeyword'|
                                2. the first keyword after the cursor, in the
                                   current line
                                3. the non-blank word under the cursor
                                4. the first non-blank word after the cursor,
                                   in the current line
                        Only whole keywords are searched for, like with the
                        command "/\<keyword\>".  |exclusive|
                        'ignorecase' is used, 'smartcase' is not.

                                                        *#*
#                       Same as "*", but search backward.  The pound sign
                        (character 163) also works.  If the "#" key works as
                        backspace, try using "stty erase <BS>" before starting
                        Vim (<BS> is CTRL-H or a real backspace).

                                                        *gstar*
g*                      Like "*", but don't put "\<" and "\>" around the word.
                        This makes the search also find matches that are not a
                        whole word.

                                                        *g#*
g#                      Like "#", but don't put "\<" and "\>" around the word.
                        This makes the search also find matches that are not a
                        whole word.

                                                        *gd*
gd                      Goto local Declaration.  When the cursor is on a local
                        variable, this command will jump to its declaration.
                        First Vim searches for the start of the current
                        function, just like "[[".  If it is not found the
                        search stops in line 1.  If it is found, Vim goes back
                        until a blank line is found.  From this position Vim
                        searches for the keyword under the cursor, like with
                        "*", but lines that look like a comment are ignored
                        (see 'comments' option).
                        Note that this is not guaranteed to work, Vim does not
                        really check the syntax, it only searches for a match
                        with the keyword.  If included files also need to be
                        searched use the commands listed in |include-search|.
                        After this command |n| searches forward for the next
                        match (not backward).

                                                        *gD*
gD                      Goto global Declaration.  When the cursor is on a
                        global variable that is defined in the file, this
                        command will jump to its declaration.  This works just
                        like "gd", except that the search for the keyword
                        always starts in line 1.

                                                        *1gd*
1gd                     Like "gd", but ignore matches inside a {} block that
                        ends before the cursor position.

                                                        *1gD*
1gD                     Like "gD", but ignore matches inside a {} block that
                        ends before the cursor position.

                                                        *CTRL-C*
CTRL-C                  Interrupt current (search) command.  Use CTRL-Break on
                        MS-Windows |dos-CTRL-Break|.
                        In Normal mode, any pending command is aborted.

                                                        *:noh* *:nohlsearch*
:noh[lsearch]           Stop the highlighting for the 'hlsearch' option.  It
                        is automatically turned back on when using a search
                        command, or setting the 'hlsearch' option.
                        This command doesn't work in an autocommand, because
                        the highlighting state is saved and restored when
                        executing autocommands |autocmd-searchpat|.
                        The same applies when invoking a user function.

While typing the search pattern the current match will be shown if the
'incsearch' option is on.  Remember that you still have to finish the search
command with <CR> to actually position the cursor at the displayed match.  Or
use <Esc> to abandon the search.

All matches for the last used search pattern will be highlighted if you set
the 'hlsearch' option.  This can be suspended with the |:nohlsearch| command.

When 'shortmess' does not include the "S" flag, Vim will automatically show
the index of the match the cursor is on.  This can look like this: >

  [1/5]         Cursor is on first of 5 matches.
  [1/>99]       Cursor is on first of more than 99 matches.
  [>99/>99]     Cursor is after the 99th match of more than 99 matches.
  [?/??]        Unknown how many matches exist, generating the
                statistics was aborted because of search timeout.

Note: the count does not take offset into account.

When no match is found you get the error:               *E486* Pattern not found
Note that for the |:global| command this behaves like a normal message, for Vi
compatibility.  For the |:s| command the "e" flag can be used to avoid the
error message |:s_flags|.

                                        *search-offset* *{offset}*
These commands search for the specified pattern.  With "/" and "?" an
additional offset may be given.  There are two types of offsets: line offsets
and character offsets.

The offset gives the cursor position relative to the found match:
    [num]       [num] lines downwards, in column 1
    +[num]      [num] lines downwards, in column 1
    -[num]      [num] lines upwards, in column 1
    e[+num]     [num] characters to the right of the end of the match
    e[-num]     [num] characters to the left of the end of the match
    s[+num]     [num] characters to the right of the start of the match
    s[-num]     [num] characters to the left of the start of the match
    b[+num]     [num] identical to s[+num] above (mnemonic: begin)
    b[-num]     [num] identical to s[-num] above (mnemonic: begin)
    ;{pattern}  perform another search, see |//;|

If a '-' or '+' is given but [num] is omitted, a count of one will be used.
When including an offset with 'e', the search becomes inclusive (the
character the cursor lands on is included in operations).

Examples:

pattern                 cursor position ~
/test/+1                one line below "test", in column 1
/test/e                 on the last t of "test"
/test/s+2               on the 's' of "test"
/test/b-3               three characters before "test"

If one of these commands is used after an operator, the characters between
the cursor position before and after the search are affected.  However, if a
line offset is given, the whole lines between the two cursor positions are
affected.

An example of how to search for matches with a pattern and change the match
with another word: >
        /foo<CR>        find "foo"
        c//e<CR>        change until end of match
        bar<Esc>        type replacement
        //<CR>          go to start of next match
        c//e<CR>        change until end of match
        beep<Esc>       type another replacement
                        etc.
<
                                                        *//;* *E386*
A very special offset is ';' followed by another search command.  For example: >

   /test 1/;/test
   /test.*/+1;?ing?

The first one first finds the next occurrence of "test 1", and then the first
occurrence of "test" after that.  The second one finds the next match of
"test.*", goes one line down, and from there searches backward for "ing".

This is like executing two search commands after each other, except that:
- It can be used as a single motion command after an operator.
- The direction for a following "n" or "N" command comes from the first
  search command.
- When an error occurs the cursor is not moved at all.

                                                        *last-pattern*
The last used pattern and offset are remembered.  They can be used to repeat
the search, possibly in another direction or with another count.  Note that
two patterns are remembered: One for "normal" search commands and one for the
substitute command ":s".  Each time an empty pattern is given, the previously
used pattern is used.  However, if there is no previous search command, a
previous substitute pattern is used, if possible.

The 'magic' option sticks with the last used pattern.  If you change 'magic',
this will not change how the last used pattern will be interpreted.
The 'ignorecase' option does not do this.  When 'ignorecase' is changed, the
last used pattern may match other text.

All matches for the last used search pattern will be highlighted if you set
the 'hlsearch' option.

To clear the last used search pattern: >
        :let @/ = ""
This will not set the pattern to an empty string, because that would match
everywhere.  The pattern is really cleared, like when starting Vim.
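
As a sketch of what "really cleared" means in practice (the pattern "foo" is
only an illustration), the search register @/ can be inspected before and
after clearing it: >
        /foo<CR>                search; this sets the last used pattern
        :echo @/                shows: foo
        :let @/ = ""            clear the last used pattern
        :echo @/ == ""          shows: 1
<
After this, and assuming no substitute pattern is remembered either, "n" is
expected to fail with |E35| rather than match at every position.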

The search usually skips matches that don't move the cursor.  Whether the next
match is found at the next character or after the skipped match depends on the
'c' flag in 'cpoptions'.  See |cpo-c|.
           with 'c' flag:   "/..." advances 1 to 3 characters
        without 'c' flag:   "/..." advances 1 character
The unpredictability with the 'c' flag is caused by starting the search in the
first column, skipping matches until one is found past the cursor position.

When searching backwards, searching starts at the start of the line, using the
'c' flag in 'cpoptions' as described above.  Then the last match before the
cursor position is used.

In Vi the ":tag" command sets the last search pattern when the tag is searched
for.  In Vim this is not done, the previous search pattern is still remembered,
unless the 't' flag is present in 'cpoptions'.  The search pattern is always
put in the search history.

If the 'wrapscan' option is on (which is the default), searches wrap around
the end of the buffer.  If 'wrapscan' is not set, the backward search stops
at the beginning and the forward search stops at the end of the buffer.  If
'wrapscan' is set and the pattern was not found the error message "pattern
not found" is given, and the cursor will not be moved.  If 'wrapscan' is not
set the message becomes "search hit BOTTOM without match" when searching
forward, or "search hit TOP without match" when searching backward.  If
'wrapscan' is set and the search wraps around the end of the file the message
"search hit TOP, continuing at BOTTOM" or "search hit BOTTOM, continuing at
TOP" is given when searching backwards or forwards respectively.  This can be
switched off by setting the 's' flag in the 'shortmess' option.  The highlight
method 'w' is used for this message (default: standout).

                                                        *search-range*
You can limit the search command "/" to a certain range of lines by including
\%>l items.  For example, to match the word "limit" below line 199 and above
line 300: >
        /\%>199l\%<300llimit
Also see |/\%>l|.

Another way is to use the ":substitute" command with the 'c' flag.  Example: >
   :.,300s/Pattern//gc
This command will search from the cursor position until line 300 for
"Pattern".  At the match, you will be asked to type a character.  Type 'q' to
stop at this match, type 'n' to find the next match.

The "*", "#", "g*" and "g#" commands look for a word near the cursor in this
order, the first one that is found is used:
- The keyword currently under the cursor.
- The first keyword to the right of the cursor, in the same line.
- The WORD currently under the cursor.
- The first WORD to the right of the cursor, in the same line.
The keyword may only contain letters and characters in 'iskeyword'.
The WORD may contain any non-blank characters (blanks are <Tab>s and
<Space>s).
Note that if you type with ten fingers, the characters are easy to remember:
the "#" is under your left hand middle finger (search to the left and up) and
the "*" is under your right hand middle finger (search to the right and down).
(this depends on your keyboard layout though).

                                                        *E956*
In very rare cases a regular expression is used recursively.  This can happen
when executing a pattern takes a long time and when checking for messages on
channels a callback is invoked that also uses a pattern or an autocommand is
triggered.  In most cases this should be fine, but if a pattern is in use when
it's used again it fails.  Usually this means there is something wrong with
the pattern.

==============================================================================
2. The definition of a pattern          *search-pattern* *pattern* *[pattern]*
                                        *regular-expression* *regexp* *Pattern*
                                        *E76* *E383* *E476*

For starters, read chapter 27 of the user manual |usr_27.txt|.

                                                */bar* */\bar* */pattern*
1. A pattern is one or more branches, separated by "\|".  It matches anything
   that matches one of the branches.  Example: "foo\|beep" matches "foo" and
   matches "beep".  If more than one branch matches, the first one is used.

   pattern ::=      branch
                or  branch \| branch
                or  branch \| branch \| branch
                etc.

                                                */branch* */\&*
2. A branch is one or more concats, separated by "\&".  It matches the last
   concat, but only if all the preceding concats also match at the same
   position.  Examples:
        "foobeep\&..." matches "foo" in "foobeep".
        ".*Peter\&.*Bob" matches in a line containing both "Peter" and "Bob"

   branch ::=       concat
                or  concat \& concat
                or  concat \& concat \& concat
                etc.

                                                */concat*
3. A concat is one or more pieces, concatenated.  It matches a match for the
   first piece, followed by a match for the second piece, etc.  Example:
   "f[0-9]b", first matches "f", then a digit and then "b".

   concat  ::=      piece
                or  piece piece
                or  piece piece piece
                etc.

                                                */piece*
4. A piece is an atom, possibly followed by a multi, an indication of how many
   times the atom can be matched.  Example: "a*" matches any sequence of "a"
   characters: "", "a", "aa", etc.  See |/multi|.

   piece   ::=      atom
                or  atom  multi

                                                */atom*
5. An atom can be one of a long list of items.  Many atoms match one character
   in the text.  It is often an ordinary character or a character class.
   Parentheses can be used to make a pattern into an atom.  The "\z(\)"
   construct is only for syntax highlighting.

   atom    ::=      ordinary-atom               |/ordinary-atom|
                or  \( pattern \)               |/\(|
                or  \%( pattern \)              |/\%(|
                or  \z( pattern \)              |/\z(|


                                */\%#=* *two-engines* *NFA*
Vim includes two regexp engines:
1. An old, backtracking engine that supports everything.
2. A new, NFA engine that works much faster on some patterns, possibly slower
   on some patterns.

Vim will automatically select the right engine for you.  However, if you run
into a problem or want to specifically select one engine or the other, you can
prepend one of the following to the pattern:

        \%#=0   Force automatic selection.  Only has an effect when
                'regexpengine' has been set to a non-zero value.
        \%#=1   Force using the old engine.
        \%#=2   Force using the NFA engine.

You can also use the 'regexpengine' option to change the default.
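
A minimal sketch of both ways to pick an engine (the pattern "foo\|bar" is
only an illustration): >
        /\%#=1foo\|bar          old engine, for this search only
        /\%#=2foo\|bar          NFA engine, for this search only
        :set regexpengine=1     old engine as the default
        :set regexpengine=0     back to automatic selection
<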

                        *E864* *E868* *E874* *E875* *E876* *E877* *E878*
If selecting the NFA engine and it runs into something that is not implemented
the pattern will not match.  This is only useful when debugging Vim.

==============================================================================
3. Magic                                                        */magic*

Some characters in the pattern, such as letters, are taken literally.  They
match exactly the same character in the text.  When preceded with a backslash
however, these characters may get a special meaning.  For example, "a" matches
the letter "a", while "\a" matches any alphabetic character.

Other characters have a special meaning without a backslash.  They need to be
preceded with a backslash to match literally.  For example "." matches any
character while "\." matches a dot.

Whether a character is taken literally depends on the 'magic' option and the
items in the pattern mentioned next.  The 'magic' option should always be set,
but it can be switched off for Vi compatibility.  We mention the effect of
'nomagic' here for completeness, but we recommend against using that.
                                                        */\m* */\M*
Use of "\m" makes the pattern after it be interpreted as if 'magic' is set,
ignoring the actual value of the 'magic' option.
Use of "\M" makes the pattern after it be interpreted as if 'nomagic' is used.
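
For example, a sketch of the difference (the text "a.*b" is only an
illustration): >
        /\ma.*b         "a", then any characters, then "b"
        /\Ma.*b         the literal text "a.*b"
        /\Ma\.\*b       same meaning as the first line, in 'nomagic' notation
<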
                                                        */\v* */\V*
Use of "\v" means that after it, all ASCII characters except '0'-'9', 'a'-'z',
'A'-'Z' and '_' have special meaning: "very magic"

Use of "\V" means that after it, only a backslash and the terminating
character (usually / or ?) have special meaning: "very nomagic"

Examples:
after:    \v       \m       \M       \V         matches ~
                'magic' 'nomagic'
          a        a        a        a          literal 'a'
          \a       \a       \a       \a         any alphabetic character
          .        .        \.       \.         any character
          \.       \.       .        .          literal dot
          $        $        $        \$         end-of-line
          *        *        \*       \*         any number of the previous atom
          ~        ~        \~       \~         latest substitute string
          ()       \(\)     \(\)     \(\)       group as an atom
          |        \|       \|       \|         nothing: separates alternatives
          \\       \\       \\       \\         literal backslash
          \{       {        {        {          literal curly brace

{only Vim supports \m, \M, \v and \V}

If you want to, you can make a pattern immune to the 'magic' option being set
or not by putting "\m" or "\M" at the start of the pattern.

==============================================================================
4. Overview of pattern items                            *pattern-overview*
                                                *E865* *E866* *E867* *E869*

Overview of multi items.                                */multi* *E61* *E62*
More explanation and examples below, follow the links.          *E64* *E871*

          multi ~
     'magic' 'nomagic'  matches of the preceding atom ~
|/star| *       \*      0 or more       as many as possible
|/\+|   \+      \+      1 or more       as many as possible
|/\=|   \=      \=      0 or 1          as many as possible
|/\?|   \?      \?      0 or 1          as many as possible

|/\{|   \{n,m}  \{n,m}  n to m          as many as possible
        \{n}    \{n}    n               exactly
        \{n,}   \{n,}   at least n      as many as possible
        \{,m}   \{,m}   0 to m          as many as possible
        \{}     \{}     0 or more       as many as possible (same as *)

|/\{-|  \{-n,m} \{-n,m} n to m          as few as possible
        \{-n}   \{-n}   n               exactly
        \{-n,}  \{-n,}  at least n      as few as possible
        \{-,m}  \{-,m}  0 to m          as few as possible
        \{-}    \{-}    0 or more       as few as possible

                                                        *E59*
|/\@>|  \@>     \@>     1, like matching a whole pattern
|/\@=|  \@=     \@=     nothing, requires a match |/zero-width|
|/\@!|  \@!     \@!     nothing, requires NO match |/zero-width|
|/\@<=| \@<=    \@<=    nothing, requires a match behind |/zero-width|
|/\@<!| \@<!    \@<!    nothing, requires NO match behind |/zero-width|


Overview of ordinary atoms.                             */ordinary-atom*
More explanation and examples below, follow the links.

      ordinary atom ~
      magic   nomagic	matches ~
|/^|	^	^	start-of-line (at start of pattern) |/zero-width|
|/\^|	\^	\^	literal '^'
|/\_^|	\_^	\_^	start-of-line (used anywhere) |/zero-width|
|/$|	$	$	end-of-line (at end of pattern) |/zero-width|
|/\$|	\$	\$	literal '$'
|/\_$|	\_$	\_$	end-of-line (used anywhere) |/zero-width|
|/.|	.	\.	any single character (not an end-of-line)
|/\_.|	\_.	\_.	any single character or end-of-line
|/\<|	\<	\<	beginning of a word |/zero-width|
|/\>|	\>	\>	end of a word |/zero-width|
|/\zs|	\zs	\zs	anything, sets start of match
|/\ze|	\ze	\ze	anything, sets end of match
|/\%^|	\%^	\%^	beginning of file |/zero-width|		*E71*
|/\%$|	\%$	\%$	end of file |/zero-width|
|/\%V|	\%V	\%V	inside Visual area |/zero-width|
|/\%#|	\%#	\%#	cursor position |/zero-width|
|/\%'m|	\%'m	\%'m	mark m position |/zero-width|
|/\%l|	\%23l	\%23l	in line 23 |/zero-width|
|/\%c|	\%23c	\%23c	in column 23 |/zero-width|
|/\%v|	\%23v	\%23v	in virtual column 23 |/zero-width|

Character classes:					*/character-classes*
      magic   nomagic	matches ~
|/\i|	\i	\i	identifier character (see 'isident' option)
|/\I|	\I	\I	like "\i", but excluding digits
|/\k|	\k	\k	keyword character (see 'iskeyword' option)
|/\K|	\K	\K	like "\k", but excluding digits
|/\f|	\f	\f	file name character (see 'isfname' option)
|/\F|	\F	\F	like "\f", but excluding digits
|/\p|	\p	\p	printable character (see 'isprint' option)
|/\P|	\P	\P	like "\p", but excluding digits
|/\s|	\s	\s	whitespace character: <Space> and <Tab>
|/\S|	\S	\S	non-whitespace character; opposite of \s
|/\d|	\d	\d	digit:				[0-9]
|/\D|	\D	\D	non-digit:			[^0-9]
|/\x|	\x	\x	hex digit:			[0-9A-Fa-f]
|/\X|	\X	\X	non-hex digit:			[^0-9A-Fa-f]
|/\o|	\o	\o	octal digit:			[0-7]
|/\O|	\O	\O	non-octal digit:		[^0-7]
|/\w|	\w	\w	word character:			[0-9A-Za-z_]
|/\W|	\W	\W	non-word character:		[^0-9A-Za-z_]
|/\h|	\h	\h	head of word character:		[A-Za-z_]
|/\H|	\H	\H	non-head of word character:	[^A-Za-z_]
|/\a|	\a	\a	alphabetic character:		[A-Za-z]
|/\A|	\A	\A	non-alphabetic character:	[^A-Za-z]
|/\l|	\l	\l	lowercase character:		[a-z]
|/\L|	\L	\L	non-lowercase character:	[^a-z]
|/\u|	\u	\u	uppercase character:		[A-Z]
|/\U|	\U	\U	non-uppercase character		[^A-Z]
|/\_|	\_x	\_x	where x is any of the characters above: character
			class with end-of-line included
(end of character classes)

      magic   nomagic	matches ~
|/\e|	\e	\e	<Esc>
|/\t|	\t	\t	<Tab>
|/\r|	\r	\r	<CR>
|/\b|	\b	\b	<BS>
|/\n|	\n	\n	end-of-line
|/~|	~	\~	last given substitute string
|/\1|	\1	\1	same string as matched by first \(\)
|/\2|	\2	\2	Like "\1", but uses second \(\)
	   ...
|/\9|	\9	\9	Like "\1", but uses ninth \(\)
								*E68*
|/\z1|	\z1	\z1	only for syntax highlighting, see |:syn-ext-match|
	   ...
|/\z1|	\z9	\z9	only for syntax highlighting, see |:syn-ext-match|

	x	x	a character with no special meaning matches itself

|/[]|	[]	\[]	any character specified inside the []
|/\%[]|	\%[]	\%[]	a sequence of optionally matched atoms

|/\c|	\c	\c	ignore case, do not use the 'ignorecase' option
|/\C|	\C	\C	match case, do not use the 'ignorecase' option
|/\Z|	\Z	\Z	ignore differences in Unicode "combining characters".
			Useful when searching voweled Hebrew or Arabic text.

      magic   nomagic	matches ~
|/\m|	\m	\m	'magic' on for the following chars in the pattern
|/\M|	\M	\M	'magic' off for the following chars in the pattern
|/\v|	\v	\v	the following chars in the pattern are "very magic"
|/\V|	\V	\V	the following chars in the pattern are "very nomagic"
|/\%#=|   \%#=1   \%#=1   select regexp engine |/zero-width|

|/\%d|	\%d	\%d	match specified decimal character (eg \%d123)
|/\%x|	\%x	\%x	match specified hex character (eg \%x2a)
|/\%o|	\%o	\%o	match specified octal character (eg \%o040)
|/\%u|	\%u	\%u	match specified multibyte character (eg \%u20ac)
|/\%U|	\%U	\%U	match specified large multibyte character (eg
			\%U12345678)
|/\%C|	\%C	\%C	match any composing characters

Example			matches ~
\<\I\i*		or
\<\h\w*
\<[a-zA-Z_][a-zA-Z0-9_]*
			An identifier (e.g., in a C program).

\(\.$\|\. \)		A period followed by <EOL> or a space.

[.!?][])"']*\($\|[ ]\)	A search pattern that finds the end of a sentence,
			with almost the same definition as the ")" command.

cat\Z			Both "cat" and "càt" ("a" followed by 0x0300)
			Does not match "càt" (character 0x00e0), even
			though it may look the same.
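
The example patterns above can be tried on plain strings with |matchstr()| and
|match()|.  This is only a sketch: the sample strings are made up for
illustration, and "\<" relies on the 'iskeyword' value of the current buffer
(the default value is assumed here): >
	" First identifier in the string; expected to give "int".
	:echo matchstr('int foo_1 = 2bar;', '\<\h\w*')
	" Byte index of the first period followed by a space; expected: 3.
	:echo match('One. Two.', '\(\.$\|\. \)')
<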


==============================================================================
5. Multi items					*pattern-multi-items*

An atom can be followed by an indication of how many times the atom can be
matched and in what way.  This is called a multi.  See |/multi| for an
overview.

							*/star* */\star*
*	(use \* when 'magic' is not set)
	Matches 0 or more of the preceding atom, as many as possible.
	Example  'nomagic'	matches ~
	a*	   a\*		"", "a", "aa", "aaa", etc.
	.*	   \.\*		anything, also an empty string, no end-of-line
	\_.*	   \_.\*	everything up to the end of the buffer
	\_.*END	   \_.\*END	everything up to and including the last "END"
				in the buffer

	Exception: When "*" is used at the start of the pattern or just after
	"^" it matches the star character.

	Be aware that repeating "\_." can match a lot of text and take a long
	time.  For example, "\_.*END" matches all text from the current
	position to the last occurrence of "END" in the file.  Since the "*"
	will match as many as possible, this first skips over all lines until
	the end of the file and then tries matching "END", backing up one
	character at a time.

							*/\+*
\+	Matches 1 or more of the preceding atom, as many as possible.
	Example		matches ~
	^.\+$		any non-empty line
	\s\+		white space of at least one character

							*/\=*
\=	Matches 0 or 1 of the preceding atom, as many as possible.
	Example		matches ~
	foo\=		"fo" and "foo"

							*/\?*
\?	Just like \=.  Cannot be used when searching backwards with the "?"
	command.

					*/\{* *E60* *E554* *E870*
\{n,m}	Matches n to m of the preceding atom, as many as possible
\{n}	Matches n of the preceding atom
\{n,}	Matches at least n of the preceding atom, as many as possible
\{,m}	Matches 0 to m of the preceding atom, as many as possible
\{}	Matches 0 or more of the preceding atom, as many as possible (like *)
							*/\{-*
\{-n,m}	matches n to m of the preceding atom, as few as possible
\{-n}	matches n of the preceding atom
\{-n,}	matches at least n of the preceding atom, as few as possible
\{-,m}	matches 0 to m of the preceding atom, as few as possible
\{-}	matches 0 or more of the preceding atom, as few as possible

	n and m are positive decimal numbers or zero
								*non-greedy*
	If a "-" appears immediately after the "{", then a shortest match
	first algorithm is used (see example below).  In particular, "\{-}" is
	the same as "*" but uses the shortest match first algorithm.
	BUT: A match that starts earlier is preferred over a shorter match:
	"a\{-}b" matches "aaab" in "xaaab".

	Example			matches ~
	ab\{2,3}c		"abbc" or "abbbc"
	a\{5}			"aaaaa"
	ab\{2,}c		"abbc", "abbbc", "abbbbc", etc.
	ab\{,3}c		"ac", "abc", "abbc" or "abbbc"
	a[bc]\{3}d		"abbbd", "abbcd", "acbcd", "acccd", etc.
	a\(bc\)\{1,2}d		"abcd" or "abcbcd"
	a[bc]\{-}[cd]		"abc" in "abcd"
	a[bc]*[cd]		"abcd" in "abcd"

	The } may optionally be preceded with a backslash: \{n,m\}.

							*/\@=*
\@=	Matches the preceding atom with zero width.
	Like "(?=pattern)" in Perl.
	Example			matches ~
	foo\(bar\)\@=		"foo" in "foobar"
	foo\(bar\)\@=foo	nothing
							*/zero-width*
	When using "\@=" (or "^", "$", "\<", "\>") no characters are included
	in the match.  These items are only used to check if a match can be
	made.  This can be tricky, because a match with following items will
	be done in the same position.  The last example above will not match
	"foobarfoo", because it tries to match "foo" in the same position
	where "bar" matched.

	Note that using "\&" works the same as using "\@=": "foo\&.." is the
	same as "\(foo\)\@=..".  But using "\&" is easier, because you don't
	need the parentheses.
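
	The counted multis and "\@=" can be compared side by side on strings
	with |matchstr()| and |match()|.  A small sketch; the sample strings
	are taken from the examples above or made up for illustration: >
		" Shortest match first: expected to give "abc".
		:echo matchstr('abcd', 'a[bc]\{-}[cd]')
		" As many as possible: expected to give "abcd".
		:echo matchstr('abcd', 'a[bc]*[cd]')
		" An earlier start wins over a shorter match: "aaab".
		:echo matchstr('xaaab', 'a\{-}b')
		" "bar" is required but not included: "foo".
		:echo matchstr('foobar', 'foo\(bar\)\@=')
		" The second "foo" is tried where "bar" is: no match, -1.
		:echo match('foobarfoo', 'foo\(bar\)\@=foo')
<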


							*/\@!*
\@!	Matches with zero width if the preceding atom does NOT match at the
	current position. |/zero-width|
	Like "(?!pattern)" in Perl.
	Example			matches ~
	foo\(bar\)\@!		any "foo" not followed by "bar"
	a.\{-}p\@!		"a", "ap", "app", "appp", etc. not immediately
				followed by a "p"
	if \(\(then\)\@!.\)*$	"if " not followed by "then"

	Using "\@!" is tricky, because there are many places where a pattern
	does not match.  "a.*p\@!" will match from an "a" to the end of the
	line, because ".*" can match all characters in the line and the "p"
	doesn't match at the end of the line.  "a.\{-}p\@!" will match any
	"a", "ap", "app", etc. that isn't followed by a "p", because the "."
	can match a "p" and "p\@!" doesn't match after that.

	You can't use "\@!" to look for a non-match before the matching
	position: "\(foo\)\@!bar" will match "bar" in "foobar", because at the
	position where "bar" matches, "foo" does not match.  To avoid matching
	"foobar" you could use "\(foo\)\@!...bar", but that doesn't match a
	bar at the start of a line.  Use "\(foo\)\@<!bar".
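
	The difference between "\@!" and "\@<!" shows up when the patterns
	from the paragraph above are run with |match()|, which returns the
	byte index of the match or -1.  A sketch with made-up sample strings: >
		" The first "foo" is followed by "bar", the second is not: 7.
		:echo match('foobar foobaz', 'foo\(bar\)\@!')
		" "foo" does not match where "bar" starts, so this matches: 3.
		:echo match('foobar', '\(foo\)\@!bar')
		" Look-behind rejects the "bar" in "foobar": -1.
		:echo match('foobar', '\(foo\)\@<!bar')
		" A "bar" at the very start is still found: 0.
		:echo match('bar', '\(foo\)\@<!bar')
<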

	Useful example: to find "foo" in a line that does not contain "bar": >
		/^\%(.*bar\)\@!.*\zsfoo
<	This pattern first checks that there is not a single position in the
	line where "bar" matches.  If ".*bar" matches somewhere the \@! will
	reject the pattern.  When there is no match any "foo" will be found.
	The "\zs" is to have the match start just before "foo".

							*/\@<=*
\@<=	Matches with zero width if the preceding atom matches just before what
	follows. |/zero-width|
	Like "(?<=pattern)" in Perl, but Vim allows non-fixed-width patterns.
	Example			matches ~
	\(an\_s\+\)\@<=file	"file" after "an" and white space or an
				end-of-line
	For speed it's often much better to avoid this multi.  Try using "\zs"
	instead |/\zs|.  To match the same as the above example:
		an\_s\+\zsfile
	At least set a limit for the look-behind, see below.

	"\@<=" and "\@<!" check for matches just before what follows.
	Theoretically these matches could start anywhere before this position.
	But to limit the time needed, only the line where what follows matches
	is searched, and one line before that (if there is one).  This should
	be sufficient to match most things and not be too slow.

	In the old regexp engine the part of the pattern after "\@<=" and
	"\@<!" is checked for a match first, thus things like "\1" don't work
	to reference \(\) inside the preceding atom.  It does work the other
	way around:
	Bad example			matches ~
	\%#=1\1\@<=,\([a-z]\+\)		",abc" in "abc,abc"

	However, the new regexp engine works differently, so it is better not
	to rely on this behavior.  Do not use \@<= if it can be avoided:
	Example				matches ~
	\([a-z]\+\)\zs,\1		",abc" in "abc,abc"

\@123<=
	Like "\@<=" but only look back 123 bytes.  This avoids trying lots
	of matches that are known to fail and make executing the pattern very
	slow.  Example, check if there is a "<" just before "span":
		/<\@1<=span
	This will try matching "<" only one byte before "span", which is the
	only place that works anyway.
	After crossing a line boundary, the limit is relative to the end of
	the line.  Thus the characters at the start of the line with the match
	are not counted (this is just to keep it simple).
	The number zero is the same as no limit.

							*/\@<!*
\@<!	Matches with zero width if the preceding atom does NOT match just
	before what follows.  Thus this matches if there is no position in the
	current or previous line where the atom matches such that it ends just
	before what follows.
	|/zero-width|
	Like "(?<!pattern)" in Perl, but Vim allows non-fixed-width patterns.
	The match with the preceding atom is made to end just before the match
	with what follows, thus an atom that ends in ".*" will work.
	Warning: This can be slow (because many positions need to be checked
	for a match).  Use a limit if you can, see below.
	Example			matches ~
	\(foo\)\@<!bar		any "bar" that's not in "foobar"
	\(\/\/.*\)\@<!in	"in" which is not after "//"

\@123<!
	Like "\@<!" but only look back 123 bytes.  This avoids trying lots of
	matches that are known to fail and make executing the pattern very
	slow.

							*/\@>*
\@>	Matches the preceding atom like matching a whole pattern.
	Like "(?>pattern)" in Perl.
	Example		matches ~
	\(a*\)\@>a	nothing (the "a*" takes all the "a"'s, there can't be
			another one following)

	This matches the preceding atom as if it was a pattern by itself.  If
	it doesn't match, there is no retry with shorter sub-matches or
	anything.  Observe this difference: "a*b" and "a*ab" both match
	"aaab", but in the second case the "a*" matches only the first two
	"a"s.  "\(a*\)\@>ab" will not match "aaab", because the "a*" matches
	the "aaa" (as many "a"s as possible), thus the "ab" can't match.
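
	The look-behind items and "\@>" can be checked on strings with
	|matchstr()| and |match()|.  A sketch; the sample strings are made up
	for illustration: >
		" Look-behind and the "\zs" alternative both give "file".
		:echo matchstr('an file', '\(an\_s\+\)\@<=file')
		:echo matchstr('an file', 'an\_s\+\zsfile')
		" "<" is only looked for one byte back; match at index 1.
		:echo match('<span', '<\@1<=span')
		" No "//" before this "in": match at index 4.
		:echo match('x = in', '\(\/\/.*\)\@<!in')
		" The "in" comes after "//": -1.
		:echo match('// in', '\(\/\/.*\)\@<!in')
		" Backtracking lets "a*" give up one "a": match at index 0.
		:echo match('aaab', 'a*ab')
		" "\@>" keeps all the "a"s, so "ab" cannot match: -1.
		:echo match('aaab', '\(a*\)\@>ab')
<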


==============================================================================
6.  Ordinary atoms					*pattern-atoms*

An ordinary atom can be:

							*/^*
^	At beginning of pattern or after "\|", "\(", "\%(" or "\n": matches
	start-of-line; at other positions, matches literal '^'. |/zero-width|
	Example		matches ~
	^beep(		the start of the C function "beep" (probably).

							*/\^*
\^	Matches literal '^'.  Can be used at any position in the pattern, but
	not inside [].

							*/\_^*
\_^	Matches start-of-line. |/zero-width|  Can be used at any position in
	the pattern, but not inside [].
	Example		matches ~
	\_s*\_^foo	white space and blank lines and then "foo" at
			start-of-line

							*/$*
$	At end of pattern or in front of "\|", "\)" or "\n" ('magic' on):
	matches end-of-line <EOL>; at other positions, matches literal '$'.
	|/zero-width|

							*/\$*
\$	Matches literal '$'.  Can be used at any position in the pattern, but
	not inside [].

							*/\_$*
\_$	Matches end-of-line. |/zero-width|  Can be used at any position in the
	pattern, but not inside [].
	Note that "a\_$b" never matches, since
	"b" cannot match an end-of-line.  Use "a\nb" instead |/\n|.
	Example		matches ~
	foo\_$\_s*	"foo" at end-of-line and following white space and
			blank lines

.	(with 'nomagic': \.)				*/.* */\.*
	Matches any single character, but not an end-of-line.

							*/\_.*
\_.	Matches any single character or end-of-line.
	Careful: "\_.*" matches all text to the end of the buffer!

							*/\<*
\<	Matches the beginning of a word: The next char is the first char of a
	word.  The 'iskeyword' option specifies what is a word character.
	|/zero-width|

							*/\>*
\>	Matches the end of a word: The previous char is the last char of a
	word.  The 'iskeyword' option specifies what is a word character.
	|/zero-width|

							*/\zs*
\zs	Matches at any position, but not inside [], and sets the start of the
	match there: The next char is the first char of the whole match.
	|/zero-width|
	Example: >
		/^\s*\zsif
<	matches an "if" at the start of a line, ignoring white space.
	Can be used multiple times, the last one encountered in a matching
	branch is used.  Example: >
		/\(.\{-}\zsFab\)\{3}
<	Finds the third occurrence of "Fab".
	This cannot be followed by a multi. *E888*
	{not available when compiled without the |+syntax| feature}
							*/\ze*
\ze	Matches at any position, but not inside [], and sets the end of the
	match there: The previous char is the last char of the whole match.
	|/zero-width|
	Can be used multiple times, the last one encountered in a matching
	branch is used.
	Example: "end\ze\(if\|for\)" matches the "end" in "endif" and
	"endfor".
	This cannot be followed by a multi. |E888|
	{not available when compiled without the |+syntax| feature}

						*/\%^* *start-of-file*
\%^	Matches start of the file.  When matching with a string, matches the
	start of the string.
	For example, to find the first "VIM" in a file: >
		/\%^\_.\{-}\zsVIM
<
						*/\%$* *end-of-file*
\%$	Matches end of the file.  When matching with a string, matches the
	end of the string.
	Note that this does NOT find the last "VIM" in a file: >
		/VIM\_.\{-}\%$
<	It will find the next VIM, because the part after it will always
	match.  This one will find the last "VIM" in the file: >
		/VIM\ze\(\(VIM\)\@!\_.\)*\%$
<	This uses |/\@!| to ascertain that "VIM" does NOT match in any
	position after the first "VIM".
	Searching from the end of the file backwards is easier!
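
	"\zs", "\ze", "\%^" and "\%$" also work when matching with a string,
	where the start and end of the string take the place of the start and
	end of the file.  A sketch using |matchstr()| and |match()|; the
	sample strings are made up for illustration: >
		" The match starts at "if", after the indent: index 2.
		:echo match('  if x', '^\s*\zsif')
		" Only "end" is part of the match: "end".
		:echo matchstr('endif', 'end\ze\(if\|for\)')
		" First "VIM" in the string: index 2.
		:echo match('a VIM b VIM', '\%^\_.\{-}\zsVIM')
		" Last "VIM" in the string: index 8.
		:echo match('a VIM b VIM', 'VIM\ze\(\(VIM\)\@!\_.\)*\%$')
<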

							*/\%V*
\%V	Match inside the Visual area.  When Visual mode has already been
	stopped match in the area that |gv| would reselect.
	This is a |/zero-width| match.  To make sure the whole pattern is
	inside the Visual area put it at the start and just before the end of
	the pattern, e.g.: >
		/\%Vfoo.*ba\%Vr
<	This also works if only "foo bar" was Visually selected.  This: >
		/\%Vfoo.*bar\%V
<	would match "foo bar" if the Visual selection continues after the "r".
	Only works for the current buffer.

						*/\%#* *cursor-position*
\%#	Matches with the cursor position.  Only works when matching in a
	buffer displayed in a window.
	WARNING: When the cursor is moved after the pattern was used, the
	result becomes invalid.  Vim doesn't automatically update the matches.
	This is especially relevant for syntax highlighting and 'hlsearch'.
	In other words: When the cursor moves the display isn't updated for
	this change.  An update is done for lines which are changed (the whole
	line is updated) or when using the |CTRL-L| command (the whole screen
	is updated).  Example, to highlight the word under the cursor: >
		/\k*\%#\k*
<	When 'hlsearch' is set and you move the cursor around and make changes
	this will clearly show when the match is updated or not.

						*/\%'m* */\%<'m* */\%>'m*
\%'m	Matches with the position of mark m.
\%<'m	Matches before the position of mark m.
\%>'m	Matches after the position of mark m.
	Example, to highlight the text from mark 's to 'e: >
		/.\%>'s.*\%<'e..
<	Note that two dots are required to include mark 'e in the match.  That
	is because "\%<'e" matches at the character before the 'e mark, and
	since it's a |/zero-width| match it doesn't include that character.
	WARNING: When the mark is moved after the pattern was used, the result
	becomes invalid.  Vim doesn't automatically update the matches.
	Similar to moving the cursor for "\%#" |/\%#|.

					*/\%l* */\%>l* */\%<l* *E951*
\%23l	Matches in a specific line.
\%<23l	Matches above a specific line (lower line number).
\%>23l	Matches below a specific line (higher line number).
\%.l	Matches at the cursor line.
\%<.l	Matches above the cursor line.
\%>.l	Matches below the cursor line.
	These six can be used to match specific lines in a buffer.  The "23"
	can be any line number.  The first line is 1.
	WARNING: When inserting or deleting lines Vim does not automatically
	update the matches.  This means Syntax highlighting quickly becomes
	wrong.  Also when referring to the cursor position (".") and
	the cursor moves the display isn't updated for this change.
	An update
	is done when using the |CTRL-L| command (the whole screen is updated).
	Example, to highlight the line where the cursor currently is: >
		:exe '/\%' . line(".") . 'l'
<	Alternatively use: >
		/\%.l
<	When 'hlsearch' is set and you move the cursor around and make changes
	this will clearly show when the match is updated or not.

						*/\%c* */\%>c* */\%<c*
\%23c	Matches in a specific column.
\%<23c	Matches before a specific column.
\%>23c	Matches after a specific column.
\%.c	Matches at the cursor column.
\%<.c	Matches before the cursor column.
\%>.c	Matches after the cursor column.
	These six can be used to match specific columns in a buffer or string.
	The "23" can be any column number.  The first column is 1.  Actually,
	the column is the byte number (thus it's not exactly right for
	multibyte characters).
	WARNING: When inserting or deleting text Vim does not automatically
	update the matches.  This means Syntax highlighting quickly becomes
	wrong.  Also when referring to the cursor position (".") and
	the cursor moves the display isn't updated for this change.  An update
	is done when using the |CTRL-L| command (the whole screen is updated).
	Example, to highlight the column where the cursor currently is: >
		:exe '/\%' . col(".") . 'c'
<	Alternatively use: >
		/\%.c
<	When 'hlsearch' is set and you move the cursor around and make changes
	this will clearly show when the match is updated or not.
	Example for matching a single byte in column 44: >
		/\%>43c.\%<46c
<	Note that "\%<46c" matches in column 45 when the "." matches a byte in
	column 44.
						*/\%v* */\%>v* */\%<v*
\%23v	Matches in a specific virtual column.
\%<23v	Matches before a specific virtual column.
\%>23v	Matches after a specific virtual column.
\%.v	Matches at the current virtual column.
\%<.v	Matches before the current virtual column.
\%>.v	Matches after the current virtual column.
	These six can be used to match specific virtual columns in a buffer or
	string.  When not matching with a buffer in a window, the option
	values of the current window are used (e.g., 'tabstop').
	The "23" can be any column number.  The first column is 1.
	Note that some virtual column positions will never match, because they
	are halfway through a tab or other character that occupies more than
	one screen character.
	WARNING: When inserting or deleting text Vim does not automatically
	update highlighted matches.  This means Syntax highlighting quickly
	becomes wrong.  Also when referring to the cursor position (".") and
	the cursor moves the display isn't updated for this change.
	An update is done when using the |CTRL-L| command (the whole screen is
	updated).
	Example, to highlight all the characters after virtual column 72: >
		/\%>72v.*
<	When 'hlsearch' is set and you move the cursor around and make changes
	this will clearly show when the match is updated or not.
	To match all characters after the current virtual column (where the
	cursor is): >
		/\%>.v.*
<	To match the text up to column 17: >
		/^.*\%17v
<	Column 17 is not included, because this is a |/zero-width| match.  To
	include the column use: >
		/^.*\%17v.
<	This command does the same thing, but also matches when there is no
	character in column 17: >
		/^.*\%<18v.
<	Note that without the "^" to anchor the match in the first column,
	this will also highlight column 17: >
		/.*\%17v
<	Column 17 is highlighted by 'hlsearch' because there is another match
	where ".*" matches zero characters.
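	Because the column items also work on strings, they can be tried out
	with |matchstr()| without touching a buffer.  A minimal sketch,
	assuming 'encoding' is "utf-8" and 'tabstop' is 8 in the current
	window; the sample strings are made up for illustration: >
		:echo matchstr("abcdefgh", '\%>3c.*')
		:echo matchstr("äbc", '\%3c.')
		:echo matchstr("\tabc", '\%9v.')
<	The first command should give "defgh", the text from byte column 4 on.
	The second should give "b": the "ä" takes two bytes in utf-8, thus "b"
	is in byte column 3 although it is the second character.  The third
	should give "a", because the <Tab> fills virtual columns 1 to 8.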


Character classes:
\i	identifier character (see 'isident' option)	*/\i*
\I	like "\i", but excluding digits			*/\I*
\k	keyword character (see 'iskeyword' option)	*/\k*
\K	like "\k", but excluding digits			*/\K*
\f	file name character (see 'isfname' option)	*/\f*
\F	like "\f", but excluding digits			*/\F*
\p	printable character (see 'isprint' option)	*/\p*
\P	like "\p", but excluding digits			*/\P*

NOTE: the above also work for multibyte characters.  The ones below only
match ASCII characters, as indicated by the range.

						*whitespace* *white-space*
\s	whitespace character: <Space> and <Tab>		*/\s*
\S	non-whitespace character; opposite of \s	*/\S*
\d	digit:				[0-9]		*/\d*
\D	non-digit:			[^0-9]		*/\D*
\x	hex digit:			[0-9A-Fa-f]	*/\x*
\X	non-hex digit:			[^0-9A-Fa-f]	*/\X*
\o	octal digit:			[0-7]		*/\o*
\O	non-octal digit:		[^0-7]		*/\O*
\w	word character:			[0-9A-Za-z_]	*/\w*
\W	non-word character:		[^0-9A-Za-z_]	*/\W*
\h	head of word character:		[A-Za-z_]	*/\h*
\H	non-head of word character:	[^A-Za-z_]	*/\H*
\a	alphabetic character:		[A-Za-z]	*/\a*
\A	non-alphabetic character:	[^A-Za-z]	*/\A*
\l	lowercase character:		[a-z]		*/\l*
\L	non-lowercase character:	[^a-z]		*/\L*
\u	uppercase character:		[A-Z]		*/\u*
\U	non-uppercase character:	[^A-Z]		*/\U*

	NOTE: Using the atom is faster than the [] form.

	NOTE: 'ignorecase', "\c" and "\C" are not used by character classes.

			*/\_* *E63* */\_i* */\_I* */\_k* */\_K* */\_f* */\_F*
			*/\_p* */\_P* */\_s* */\_S* */\_d* */\_D* */\_x* */\_X*
			*/\_o* */\_O* */\_w* */\_W* */\_h* */\_H* */\_a* */\_A*
			*/\_l* */\_L* */\_u* */\_U*
\_x	Where "x" is any of the characters above: The character class with
	end-of-line added
(end of character classes)

\e	matches <Esc>					*/\e*
\t	matches <Tab>					*/\t*
\r	matches <CR>					*/\r*
\b	matches <BS>					*/\b*
\n	matches an end-of-line				*/\n*
	When matching in a string instead of buffer text a literal newline
	character is matched.

~	matches the last given substitute string	*/~* */\~*

\(\)	A pattern enclosed by escaped parentheses.	*/\(* */\(\)* */\)*
	E.g., "\(^a\)" matches 'a' at the start of a line.
	*E51* *E54* *E55* *E872* *E873*

\1	Matches the same string that was matched by	*/\1* *E65*
	the first sub-expression in \( and \).
	Example: "\([a-z]\).\1" matches "ata", "ehe", "tot", etc.
\2	Like "\1", but uses second sub-expression,	*/\2*
   ...							*/\3*
\9	Like "\1", but uses ninth sub-expression.	*/\9*
	Note: The numbering of groups is done based on which "\(" comes first
	in the pattern (going left to right), NOT based on what is matched
	first.

\%(\)	A pattern enclosed by escaped parentheses.	*/\%(\)* */\%(* *E53*
	Just like \(\), but without counting it as a sub-expression.  This
	allows using more groups and it's a little bit faster.

x	A single character, with no special meaning, matches itself

							*/\* */\\*
\x	A backslash followed by a single character, with no special meaning,
	is reserved for future expansions

[]	(with 'nomagic': \[])		*/[]* */\[]* */\_[]* */collection*
\_[]
	A collection.  This is a sequence of characters enclosed in square
	brackets.  It matches any single character in the collection.
	Example		matches ~
	[xyz]		any 'x', 'y' or 'z'
	[a-zA-Z]$	any alphabetic character at the end of a line
	\c[a-z]$	same
	[А-яЁё]		Russian alphabet (with utf-8 and cp1251)

								*/[\n]*
	With "\_" prepended the collection also includes the end-of-line.
	The same can be done by including "\n" in the collection.
	The end-of-line is also matched when the collection starts with "^"!
	Thus "\_[^ab]" matches the end-of-line and any character but "a" and
	"b".
	This makes it Vi compatible: Without the "\_" or "\n" the collection
	does not match an end-of-line.
								*E769*
	When the ']' is not there Vim will not give an error message but
	assume no collection is used.  Useful to search for '['.  However, you
	do get E769 for internal searching.  And be aware that in a
	`:substitute` command the whole command becomes the pattern.  E.g.
	":s/[/x/" searches for "[/x" and replaces it with nothing.  It does
	not search for "[" and replace it with "x"!

								*E944* *E945*
	If the sequence begins with "^", it matches any single character NOT
	in the collection: "[^xyz]" matches anything but 'x', 'y' and 'z'.
	- If two characters in the sequence are separated by '-', this is
	  shorthand for the full list of ASCII characters between them.  E.g.,
	  "[0-9]" matches any decimal digit.  If the starting character exceeds
	  the ending character, e.g. [c-a], E944 occurs.  Non-ASCII characters
	  can be used, but the character values must not be more than 256 apart
	  in the old regexp engine.  For example, searching by [\u3000-\u4000]
	  after setting re=1 emits an E945 error.  Prepending \%#=2 will fix it.
	- A character class expression is evaluated to the set of characters
	  belonging to that character class.  The following character classes
	  are supported:
		  Name	      Func	Contents ~
*[:alnum:]*	  [:alnum:]   isalnum	ASCII letters and digits
*[:alpha:]*	  [:alpha:]   isalpha	ASCII letters
*[:blank:]*	  [:blank:]		space and tab
*[:cntrl:]*	  [:cntrl:]   iscntrl	ASCII control characters
*[:digit:]*	  [:digit:]		decimal digits '0' to '9'
*[:graph:]*	  [:graph:]   isgraph	ASCII printable characters excluding
					space
*[:lower:]*	  [:lower:]   (1)	lowercase letters (all letters when
					'ignorecase' is used)
*[:print:]*	  [:print:]   (2)	printable characters including space
*[:punct:]*	  [:punct:]   ispunct	ASCII punctuation characters
*[:space:]*	  [:space:]		whitespace characters: space, tab, CR,
					NL, vertical tab, form feed
*[:upper:]*	  [:upper:]   (3)	uppercase letters (all letters when
					'ignorecase' is used)
*[:xdigit:]*	  [:xdigit:]		hexadecimal digits: 0-9, a-f, A-F
*[:return:]*	  [:return:]		the <CR> character
*[:tab:]*	  [:tab:]		the <Tab> character
*[:escape:]*	  [:escape:]		the <Esc> character
*[:backspace:]*	  [:backspace:]		the <BS> character
*[:ident:]*	  [:ident:]		identifier character (same as "\i")
*[:keyword:]*	  [:keyword:]		keyword character (same as "\k")
*[:fname:]*	  [:fname:]		file name character (same as "\f")
	  The square brackets in character class expressions are additional to
	  the square brackets delimiting a collection.  For example, the
	  following is a plausible pattern for a UNIX filename:
	  "[-./[:alnum:]_~]\+".  That is, a list of at least one character,
	  each of which is either '-', '.', '/', alphabetic, numeric, '_' or
	  '~'.
	  These items only work for 8-bit characters, except [:lower:] and
	  [:upper:] also work for multibyte characters when using the new
	  regexp engine.  See |two-engines|.  In the future these items may
	  work for multibyte characters.  For now, to get all "alpha"
	  characters you can use: [[:lower:][:upper:]].

	  The "Func" column shows what library function is used.  The
	  implementation depends on the system.  Otherwise:
	  (1) Uses islower() for ASCII and Vim builtin rules for other
	  characters.
	  (2) Uses Vim builtin rules
	  (3) As with (1) but using isupper()
							*/[[=* *[==]*
	- An equivalence class.  This means that characters are matched that
	  have almost the same meaning, e.g., when ignoring accents.  This
	  only works for Unicode, latin1 and latin9.  The form is:
		[=a=]
							*/[[.* *[..]*
	- A collation element.  This currently simply accepts a single
	  character in the form:
		[.a.]
							  */\]*
	- To include a literal ']', '^', '-' or '\' in the collection, put a
	  backslash before it: "[xyz\]]", "[\^xyz]", "[xy\-z]" and "[xyz\\]".
	  (Note: POSIX does not support the use of a backslash this way).  For
	  ']' you can also make it the first character (following a possible
	  "^"):  "[]xyz]" or "[^]xyz]".
	  For '-' you can also make it the first or last character: "[-xyz]",
	  "[^-xyz]" or "[xyz-]".  For '\' you can also let it be followed by
	  any character that's not in "^]-\bdertnoUux".  "[\xyz]" matches '\',
	  'x', 'y' and 'z'.  It's better to use "\\" though, future expansions
	  may use other characters after '\'.
	- Omitting the trailing ] is not considered an error.  "[]" works like
	  "[]]", it matches the ']' character.
	- The following translations are accepted when the 'l' flag is not
	  included in 'cpoptions':
		\e	<Esc>
		\t	<Tab>
		\r	<CR>	(NOT end-of-line!)
		\b	<BS>
		\n	line break, see above |/[\n]|
		\d123	decimal number of character
		\o40	octal number of character up to 0o377
		\x20	hexadecimal number of character up to 0xff
		\u20AC	hex. number of multibyte character up to 0xffff
		\U1234	hex. number of multibyte character up to 0xffffffff
	  NOTE: The other backslash codes mentioned above do not work inside
	  []!
	- Matching with a collection can be slow, because each character in
	  the text has to be compared with each character in the collection.
	  Use one of the other atoms above when possible.  Example: "\d" is
	  much faster than "[0-9]" and matches the same characters.  However,
	  the new |NFA| regexp engine deals with this better than the old one.

						*/\%[]* *E69* *E70* *E369*
\%[]	A sequence of optionally matched atoms.  This always matches.
	It matches as much of the list of atoms it contains as possible.  Thus
	it stops at the first atom that doesn't match.  For example: >
		/r\%[ead]
<	matches "r", "re", "rea" or "read".  The longest that matches is used.
	To match the Ex command "function", where "fu" is required and
	"nction" is optional, this would work: >
		/\<fu\%[nction]\>
<	The end-of-word atom "\>" is used to avoid matching "fu" in "full".
	It gets more complicated when the atoms are not ordinary characters.
	You don't often have to use it, but it is possible.  Example: >
		/\<r\%[[eo]ad]\>
<	Matches the words "r", "re", "ro", "rea", "roa", "read" and "road".
	There can be no \(\), \%(\) or \z(\) items inside the [] and \%[] does
	not nest.
	To include a "[" use "[[]" and for "]" use "[]]", e.g.: >
		/index\%[[[]0[]]]
<	matches "index", "index[", "index[0" and "index[0]".
	{not available when compiled without the |+syntax| feature}

				*/\%d* */\%x* */\%o* */\%u* */\%U* *E678*

\%d123	Matches the character specified with a decimal number.  Must be
	followed by a non-digit.
\%o40	Matches the character specified with an octal number up to 0o377.
	Numbers below 0o40 must be followed by a non-octal digit or a
	non-digit.
\%x2a	Matches the character specified with up to two hexadecimal characters.
\%u20AC	Matches the character specified with up to four hexadecimal
	characters.
\%U1234abcd	Matches the character specified with up to eight hexadecimal
	characters, up to 0x7fffffff

==============================================================================
7. Ignoring case in a pattern					*/ignorecase*

If the 'ignorecase' option is on, the case of normal letters is ignored.
'smartcase' can be set to ignore case when the pattern contains lowercase
letters only.
							*/\c* */\C*
When "\c" appears anywhere in the pattern, the whole pattern is handled like
'ignorecase' is on.  The actual value of 'ignorecase' and 'smartcase' is
ignored.  "\C" does the opposite: Force matching case for the whole pattern.
{only Vim supports \c and \C}
Note that 'ignorecase', "\c" and "\C" are not used for the character classes.

Examples:
      pattern	'ignorecase'  'smartcase'	matches ~
	foo	  off		-		foo
	foo	  on		-		foo Foo FOO
	Foo	  on		off		foo Foo FOO
	Foo	  on		on		    Foo
	\cfoo	  -		-		foo Foo FOO
	foo\C	  -		-		foo

Technical detail:					*NL-used-for-Nul*
<Nul> characters in the file are stored as <NL> in memory.  In the display
they are shown as "^@".  The translation is done when reading and writing
files.  To match a <Nul> with a search pattern you can just enter CTRL-@ or
"CTRL-V 000".  This is probably just what you expect.  Internally the
character is replaced with a <NL> in the search pattern.  What is unusual is
that typing CTRL-V CTRL-J also inserts a <NL>, thus also searches for a <Nul>
in the file.

						*CR-used-for-NL*
When 'fileformat' is "mac", <NL> characters in the file are stored as <CR>
characters internally.  In the text they are shown as "^J".  Otherwise this
works similar to the usage of <NL> for a <Nul>.

When working with expression evaluation, a <NL> character in the pattern
matches a <NL> in the string.  The use of "\n" (backslash n) to match a <NL>
doesn't work there, it only works to match text in the buffer.
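
The effect of "\c" and "\C" described in this section can be checked with the
"=~" operator, which follows the 'ignorecase' option.  A minimal sketch; the
strings are made up for illustration and the last command puts the option back
to its default: >
	:set noignorecase
	:echo 'FOO' =~ 'foo'
	:echo 'FOO' =~ '\cfoo'
	:set ignorecase
	:echo 'FOO' =~ 'foo'
	:echo 'FOO' =~ 'foo\C'
	:set ignorecase&
This should echo 0, 1, 1 and 0: "\c" and "\C" overrule the option value, as in
the table of examples above.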

				*pattern-multi-byte* *pattern-multibyte*
Patterns will also work with multibyte characters, mostly as you would
expect.  But invalid bytes may cause trouble, a pattern with an invalid byte
will probably never match.

==============================================================================
8. Composing characters					*patterns-composing*

							*/\Z*
When "\Z" appears anywhere in the pattern, all composing characters are
ignored.  Thus only the base characters need to match, the composing
characters may be different and the number of composing characters may differ.
Only relevant when 'encoding' is "utf-8".
Exception: If the pattern starts with one or more composing characters, these
must match.
							*/\%C*
Use "\%C" to skip any composing characters.  For example, the pattern "a" does
not match in "càt" (where the a has the composing character 0x0300), but
"a\%C" does.  Note that this does not match "cát" (where the á is character
0xe1, it does not have a composing character).  It does match "cat" (where
the a is just an a).

When a composing character appears at the start of the pattern or after an
item that doesn't include the composing character, a match is found at any
character that includes this composing character.

When using a dot and a composing character, this works the same as the
composing character by itself, except that it doesn't matter what comes before
this.

The order of composing characters does not matter.  Also, the text may have
more composing characters than the pattern, it still matches.  But all
composing characters in the pattern must be found in the text.

Suppose B is a base character and x and y are composing characters:
	pattern		text		match ~
	Bxy		Bxy		yes (perfect match)
	Bxy		Byx		yes (order ignored)
	Bxy		By		no (x missing)
	Bxy		Bx		no (y missing)
	Bx		Bx		yes (perfect match)
	Bx		By		no (x missing)
	Bx		Bxy		yes (extra y ignored)
	Bx		Byx		yes (extra y ignored)

==============================================================================
9. Compare with Perl patterns				*perl-patterns*

Vim's regexes are most similar to Perl's, in terms of what you can do.
The difference between them is mostly just notation; here's a summary of
where they differ:

Capability			in Vimspeak	in Perlspeak ~
----------------------------------------------------------------
force case insensitivity	\c		(?i)
force case sensitivity		\C		(?-i)
backref-less grouping		\%(atom\)	(?:atom)
conservative quantifiers	\{-n,m}		*?, +?, ??, {}?
0-width match			atom\@=		(?=atom)
0-width non-match		atom\@!		(?!atom)
0-width preceding match		atom\@<=	(?<=atom)
0-width preceding non-match	atom\@<!	(?<!atom)
match without retry		atom\@>		(?>atom)

Vim and Perl handle newline characters inside a string a bit differently:

In Perl, ^ and $ only match at the very beginning and end of the text,
by default, but you can set the 'm' flag, which lets them match at
embedded newlines as well.  You can also set the 's' flag, which causes
a . to match newlines as well.  (Both these flags can be changed inside
a pattern using the same syntax used for the i flag above, BTW.)

On the other hand, Vim's ^ and $ always match at embedded newlines, and
you get two separate atoms, \%^ and \%$, which only match at the very
start and end of the text, respectively.  Vim solves the second problem
by giving you the \_ "modifier": put it in front of a .
or a character class, and they will match newlines as well.

Finally, these constructs are unique to Perl:
- execution of arbitrary code in the regex:  (?{perl code})
- conditional expressions:  (?(condition)true-expr|false-expr)

...and these are unique to Vim:
- changing the magic-ness of a pattern:  \v \V \m \M
   (very useful for avoiding backslashitis)
- sequence of optionally matching atoms:  \%[atoms]
- \& (which is to \| what "and" is to "or";  it forces several branches
   to match at one spot)
- matching lines/columns by number:  \%5l \%5c \%5v
- setting the start and end of the match:  \zs \ze

==============================================================================
10. Highlighting matches				*match-highlight*

							*:mat* *:match*
:mat[ch] {group} /{pattern}/
		Define a pattern to highlight in the current window.  It will
		be highlighted with {group}.  Example: >
			:highlight MyGroup ctermbg=green guibg=green
			:match MyGroup /TODO/
<		Instead of // any character can be used to mark the start and
		end of the {pattern}.  Watch out for using special characters,
		such as '"' and '|'.

		{group} must exist at the moment this command is executed.

		The {group} highlighting still applies when a character is
		to be highlighted for 'hlsearch', as the highlighting for
		matches is given higher priority than that of 'hlsearch'.
		Syntax highlighting (see 'syntax') is also overruled by
		matches.

		Note that highlighting the last used search pattern with
		'hlsearch' is used in all windows, while the pattern defined
		with ":match" only exists in the current window.  It is kept
		when switching to another buffer.

		'ignorecase' does not apply, use |/\c| in the pattern to
		ignore case.  Otherwise case is not ignored.

		'redrawtime' defines the maximum time searched for pattern
		matches.

		When matching end-of-line and Vim redraws only part of the
		display you may get unexpected results.  That is because Vim
		looks for a match in the line where redrawing starts.

		Also see |matcharg()| and |getmatches()|.  The former returns
		the highlight group and pattern of a previous |:match|
		command.  The latter returns a list with highlight groups and
		patterns defined by both |matchadd()| and |:match|.

		Highlighting matches using |:match| are limited to three
		matches (aside from |:match|, |:2match| and |:3match| are
		available).
		|matchadd()| does not have this limitation and in addition
		makes it possible to prioritize matches.

		Another example, which highlights all characters in virtual
		column 72 and more: >
			:highlight rightMargin term=bold ctermfg=blue guifg=blue
			:match rightMargin /.\%>72v/
<		To highlight all characters that are in virtual column 7: >
			:highlight col8 ctermbg=grey guibg=grey
			:match col8 /\%<8v.\%>7v/
<		Note the use of two items to also match a character that
		occupies more than one virtual column, such as a TAB.

:mat[ch]
:mat[ch] none
		Clear a previously defined match pattern.


:2mat[ch] {group} /{pattern}/					*:2match*
:2mat[ch]
:2mat[ch] none
:3mat[ch] {group} /{pattern}/					*:3match*
:3mat[ch]
:3mat[ch] none
		Just like |:match| above, but set a separate match.  Thus
		there can be three matches active at the same time.  The match
		with the lowest number has priority if several match at the
		same position.
		The ":3match" command is used by the |matchparen| plugin.  You
		are suggested to use ":match" for manual matching and
		":2match" for another plugin.
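
When more than three matches or explicit priorities are needed, |matchadd()|
can be used, as mentioned above.  A minimal sketch; the group names, patterns,
variable names and priority values are only examples: >
	:highlight MyTodo ctermbg=yellow guibg=yellow
	:highlight MyNote ctermbg=cyan guibg=cyan
	:let todo_id = matchadd('MyTodo', 'TODO', 20)
	:let note_id = matchadd('MyNote', 'TODO\|NOTE', 10)
	:echo getmatches()
	:call matchdelete(todo_id)
	:call matchdelete(note_id)
Where both patterns match the same text, the highlighting of the match with
the higher priority (MyTodo in this sketch) should be the one displayed.
|matchdelete()| removes a match again, using the ID that |matchadd()|
returned.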

==============================================================================
11. Fuzzy matching					*fuzzy-match*

Fuzzy matching refers to matching strings using a non-exact search string.
Fuzzy matching will match a string, if all the characters in the search string
are present anywhere in the string in the same order. Case is ignored. In a
matched string, other characters can be present between two consecutive
characters in the search string. If the search string has multiple words, then
each word is matched separately. So the words in the search string can be
present in any order in a string.

Fuzzy matching assigns a score for each matched string based on the following
criteria:
    - The number of sequentially matching characters.
    - The number of characters (distance) between two consecutive matching
      characters.
    - Matches at the beginning of a word.
    - Matches at a camel case character (e.g. Case in CamelCase).
    - Matches after a path separator or a hyphen.
    - The number of unmatched characters in a string.
The matching string with the highest score is returned first.

For example, when you search for the "get pat" string using fuzzy matching, it
will match the strings "GetPattern", "PatternGet", "getPattern", "patGetter",
"getSomePattern", "MatchpatternGet" etc.
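
The example above can be tried out with the |matchfuzzy()| function. This is
only a sketch: the List name "names" and the extra entry "unrelated" are made
up for illustration. The entry "unrelated" lacks some of the searched
characters, so it is not expected to appear in the result: >
	:let names = ['GetPattern', 'PatternGet', 'patGetter', 'unrelated']
	:echo matchfuzzy(names, 'get pat')
	:echo matchfuzzypos(names, 'get pat')
<
The order of the returned strings depends on the scores assigned according to
the criteria listed above.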

The functions |matchfuzzy()| and |matchfuzzypos()| can be used to fuzzy search
a string in a List of strings. The matchfuzzy() function returns a List of
matching strings. The matchfuzzypos() function returns the List of matches,
the matching positions and the fuzzy match scores.

The "f" flag of `:vimgrep` enables fuzzy matching.



 vim:tw=78:ts=8:noet:ft=help:norl: