*pattern.txt*   For Vim version 8.2.  Last change: 2021 Jul 16


		  VIM REFERENCE MANUAL    by Bram Moolenaar


Patterns and search commands				*pattern-searches*

The very basics can be found in section |03.9| of the user manual.  A few more
explanations are in chapter 27 |usr_27.txt|.

1. Search commands		|search-commands|
2. The definition of a pattern	|search-pattern|
3. Magic			|/magic|
4. Overview of pattern items	|pattern-overview|
5. Multi items			|pattern-multi-items|
6. Ordinary atoms		|pattern-atoms|
7. Ignoring case in a pattern	|/ignorecase|
8. Composing characters		|patterns-composing|
9. Compare with Perl patterns	|perl-patterns|
10. Highlighting matches	|match-highlight|
11. Fuzzy matching		|fuzzy-match|

==============================================================================
1. Search commands				*search-commands*

							*/*
/{pattern}[/]<CR>	Search forward for the [count]'th occurrence of
			{pattern} |exclusive|.

/{pattern}/{offset}<CR>	Search forward for the [count]'th occurrence of
			{pattern} and go |{offset}| lines up or down.
			|linewise|.

							*/<CR>*
/<CR>			Search forward for the [count]'th occurrence of the
			latest used pattern |last-pattern| with latest used
			|{offset}|.

//{offset}<CR>		Search forward for the [count]'th occurrence of the
			latest used pattern |last-pattern| with new
			|{offset}|.  If {offset} is empty no offset is used.

							*?*
?{pattern}[?]<CR>	Search backward for the [count]'th previous
			occurrence of {pattern} |exclusive|.

?{pattern}?{offset}<CR>	Search backward for the [count]'th previous
			occurrence of {pattern} and go |{offset}| lines up or
			down |linewise|.

							*?<CR>*
?<CR>			Search backward for the [count]'th occurrence of the
			latest used pattern |last-pattern| with latest used
			|{offset}|.

??{offset}<CR>		Search backward for the [count]'th occurrence of the
			latest used pattern |last-pattern| with new
			|{offset}|.  If {offset} is empty no offset is used.

							*n*
n			Repeat the latest "/" or "?" [count] times.
			If the cursor doesn't move the search is repeated with
			count + 1.
			|last-pattern|

							*N*
N			Repeat the latest "/" or "?" [count] times in
			opposite direction. |last-pattern|

							*star* *E348* *E349*
*			Search forward for the [count]'th occurrence of the
			word nearest to the cursor.  The word used for the
			search is the first of:
				1. the keyword under the cursor |'iskeyword'|
				2. the first keyword after the cursor, in the
				   current line
				3. the non-blank word under the cursor
				4. the first non-blank word after the cursor,
				   in the current line
			Only whole keywords are searched for, like with the
			command "/\<keyword\>".  |exclusive|
			'ignorecase' is used, 'smartcase' is not.

							*#*
#			Same as "*", but search backward.  The pound sign
			(character 163) also works.  If the "#" key works as
			backspace, try using "stty erase <BS>" before starting
			Vim (<BS> is CTRL-H or a real backspace).

							*gstar*
g*			Like "*", but don't put "\<" and "\>" around the word.
			This makes the search also find matches that are not a
			whole word.

							*g#*
g#			Like "#", but don't put "\<" and "\>" around the word.
			This makes the search also find matches that are not a
			whole word.
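
The difference between "*" and "g*" comes down to the "\<" and "\>" items
around the word.  A minimal sketch of that difference, using the built-in
match() function on a made-up string instead of a real buffer (the sample text
and the word "foo" are assumptions for illustration only):

```vim
" Sketch only: the sample text is made up.
let text = 'xfoo foo'

" "*" style: the word is wrapped in \< and \>, so only the stand-alone
" "foo" matches.  Result: 5
echo match(text, '\<foo\>')

" "g*" style: no word boundaries, so the "foo" inside "xfoo" matches first.
" Result: 1
echo match(text, 'foo')
```

match() returns the byte index of the first match, counting from zero.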

							*gd*
gd			Goto local Declaration.  When the cursor is on a local
			variable, this command will jump to its declaration.
			First Vim searches for the start of the current
			function, just like "[[".  If it is not found the
			search stops in line 1.  If it is found, Vim goes back
			until a blank line is found.  From this position Vim
			searches for the keyword under the cursor, like with
			"*", but lines that look like a comment are ignored
			(see 'comments' option).
			Note that this is not guaranteed to work: Vim does not
			really check the syntax, it only searches for a match
			with the keyword.  If included files also need to be
			searched use the commands listed in |include-search|.
			After this command |n| searches forward for the next
			match (not backward).

							*gD*
gD			Goto global Declaration.  When the cursor is on a
			global variable that is defined in the file, this
			command will jump to its declaration.  This works just
			like "gd", except that the search for the keyword
			always starts in line 1.

							*1gd*
1gd			Like "gd", but ignore matches inside a {} block that
			ends before the cursor position.

							*1gD*
1gD			Like "gD", but ignore matches inside a {} block that
			ends before the cursor position.

							*CTRL-C*
CTRL-C			Interrupt current (search) command.  Use CTRL-Break on
			MS-Windows |dos-CTRL-Break|.
			In Normal mode, any pending command is aborted.

							*:noh* *:nohlsearch*
:noh[lsearch]		Stop the highlighting for the 'hlsearch' option.  It
			is automatically turned back on when using a search
			command, or setting the 'hlsearch' option.
			This command doesn't work in an autocommand, because
			the highlighting state is saved and restored when
			executing autocommands |autocmd-searchpat|.
			The same applies when invoking a user function.

While typing the search pattern the current match will be shown if the
'incsearch' option is on.  Remember that you still have to finish the search
command with <CR> to actually position the cursor at the displayed match.  Or
use <Esc> to abandon the search.

All matches for the last used search pattern will be highlighted if you set
the 'hlsearch' option.  This can be suspended with the |:nohlsearch| command.

When 'shortmess' does not include the "S" flag, Vim will automatically show
the index of the match the cursor is on.  This can look like this: >

  [1/5]		Cursor is on first of 5 matches.
  [1/>99]	Cursor is on first of more than 99 matches.
  [>99/>99]	Cursor is after match 99 of more than 99 matches.
  [?/??]	Unknown how many matches exist, generating the
		statistics was aborted because of search timeout.

Note: the count does not take offset into account.

When no match is found you get the error: *E486* Pattern not found
Note that for the |:global| command this behaves like a normal message, for Vi
compatibility.  For the |:s| command the "e" flag can be used to avoid the
error message |:s_flags|.
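
In a script the two ways of dealing with E486 could look roughly like this.
This is a sketch under assumptions: the scratch buffer, its text and the
pattern "nomatch" are made up, and catching the error with |:try| is one
possible approach, not something this section prescribes.

```vim
" Sketch only: the scratch buffer and its text are made up.
new
call setline(1, 'some text')

" With the "e" flag a failed substitute stays silent.
s/nomatch/x/e

" Without the flag the failure can be caught as an exception.
try
  s/nomatch/x/
catch /E486:/
  echo 'pattern not found'
endtry

bwipe!
```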

					*search-offset* *{offset}*
These commands search for the specified pattern.  With "/" and "?" an
additional offset may be given.  There are two types of offsets: line offsets
and character offsets.

The offset gives the cursor position relative to the found match:
    [num]	[num] lines downwards, in column 1
    +[num]	[num] lines downwards, in column 1
    -[num]	[num] lines upwards, in column 1
    e[+num]	[num] characters to the right of the end of the match
    e[-num]	[num] characters to the left of the end of the match
    s[+num]	[num] characters to the right of the start of the match
    s[-num]	[num] characters to the left of the start of the match
    b[+num]	[num] identical to s[+num] above (mnemonic: begin)
    b[-num]	[num] identical to s[-num] above (mnemonic: begin)
    ;{pattern}  perform another search, see |//;|

If a '-' or '+' is given but [num] is omitted, a count of one will be used.
When including an offset with 'e', the search becomes inclusive (the
character the cursor lands on is included in operations).

Examples:

pattern			cursor position	~
/test/+1		one line below "test", in column 1
/test/e			on the last t of "test"
/test/s+2		on the 's' of "test"
/test/b-3		three characters before "test"
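
The offsets in the table above can be tried out from a script.  The following
is a sketch only: the buffer contents are made up, and |:normal| with
|:execute| is used here merely as a way to type the search command from a
script.

```vim
" Sketch only: the buffer contents are made up.
new
call setline(1, ['one test here', 'second line'])

" /test/e: the cursor ends on the last "t" of "test", column 8.
call cursor(1, 1)
execute "normal! /test/e\<CR>"
echo col('.')

" /test/s+2: the cursor ends on the "s", column 7.
call cursor(1, 1)
execute "normal! /test/s+2\<CR>"
echo col('.')

" /test/+1: the cursor ends one line below the match, in line 2.
call cursor(1, 1)
execute "normal! /test/+1\<CR>"
echo line('.')

bwipe!
```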

If one of these commands is used after an operator, the characters between
the cursor position before and after the search are affected.  However, if a
line offset is given, the whole lines between the two cursor positions are
affected.

An example of how to search for matches with a pattern and change the match
with another word: >
	/foo<CR>	find "foo"
	c//e<CR>	change until end of match
	bar<Esc>	type replacement
	//<CR>		go to start of next match
	c//e<CR>	change until end of match
	beep<Esc>	type another replacement
			etc.
<
							*//;* *E386*
A very special offset is ';' followed by another search command.  For example: >

   /test 1/;/test
   /test.*/+1;?ing?

The first one first finds the next occurrence of "test 1", and then the first
occurrence of "test" after that.

This is like executing two search commands after each other, except that:
- It can be used as a single motion command after an operator.
- The direction for a following "n" or "N" command comes from the first
  search command.
- When an error occurs the cursor is not moved at all.
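
A small sketch of the first example, run from a script on a made-up buffer
(the four lines of text are assumptions for illustration):

```vim
" Sketch only: the buffer contents are made up.
new
call setline(1, ['start', 'test 1', 'other', 'test 2'])
call cursor(1, 1)

" First jump to "test 1" in line 2, then to the next "test" after it.
" The cursor ends in line 4.
execute "normal! /test 1/;/test\<CR>"
echo line('.')

bwipe!
```

Because both searches form one command, an operator in front of it (for
example "d") would act on the text up to the final position.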

							*last-pattern*
The last used pattern and offset are remembered.  They can be used to repeat
the search, possibly in another direction or with another count.  Note that
two patterns are remembered: One for "normal" search commands and one for the
substitute command ":s".  Each time an empty pattern is given, the previously
used pattern is used.  However, if there is no previous search command, a
previous substitute pattern is used, if possible.

The 'magic' option sticks with the last used pattern.  If you change 'magic',
this will not change how the last used pattern will be interpreted.
The 'ignorecase' option does not do this.  When 'ignorecase' is changed, the
pattern will match different text.

All matches for the last used search pattern will be highlighted if you set
the 'hlsearch' option.

To clear the last used search pattern: >
	:let @/ = ""
This will not set the pattern to an empty string, because that would match
everywhere.  The pattern is really cleared, like when starting Vim.

The search usually skips matches that don't move the cursor.  Whether the next
match is found at the next character or after the skipped match depends on the
'c' flag in 'cpoptions'.  See |cpo-c|.
	   with 'c' flag:   "/..." advances 1 to 3 characters
	without 'c' flag:   "/..." advances 1 character
The unpredictability with the 'c' flag is caused by starting the search in the
first column, skipping matches until one is found past the cursor position.

When searching backwards, searching starts at the start of the line, using the
'c' flag in 'cpoptions' as described above.  Then the last match before the
cursor position is used.

In Vi the ":tag" command sets the last search pattern when the tag is searched
for.  In Vim this is not done, the previous search pattern is still remembered,
unless the 't' flag is present in 'cpoptions'.  The search pattern is always
put in the search history.

If the 'wrapscan' option is on (which is the default), searches wrap around
the end of the buffer.  If 'wrapscan' is not set, the backward search stops
at the beginning and the forward search stops at the end of the buffer.  If
'wrapscan' is set and the pattern was not found the error message "pattern
not found" is given, and the cursor will not be moved.  If 'wrapscan' is not
set the message becomes "search hit BOTTOM without match" when searching
forward, or "search hit TOP without match" when searching backward.  If
'wrapscan' is set and the search wraps around the end of the file the message
"search hit TOP, continuing at BOTTOM" or "search hit BOTTOM, continuing at
TOP" is given when searching backwards or forwards respectively.  This can be
switched off by setting the 's' flag in the 'shortmess' option.  The highlight
method 'w' is used for this message (default: standout).
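
The effect of 'wrapscan' can be sketched from a script.  The built-in
search() function is used here as a stand-in for typing "/"; it follows
'wrapscan' when called without the "w" or "W" flag.  The buffer contents are
made up for illustration.

```vim
" Sketch only: the buffer contents are made up.
new
call setline(1, ['target', 'cursor starts here'])
let save_ws = &wrapscan

" Without 'wrapscan' a forward search from line 2 stops at the end of the
" buffer.  Result: 0
set nowrapscan
call cursor(2, 1)
echo search('target')

" With 'wrapscan' the same search wraps around to line 1.  Result: 1
set wrapscan
call cursor(2, 1)
echo search('target')

let &wrapscan = save_ws
bwipe!
```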

							*search-range*
You can limit the search command "/" to a certain range of lines by including
\%>l items.  For example, to match the word "limit" below line 199 and above
line 300: >
	/\%>199l\%<300llimit
Also see |/\%>l|.
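
The same pattern can be checked from a script.  This sketch assumes a made-up
buffer of 400 lines that all contain "limit", and uses the built-in search()
function in place of the "/" command:

```vim
" Sketch only: a made-up buffer of 400 lines, each containing "limit".
new
call setline(1, map(range(1, 400), '"limit " .. v:val'))
call cursor(1, 1)

" Lines 1 to 199 also contain "limit", but only lines 200 to 299 can match.
" Result: 200
echo search('\%>199l\%<300llimit')

bwipe!
```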

Another way is to use the ":substitute" command with the 'c' flag.  Example: >
   :.,300s/Pattern//gc
This command will search from the current line until line 300 for
"Pattern".  At the match, you will be asked to type a character.  Type 'q' to
stop at this match, type 'n' to find the next match.

The "*", "#", "g*" and "g#" commands look for a word near the cursor in this
order, the first one that is found is used:
- The keyword currently under the cursor.
- The first keyword to the right of the cursor, in the same line.
- The WORD currently under the cursor.
- The first WORD to the right of the cursor, in the same line.
The keyword may only contain letters and characters in 'iskeyword'.
The WORD may contain any non-blank characters (blanks are <Tab>s and
<Space>s).
Note that if you type with ten fingers, the characters are easy to remember:
the "#" is under your left hand middle finger (search to the left and up) and
the "*" is under your right hand middle finger (search to the right and down).
(This depends on your keyboard layout though.)

								*E956*
In very rare cases a regular expression is used recursively.  This can happen
when executing a pattern takes a long time and when checking for messages on
channels a callback is invoked that also uses a pattern or an autocommand is
triggered.  In most cases this should be fine, but if a pattern is in use when
it's used again it fails.  Usually this means there is something wrong with
the pattern.

==============================================================================
2. The definition of a pattern		*search-pattern* *pattern* *[pattern]*
					*regular-expression* *regexp* *Pattern*
					*E76* *E383* *E476*

For starters, read chapter 27 of the user manual |usr_27.txt|.

						*/bar* */\bar* */pattern*
1. A pattern is one or more branches, separated by "\|".  It matches anything
   that matches one of the branches.  Example: "foo\|beep" matches "foo" and
   matches "beep".  If more than one branch matches, the first one is used.

   pattern ::=	    branch
		or  branch \| branch
		or  branch \| branch \| branch
		etc.

						*/branch* */\&*
2. A branch is one or more concats, separated by "\&".  It matches the last
   concat, but only if all the preceding concats also match at the same
   position.  Examples:
	"foobeep\&..." matches "foo" in "foobeep".
	".*Peter\&.*Bob" matches in a line containing both "Peter" and "Bob"

   branch ::=	    concat
		or  concat \& concat
		or  concat \& concat \& concat
		etc.
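
The examples for "\|" and "\&" can be checked with the built-in matchstr()
function and the "=~#" operator.  This is only a sketch: the text "Bob and
Peter" is made up, the other strings come from the examples above.

```vim
" Two branches: the first one that matches is used.  Result: foo
echo matchstr('foobeep', 'foo\|beep')

" \& requires "foobeep" to match at this position, but only the last
" concat "..." becomes the match.  Result: foo
echo matchstr('foobeep', 'foobeep\&...')

" Both names must occur in the text, in any order.  Result: 1
echo 'Bob and Peter' =~# '.*Peter\&.*Bob'
```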

						*/concat*
3. A concat is one or more pieces, concatenated.  It matches a match for the
   first piece, followed by a match for the second piece, etc.  Example:
   "f[0-9]b", first matches "f", then a digit and then "b".

   concat  ::=	    piece
		or  piece piece
		or  piece piece piece
		etc.

						*/piece*
4. A piece is an atom, possibly followed by a multi, an indication of how many
   times the atom can be matched.  Example: "a*" matches any sequence of "a"
   characters: "", "a", "aa", etc.  See |/multi|.

   piece   ::=	    atom
		or  atom  multi
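
A short sketch of a concat and a piece, using the built-in matchstr()
function on made-up strings:

```vim
" Concat of three pieces: "f", a digit, "b".  Result: f7b
echo matchstr('xf7b', 'f[0-9]b')

" Piece "a*": as many "a" characters as possible.  Result: aaa
echo matchstr('aaab', 'a*')

" "a*" also matches the empty string, here at the very start of the text,
" so the result is empty.
echo matchstr('baaa', 'a*')
```

The last call shows why a pattern that can match nothing often needs an
anchor or a "\+" instead of "*".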

						*/atom*
5. An atom can be one of a long list of items.  Many atoms match one character
   in the text.  It is often an ordinary character or a character class.
   Parentheses can be used to make a pattern into an atom.  The "\z(\)"
   construct is only for syntax highlighting.

   atom    ::=	    ordinary-atom		|/ordinary-atom|
		or  \( pattern \)		|/\(|
		or  \%( pattern \)		|/\%(|
		or  \z( pattern \)		|/\z(|


				*/\%#=* *two-engines* *NFA*
Vim includes two regexp engines:
1. An old, backtracking engine that supports everything.
2. A new, NFA engine that works much faster on some patterns, possibly slower
   on some patterns.

Vim will automatically select the right engine for you.  However, if you run
into a problem or want to specifically select one engine or the other, you can
prepend one of the following to the pattern:

	\%#=0	Force automatic selection.  Only has an effect when
	        'regexpengine' has been set to a non-zero value.
	\%#=1	Force using the old engine.
	\%#=2	Force using the NFA engine.

You can also use the 'regexpengine' option to change the default.
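
A sketch of both ways to select an engine, using the built-in matchstr()
function on a made-up string.  For a simple pattern like this both engines
are expected to give the same result:

```vim
" The same pattern, forced through each engine.  Both results: oo
echo matchstr('foobar', '\%#=1o\+')
echo matchstr('foobar', '\%#=2o\+')

" Changing the default for all patterns, then restoring the old value.
let save_re = &regexpengine
set regexpengine=1
echo matchstr('foobar', 'o\+')
let &regexpengine = save_re
```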

			 *E864* *E868* *E874* *E875* *E876* *E877* *E878*
If the NFA engine is selected and it runs into something that is not
implemented, the pattern will not match.  This is only useful when debugging
Vim.

==============================================================================
3. Magic							*/magic*

Some characters in the pattern, such as letters, are taken literally.  They
match exactly the same character in the text.  When preceded with a backslash
however, these characters may get a special meaning.  For example, "a" matches
the letter "a", while "\a" matches any alphabetic character.

Other characters have a special meaning without a backslash.  They need to be
preceded with a backslash to match literally.  For example "." matches any
character while "\." matches a dot.

Whether a character is taken literally or not depends on the 'magic' option
and the items in the pattern mentioned next.  The 'magic' option should always
be set, but it can be switched off for Vi compatibility.  We mention the
effect of 'nomagic' here for completeness, but we recommend against using that.
							*/\m* */\M*
Use of "\m" makes the pattern after it be interpreted as if 'magic' is set,
ignoring the actual value of the 'magic' option.
Use of "\M" makes the pattern after it be interpreted as if 'nomagic' is used.
							*/\v* */\V*
Use of "\v" means that after it, all ASCII characters except '0'-'9', 'a'-'z',
'A'-'Z' and '_' have special meaning: "very magic"

Use of "\V" means that after it, only a backslash and the terminating
character (usually / or ?) have special meaning: "very nomagic"

Examples:
after:	  \v	   \m	    \M	     \V		matches ~
		'magic' 'nomagic'
	  a	   a	    a	     a		literal 'a'
	  \a	   \a	    \a	     \a		any alphabetic character
	  .	   .	    \.	     \.		any character
	  \.	   \.	    .	     .		literal dot
	  $	   $	    $	     \$		end-of-line
	  *	   *	    \*	     \*		any number of the previous atom
	  ~	   ~	    \~	     \~		latest substitute string
	  ()	   \(\)     \(\)     \(\)	group as an atom
	  |	   \|	    \|	     \|		nothing: separates alternatives
	  \\	   \\	    \\	     \\		literal backslash
	  \{	   {	    {	     {		literal curly brace

{only Vim supports \m, \M, \v and \V}

If you want to you can make a pattern immune to the 'magic' option being set
or not by putting "\m" or "\M" at the start of the pattern.
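
A few rows of the table above, tried out with the built-in matchstr()
function.  The sample strings are made up; this is a sketch, not a complete
test of all four modes.

```vim
" "." is special after \v and \m, so it matches the "x".  Both results: axb
echo matchstr('axb a.b', '\va.b')
echo matchstr('axb a.b', '\ma.b')

" After \M and \V the "." is literal.  Both results: a.b
echo matchstr('axb a.b', '\Ma.b')
echo matchstr('axb a.b', '\Va.b')

" Grouping and alternation: bare after \v, with backslashes after \m.
" Both results: bar
echo matchstr('a bar', '\v(foo|bar)')
echo matchstr('a bar', '\m\(foo\|bar\)')
```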

==============================================================================
4. Overview of pattern items				*pattern-overview*
						*E865* *E866* *E867* *E869*

Overview of multi items.				*/multi* *E61* *E62*
More explanation and examples below, follow the links.		*E64* *E871*

	  multi ~
     'magic' 'nomagic'	matches of the preceding atom ~
|/star|	*	\*	0 or more	as many as possible
|/\+|	\+	\+	1 or more	as many as possible
|/\=|	\=	\=	0 or 1		as many as possible
|/\?|	\?	\?	0 or 1		as many as possible

|/\{|	\{n,m}	\{n,m}	n to m		as many as possible
	\{n}	\{n}	n		exactly
	\{n,}	\{n,}	at least n	as many as possible
	\{,m}	\{,m}	0 to m		as many as possible
	\{}	\{}	0 or more	as many as possible (same as *)

|/\{-|	\{-n,m}	\{-n,m}	n to m		as few as possible
	\{-n}	\{-n}	n		exactly
	\{-n,}	\{-n,}	at least n	as few as possible
	\{-,m}	\{-,m}	0 to m		as few as possible
	\{-}	\{-}	0 or more	as few as possible

							*E59*
|/\@>|	\@>	\@>	1, like matching a whole pattern
|/\@=|	\@=	\@=	nothing, requires a match |/zero-width|
|/\@!|	\@!	\@!	nothing, requires NO match |/zero-width|
|/\@<=|	\@<=	\@<=	nothing, requires a match behind |/zero-width|
|/\@<!|	\@<!	\@<!	nothing, requires NO match behind |/zero-width|
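
Some of the multi items above, sketched with the built-in matchstr() and
match() functions on made-up strings:

```vim
" Greedy \{n,m} takes as many as possible.  Result: aaa
echo matchstr('aaaa', 'a\{2,3}')

" Non-greedy \{-n,m} takes as few as possible.  Result: aa
echo matchstr('aaaa', 'a\{-2,3}')

" ".*" runs up to the last ">", ".\{-}" stops at the first one.
" Results: <a><b> and <a>
echo matchstr('<a><b>', '<.*>')
echo matchstr('<a><b>', '<.\{-}>')

" \@= requires "bar" to follow, without including it.  Result: foo
echo matchstr('foobar', 'foo\(bar\)\@=')

" \@! rejects "foo" when "baz" follows, so the second "foo" is found.
" Result: 7
echo match('foobaz foobar', 'foo\(baz\)\@!')
```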


Overview of ordinary atoms.				*/ordinary-atom*
More explanation and examples below, follow the links.

      ordinary atom ~
      magic   nomagic	matches ~
|/^|	^	^	start-of-line (at start of pattern) |/zero-width|
|/\^|	\^	\^	literal '^'
|/\_^|	\_^	\_^	start-of-line (used anywhere) |/zero-width|
|/$|	$	$	end-of-line (at end of pattern) |/zero-width|
|/\$|	\$	\$	literal '$'
|/\_$|	\_$	\_$	end-of-line (used anywhere) |/zero-width|
|/.|	.	\.	any single character (not an end-of-line)
|/\_.|	\_.	\_.	any single character or end-of-line
|/\<|	\<	\<	beginning of a word |/zero-width|
|/\>|	\>	\>	end of a word |/zero-width|
|/\zs|	\zs	\zs	anything, sets start of match
|/\ze|	\ze	\ze	anything, sets end of match
|/\%^|	\%^	\%^	beginning of file |/zero-width|		*E71*
|/\%$|	\%$	\%$	end of file |/zero-width|
|/\%V|	\%V	\%V	inside Visual area |/zero-width|
|/\%#|	\%#	\%#	cursor position |/zero-width|
|/\%'m|	\%'m	\%'m	mark m position |/zero-width|
|/\%l|	\%23l	\%23l	in line 23 |/zero-width|
|/\%c|	\%23c	\%23c	in column 23 |/zero-width|
|/\%v|	\%23v	\%23v	in virtual column 23 |/zero-width|
501071d4279SBram Moolenaar
50225c9c680SBram MoolenaarCharacter classes:					*/character-classes*
503256972a9SBram Moolenaar      magic   nomagic	matches ~
504071d4279SBram Moolenaar|/\i|	\i	\i	identifier character (see 'isident' option)
505071d4279SBram Moolenaar|/\I|	\I	\I	like "\i", but excluding digits
506071d4279SBram Moolenaar|/\k|	\k	\k	keyword character (see 'iskeyword' option)
507071d4279SBram Moolenaar|/\K|	\K	\K	like "\k", but excluding digits
508071d4279SBram Moolenaar|/\f|	\f	\f	file name character (see 'isfname' option)
509071d4279SBram Moolenaar|/\F|	\F	\F	like "\f", but excluding digits
510071d4279SBram Moolenaar|/\p|	\p	\p	printable character (see 'isprint' option)
511071d4279SBram Moolenaar|/\P|	\P	\P	like "\p", but excluding digits
512071d4279SBram Moolenaar|/\s|	\s	\s	whitespace character: <Space> and <Tab>
513071d4279SBram Moolenaar|/\S|	\S	\S	non-whitespace character; opposite of \s
514071d4279SBram Moolenaar|/\d|	\d	\d	digit:				[0-9]
515071d4279SBram Moolenaar|/\D|	\D	\D	non-digit:			[^0-9]
516071d4279SBram Moolenaar|/\x|	\x	\x	hex digit:			[0-9A-Fa-f]
517071d4279SBram Moolenaar|/\X|	\X	\X	non-hex digit:			[^0-9A-Fa-f]
518071d4279SBram Moolenaar|/\o|	\o	\o	octal digit:			[0-7]
519071d4279SBram Moolenaar|/\O|	\O	\O	non-octal digit:		[^0-7]
520071d4279SBram Moolenaar|/\w|	\w	\w	word character:			[0-9A-Za-z_]
521071d4279SBram Moolenaar|/\W|	\W	\W	non-word character:		[^0-9A-Za-z_]
522071d4279SBram Moolenaar|/\h|	\h	\h	head of word character:		[A-Za-z_]
523071d4279SBram Moolenaar|/\H|	\H	\H	non-head of word character:	[^A-Za-z_]
524071d4279SBram Moolenaar|/\a|	\a	\a	alphabetic character:		[A-Za-z]
525071d4279SBram Moolenaar|/\A|	\A	\A	non-alphabetic character:	[^A-Za-z]
526071d4279SBram Moolenaar|/\l|	\l	\l	lowercase character:		[a-z]
527071d4279SBram Moolenaar|/\L|	\L	\L	non-lowercase character:	[^a-z]
528071d4279SBram Moolenaar|/\u|	\u	\u	uppercase character:		[A-Z]
|/\U|	\U	\U	non-uppercase character:	[^A-Z]
530071d4279SBram Moolenaar|/\_|	\_x	\_x	where x is any of the characters above: character
531071d4279SBram Moolenaar			class with end-of-line included
532071d4279SBram Moolenaar(end of character classes)
533071d4279SBram Moolenaar
534256972a9SBram Moolenaar      magic   nomagic	matches ~
535071d4279SBram Moolenaar|/\e|	\e	\e	<Esc>
536071d4279SBram Moolenaar|/\t|	\t	\t	<Tab>
537071d4279SBram Moolenaar|/\r|	\r	\r	<CR>
538071d4279SBram Moolenaar|/\b|	\b	\b	<BS>
539071d4279SBram Moolenaar|/\n|	\n	\n	end-of-line
540071d4279SBram Moolenaar|/~|	~	\~	last given substitute string
54125c9c680SBram Moolenaar|/\1|	\1	\1	same string as matched by first \(\)
542071d4279SBram Moolenaar|/\2|	\2	\2	Like "\1", but uses second \(\)
543071d4279SBram Moolenaar	   ...
544071d4279SBram Moolenaar|/\9|	\9	\9	Like "\1", but uses ninth \(\)
545071d4279SBram Moolenaar								*E68*
546071d4279SBram Moolenaar|/\z1|	\z1	\z1	only for syntax highlighting, see |:syn-ext-match|
547071d4279SBram Moolenaar	   ...
|/\z9|	\z9	\z9	only for syntax highlighting, see |:syn-ext-match|
549071d4279SBram Moolenaar
550071d4279SBram Moolenaar	x	x	a character with no special meaning matches itself
551071d4279SBram Moolenaar
552071d4279SBram Moolenaar|/[]|	[]	\[]	any character specified inside the []
553c0197e28SBram Moolenaar|/\%[]|	\%[]	\%[]	a sequence of optionally matched atoms
554071d4279SBram Moolenaar
5553577c6faSBram Moolenaar|/\c|	\c	\c	ignore case, do not use the 'ignorecase' option
5563577c6faSBram Moolenaar|/\C|	\C	\C	match case, do not use the 'ignorecase' option
557fbc0d2eaSBram Moolenaar|/\Z|	\Z	\Z	ignore differences in Unicode "combining characters".
558fbc0d2eaSBram Moolenaar			Useful when searching voweled Hebrew or Arabic text.
559fbc0d2eaSBram Moolenaar
560256972a9SBram Moolenaar      magic   nomagic	matches ~
561071d4279SBram Moolenaar|/\m|	\m	\m	'magic' on for the following chars in the pattern
562071d4279SBram Moolenaar|/\M|	\M	\M	'magic' off for the following chars in the pattern
563071d4279SBram Moolenaar|/\v|	\v	\v	the following chars in the pattern are "very magic"
564071d4279SBram Moolenaar|/\V|	\V	\V	the following chars in the pattern are "very nomagic"
565fbc0d2eaSBram Moolenaar|/\%#=|   \%#=1   \%#=1   select regexp engine |/zero-width|
566071d4279SBram Moolenaar
5678f3f58f2SBram Moolenaar|/\%d|	\%d	\%d	match specified decimal character (eg \%d123)
568c0197e28SBram Moolenaar|/\%x|	\%x	\%x	match specified hex character (eg \%x2a)
569c0197e28SBram Moolenaar|/\%o|	\%o	\%o	match specified octal character (eg \%o040)
570c0197e28SBram Moolenaar|/\%u|	\%u	\%u	match specified multibyte character (eg \%u20ac)
571c0197e28SBram Moolenaar|/\%U|	\%U	\%U	match specified large multibyte character (eg
572c0197e28SBram Moolenaar			\%U12345678)
5738df5acfdSBram Moolenaar|/\%C|	\%C	\%C	match any composing characters
574071d4279SBram Moolenaar
575071d4279SBram MoolenaarExample			matches ~
576071d4279SBram Moolenaar\<\I\i*		or
577071d4279SBram Moolenaar\<\h\w*
578071d4279SBram Moolenaar\<[a-zA-Z_][a-zA-Z0-9_]*
579071d4279SBram Moolenaar			An identifier (e.g., in a C program).
580071d4279SBram Moolenaar
581071d4279SBram Moolenaar\(\.$\|\. \)		A period followed by <EOL> or a space.
582071d4279SBram Moolenaar
583071d4279SBram Moolenaar[.!?][])"']*\($\|[ ]\)	A search pattern that finds the end of a sentence,
584071d4279SBram Moolenaar			with almost the same definition as the ")" command.
585071d4279SBram Moolenaar
586071d4279SBram Moolenaarcat\Z			Both "cat" and "càt" ("a" followed by 0x0300)
587071d4279SBram Moolenaar			Does not match "càt" (character 0x00e0), even
588071d4279SBram Moolenaar			though it may look the same.
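
A minimal sketch, assuming the default 'iskeyword' value, that checks two of
the examples above against made-up sample strings with |matchstr()|: >
	" Identifier: a head-of-word character followed by word characters.
	:echo matchstr('  foo_bar1 = 2', '\<\h\w*')
	" Echoes: foo_bar1
	" A period followed by a space or by the end of the string.
	:echo matchstr('End. Next', '\(\.$\|\. \)')
	" Echoes a period followed by a space
<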
589071d4279SBram Moolenaar
590071d4279SBram Moolenaar
591071d4279SBram Moolenaar==============================================================================
592071d4279SBram Moolenaar5. Multi items						*pattern-multi-items*
593071d4279SBram Moolenaar
594071d4279SBram MoolenaarAn atom can be followed by an indication of how many times the atom can be
595071d4279SBram Moolenaarmatched and in what way.  This is called a multi.  See |/multi| for an
596071d4279SBram Moolenaaroverview.
597071d4279SBram Moolenaar
598aa3b15dbSBram Moolenaar							*/star* */\star*
599071d4279SBram Moolenaar*	(use \* when 'magic' is not set)
600071d4279SBram Moolenaar	Matches 0 or more of the preceding atom, as many as possible.
601071d4279SBram Moolenaar	Example  'nomagic'	matches ~
602071d4279SBram Moolenaar	a*	   a\*		"", "a", "aa", "aaa", etc.
603071d4279SBram Moolenaar	.*	   \.\*		anything, also an empty string, no end-of-line
604071d4279SBram Moolenaar	\_.*	   \_.\*	everything up to the end of the buffer
605071d4279SBram Moolenaar	\_.*END	   \_.\*END	everything up to and including the last "END"
606071d4279SBram Moolenaar				in the buffer
607071d4279SBram Moolenaar
608071d4279SBram Moolenaar	Exception: When "*" is used at the start of the pattern or just after
609071d4279SBram Moolenaar	"^" it matches the star character.
610071d4279SBram Moolenaar
611071d4279SBram Moolenaar	Be aware that repeating "\_." can match a lot of text and take a long
612071d4279SBram Moolenaar	time.  For example, "\_.*END" matches all text from the current
613071d4279SBram Moolenaar	position to the last occurrence of "END" in the file.  Since the "*"
614071d4279SBram Moolenaar	will match as many as possible, this first skips over all lines until
615071d4279SBram Moolenaar	the end of the file and then tries matching "END", backing up one
616071d4279SBram Moolenaar	character at a time.
617071d4279SBram Moolenaar
618aa3b15dbSBram Moolenaar							*/\+*
61925c9c680SBram Moolenaar\+	Matches 1 or more of the preceding atom, as many as possible.
620071d4279SBram Moolenaar	Example		matches ~
621071d4279SBram Moolenaar	^.\+$		any non-empty line
622071d4279SBram Moolenaar	\s\+		white space of at least one character
623071d4279SBram Moolenaar
624071d4279SBram Moolenaar							*/\=*
62525c9c680SBram Moolenaar\=	Matches 0 or 1 of the preceding atom, as many as possible.
626071d4279SBram Moolenaar	Example		matches ~
627071d4279SBram Moolenaar	foo\=		"fo" and "foo"
628071d4279SBram Moolenaar
629071d4279SBram Moolenaar							*/\?*
630071d4279SBram Moolenaar\?	Just like \=.  Cannot be used when searching backwards with the "?"
63125c9c680SBram Moolenaar	command.
632071d4279SBram Moolenaar
633aa3b15dbSBram Moolenaar					*/\{* *E60* *E554* *E870*
634071d4279SBram Moolenaar\{n,m}	Matches n to m of the preceding atom, as many as possible
635071d4279SBram Moolenaar\{n}	Matches n of the preceding atom
636071d4279SBram Moolenaar\{n,}	Matches at least n of the preceding atom, as many as possible
637071d4279SBram Moolenaar\{,m}	Matches 0 to m of the preceding atom, as many as possible
638071d4279SBram Moolenaar\{}	Matches 0 or more of the preceding atom, as many as possible (like *)
639071d4279SBram Moolenaar							*/\{-*
\{-n,m}	Matches n to m of the preceding atom, as few as possible
\{-n}	Matches n of the preceding atom
\{-n,}	Matches at least n of the preceding atom, as few as possible
\{-,m}	Matches 0 to m of the preceding atom, as few as possible
\{-}	Matches 0 or more of the preceding atom, as few as possible
645071d4279SBram Moolenaar
64626a60b45SBram Moolenaar	n and m are positive decimal numbers or zero
647c81e5e79SBram Moolenaar								*non-greedy*
648071d4279SBram Moolenaar	If a "-" appears immediately after the "{", then a shortest match
649071d4279SBram Moolenaar	first algorithm is used (see example below).  In particular, "\{-}" is
650071d4279SBram Moolenaar	the same as "*" but uses the shortest match first algorithm.  BUT: A
651071d4279SBram Moolenaar	match that starts earlier is preferred over a shorter match: "a\{-}b"
652071d4279SBram Moolenaar	matches "aaab" in "xaaab".
653071d4279SBram Moolenaar
654071d4279SBram Moolenaar	Example			matches ~
655071d4279SBram Moolenaar	ab\{2,3}c		"abbc" or "abbbc"
6563577c6faSBram Moolenaar	a\{5}			"aaaaa"
6573577c6faSBram Moolenaar	ab\{2,}c		"abbc", "abbbc", "abbbbc", etc.
6583577c6faSBram Moolenaar	ab\{,3}c		"ac", "abc", "abbc" or "abbbc"
659071d4279SBram Moolenaar	a[bc]\{3}d		"abbbd", "abbcd", "acbcd", "acccd", etc.
660071d4279SBram Moolenaar	a\(bc\)\{1,2}d		"abcd" or "abcbcd"
661071d4279SBram Moolenaar	a[bc]\{-}[cd]		"abc" in "abcd"
662071d4279SBram Moolenaar	a[bc]*[cd]		"abcd" in "abcd"
663071d4279SBram Moolenaar
664071d4279SBram Moolenaar	The } may optionally be preceded with a backslash: \{n,m\}.
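
	A minimal sketch that reproduces the greedy and non-greedy examples
	above with |matchstr()|: >
		" Greedy: [bc]* takes "bc", then [cd] matches "d".
		:echo matchstr('abcd', 'a[bc]*[cd]')
		" Echoes: abcd
		" Non-greedy: [bc]\{-} takes only "b", [cd] matches "c".
		:echo matchstr('abcd', 'a[bc]\{-}[cd]')
		" Echoes: abc
		" An earlier start wins over a shorter match.
		:echo matchstr('xaaab', 'a\{-}b')
		" Echoes: aaab
<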
665071d4279SBram Moolenaar
666071d4279SBram Moolenaar							*/\@=*
66725c9c680SBram Moolenaar\@=	Matches the preceding atom with zero width.
668071d4279SBram Moolenaar	Like "(?=pattern)" in Perl.
669071d4279SBram Moolenaar	Example			matches ~
670071d4279SBram Moolenaar	foo\(bar\)\@=		"foo" in "foobar"
671071d4279SBram Moolenaar	foo\(bar\)\@=foo	nothing
672071d4279SBram Moolenaar							*/zero-width*
673071d4279SBram Moolenaar	When using "\@=" (or "^", "$", "\<", "\>") no characters are included
674071d4279SBram Moolenaar	in the match.  These items are only used to check if a match can be
675071d4279SBram Moolenaar	made.  This can be tricky, because a match with following items will
	be done in the same position.  The last example above will not match
	"foobarfoo", because it tries to match "foo" in the same position
	where "bar" matched.
679071d4279SBram Moolenaar
680071d4279SBram Moolenaar	Note that using "\&" works the same as using "\@=": "foo\&.." is the
681071d4279SBram Moolenaar	same as "\(foo\)\@=..".  But using "\&" is easier, you don't need the
6821b884a00SBram Moolenaar	parentheses.
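
	A minimal sketch, using made-up sample strings, of how a zero-width
	match leaves the match position unchanged: >
		" The "bar" is checked but not included in the match.
		:echo matchstr('foobar', 'foo\(bar\)\@=')
		" Echoes: foo
		" After "\@=" the second "foo" is tried where "bar" starts.
		:echo matchstr('foobarfoo', 'foo\(bar\)\@=foo')
		" Echoes an empty string (no match)
		" "\&" and "\@=" give the same result.
		:echo matchstr('foobar', 'foo\&..')
		:echo matchstr('foobar', '\(foo\)\@=..')
		" Both echo: fo
<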
683071d4279SBram Moolenaar
684071d4279SBram Moolenaar
685071d4279SBram Moolenaar							*/\@!*
686071d4279SBram Moolenaar\@!	Matches with zero width if the preceding atom does NOT match at the
68725c9c680SBram Moolenaar	current position. |/zero-width|
6881aeaf8c0SBram Moolenaar	Like "(?!pattern)" in Perl.
689071d4279SBram Moolenaar	Example			matches ~
690071d4279SBram Moolenaar	foo\(bar\)\@!		any "foo" not followed by "bar"
6911aeaf8c0SBram Moolenaar	a.\{-}p\@!		"a", "ap", "app", "appp", etc. not immediately
692251e1912SBram Moolenaar				followed by a "p"
693071d4279SBram Moolenaar	if \(\(then\)\@!.\)*$	"if " not followed by "then"
694071d4279SBram Moolenaar
695071d4279SBram Moolenaar	Using "\@!" is tricky, because there are many places where a pattern
696071d4279SBram Moolenaar	does not match.  "a.*p\@!" will match from an "a" to the end of the
697071d4279SBram Moolenaar	line, because ".*" can match all characters in the line and the "p"
698071d4279SBram Moolenaar	doesn't match at the end of the line.  "a.\{-}p\@!" will match any
6991aeaf8c0SBram Moolenaar	"a", "ap", "app", etc. that isn't followed by a "p", because the "."
700071d4279SBram Moolenaar	can match a "p" and "p\@!" doesn't match after that.
701071d4279SBram Moolenaar
702071d4279SBram Moolenaar	You can't use "\@!" to look for a non-match before the matching
703071d4279SBram Moolenaar	position: "\(foo\)\@!bar" will match "bar" in "foobar", because at the
704071d4279SBram Moolenaar	position where "bar" matches, "foo" does not match.  To avoid matching
	"foobar" you could use "\(foo\)\@!...bar", but that doesn't match a
	"bar" at the start of a line.  Use "\(foo\)\@<!bar" instead.
707071d4279SBram Moolenaar
7088e5af3e5SBram Moolenaar	Useful example: to find "foo" in a line that does not contain "bar": >
7098e5af3e5SBram Moolenaar		/^\%(.*bar\)\@!.*\zsfoo
7108e5af3e5SBram Moolenaar<	This pattern first checks that there is not a single position in the
7118e5af3e5SBram Moolenaar	line where "bar" matches.  If ".*bar" matches somewhere the \@! will
7128e5af3e5SBram Moolenaar	reject the pattern.  When there is no match any "foo" will be found.
7138e5af3e5SBram Moolenaar	The "\zs" is to have the match start just before "foo".
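
	A minimal sketch of the examples above with |match()|, which returns
	the byte index of the match or -1.  The sample strings are made up: >
		" The first "foo" is followed by "bar", the second is not.
		:echo match('foobar foobaz', 'foo\(bar\)\@!')
		" Echoes: 7
		" "foo" in a string that does not contain "bar".
		:echo match('some foo here', '^\%(.*bar\)\@!.*\zsfoo')
		" Echoes: 5
		:echo match('bar and foo', '^\%(.*bar\)\@!.*\zsfoo')
		" Echoes: -1
<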
7148e5af3e5SBram Moolenaar
715071d4279SBram Moolenaar							*/\@<=*
716071d4279SBram Moolenaar\@<=	Matches with zero width if the preceding atom matches just before what
71725c9c680SBram Moolenaar	follows. |/zero-width|
7181aeaf8c0SBram Moolenaar	Like "(?<=pattern)" in Perl, but Vim allows non-fixed-width patterns.
719071d4279SBram Moolenaar	Example			matches ~
720071d4279SBram Moolenaar	\(an\_s\+\)\@<=file	"file" after "an" and white space or an
721071d4279SBram Moolenaar				end-of-line
722071d4279SBram Moolenaar	For speed it's often much better to avoid this multi.  Try using "\zs"
723071d4279SBram Moolenaar	instead |/\zs|.  To match the same as the above example:
724071d4279SBram Moolenaar		an\_s\+\zsfile
725543b7ef7SBram Moolenaar	At least set a limit for the look-behind, see below.
726071d4279SBram Moolenaar
727071d4279SBram Moolenaar	"\@<=" and "\@<!" check for matches just before what follows.
728071d4279SBram Moolenaar	Theoretically these matches could start anywhere before this position.
729071d4279SBram Moolenaar	But to limit the time needed, only the line where what follows matches
730071d4279SBram Moolenaar	is searched, and one line before that (if there is one).  This should
731071d4279SBram Moolenaar	be sufficient to match most things and not be too slow.
732fb539273SBram Moolenaar
	In the old regexp engine the part of the pattern after "\@<=" and
	"\@<!" is checked for a match first, thus things like "\1" don't work
735fb539273SBram Moolenaar	to reference \(\) inside the preceding atom.  It does work the other
736fb539273SBram Moolenaar	way around:
737fb539273SBram Moolenaar	Bad example			matches ~
738fb539273SBram Moolenaar	\%#=1\1\@<=,\([a-z]\+\)		",abc" in "abc,abc"
739fb539273SBram Moolenaar
	However, the new regexp engine works differently.  It is better not
	to rely on this behavior; avoid \@<= where possible:
742071d4279SBram Moolenaar	Example				matches ~
743fb539273SBram Moolenaar	\([a-z]\+\)\zs,\1		",abc" in "abc,abc"
744071d4279SBram Moolenaar
745543b7ef7SBram Moolenaar\@123<=
746543b7ef7SBram Moolenaar	Like "\@<=" but only look back 123 bytes. This avoids trying lots
747543b7ef7SBram Moolenaar	of matches that are known to fail and make executing the pattern very
748543b7ef7SBram Moolenaar	slow.  Example, check if there is a "<" just before "span":
749543b7ef7SBram Moolenaar		/<\@1<=span
750543b7ef7SBram Moolenaar	This will try matching "<" only one byte before "span", which is the
751543b7ef7SBram Moolenaar	only place that works anyway.
752543b7ef7SBram Moolenaar	After crossing a line boundary, the limit is relative to the end of
753543b7ef7SBram Moolenaar	the line.  Thus the characters at the start of the line with the match
754543b7ef7SBram Moolenaar	are not counted (this is just to keep it simple).
755543b7ef7SBram Moolenaar	The number zero is the same as no limit.
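
	A minimal sketch comparing look-behind with "\zs" and with a limit,
	using made-up sample strings: >
		" Both patterns find "file" at byte index 3.
		:echo match('an file', '\(an\_s\+\)\@<=file')
		:echo match('an file', 'an\_s\+\zsfile')
		" "<" must be exactly one byte before "span".
		:echo match('<span>', '<\@1<=span')
		" Echoes: 1
		:echo match('a span', '<\@1<=span')
		" Echoes: -1
<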
756543b7ef7SBram Moolenaar
757071d4279SBram Moolenaar							*/\@<!*
758071d4279SBram Moolenaar\@<!	Matches with zero width if the preceding atom does NOT match just
759071d4279SBram Moolenaar	before what follows.  Thus this matches if there is no position in the
760071d4279SBram Moolenaar	current or previous line where the atom matches such that it ends just
76125c9c680SBram Moolenaar	before what follows.  |/zero-width|
7621aeaf8c0SBram Moolenaar	Like "(?<!pattern)" in Perl, but Vim allows non-fixed-width patterns.
763071d4279SBram Moolenaar	The match with the preceding atom is made to end just before the match
764071d4279SBram Moolenaar	with what follows, thus an atom that ends in ".*" will work.
765071d4279SBram Moolenaar	Warning: This can be slow (because many positions need to be checked
766543b7ef7SBram Moolenaar	for a match).  Use a limit if you can, see below.
767071d4279SBram Moolenaar	Example			matches ~
768071d4279SBram Moolenaar	\(foo\)\@<!bar		any "bar" that's not in "foobar"
7693577c6faSBram Moolenaar	\(\/\/.*\)\@<!in	"in" which is not after "//"
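
	A minimal sketch, using a made-up sample string: >
		" "bar" at index 3 follows "foo", "bar" at index 7 does not.
		:echo match('foobar bar', '\(foo\)\@<!bar')
		" Echoes: 7
<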
770071d4279SBram Moolenaar
771543b7ef7SBram Moolenaar\@123<!
772543b7ef7SBram Moolenaar	Like "\@<!" but only look back 123 bytes. This avoids trying lots of
773543b7ef7SBram Moolenaar	matches that are known to fail and make executing the pattern very
774543b7ef7SBram Moolenaar	slow.
775543b7ef7SBram Moolenaar
776071d4279SBram Moolenaar							*/\@>*
77725c9c680SBram Moolenaar\@>	Matches the preceding atom like matching a whole pattern.
7783577c6faSBram Moolenaar	Like "(?>pattern)" in Perl.
779071d4279SBram Moolenaar	Example		matches ~
	\(a*\)\@>a	nothing (the "a*" takes all the "a"s, there can't be
781071d4279SBram Moolenaar			another one following)
782071d4279SBram Moolenaar
783071d4279SBram Moolenaar	This matches the preceding atom as if it was a pattern by itself.  If
784071d4279SBram Moolenaar	it doesn't match, there is no retry with shorter sub-matches or
785071d4279SBram Moolenaar	anything.  Observe this difference: "a*b" and "a*ab" both match
786071d4279SBram Moolenaar	"aaab", but in the second case the "a*" matches only the first two
787071d4279SBram Moolenaar	"a"s.  "\(a*\)\@>ab" will not match "aaab", because the "a*" matches
788071d4279SBram Moolenaar	the "aaa" (as many "a"s as possible), thus the "ab" can't match.
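
	A minimal sketch of the difference described above: >
		" "a*" gives back one "a" so that "ab" can match.
		:echo matchstr('aaab', 'a*ab')
		" Echoes: aaab
		" "\(a*\)\@>" keeps all the "a"s, so "ab" cannot match.
		:echo matchstr('aaab', '\(a*\)\@>ab')
		" Echoes an empty string (no match)
<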
789071d4279SBram Moolenaar
790071d4279SBram Moolenaar
791071d4279SBram Moolenaar==============================================================================
792071d4279SBram Moolenaar6.  Ordinary atoms					*pattern-atoms*
793071d4279SBram Moolenaar
794071d4279SBram MoolenaarAn ordinary atom can be:
795071d4279SBram Moolenaar
796071d4279SBram Moolenaar							*/^*
797071d4279SBram Moolenaar^	At beginning of pattern or after "\|", "\(", "\%(" or "\n": matches
798071d4279SBram Moolenaar	start-of-line; at other positions, matches literal '^'. |/zero-width|
799071d4279SBram Moolenaar	Example		matches ~
800071d4279SBram Moolenaar	^beep(		the start of the C function "beep" (probably).
801071d4279SBram Moolenaar
802071d4279SBram Moolenaar							*/\^*
8031c6737b2SBram Moolenaar\^	Matches literal '^'.  Can be used at any position in the pattern, but
8041c6737b2SBram Moolenaar	not inside [].
805071d4279SBram Moolenaar
806071d4279SBram Moolenaar							*/\_^*
807071d4279SBram Moolenaar\_^	Matches start-of-line. |/zero-width|  Can be used at any position in
8081c6737b2SBram Moolenaar	the pattern, but not inside [].
809071d4279SBram Moolenaar	Example		matches ~
810071d4279SBram Moolenaar	\_s*\_^foo	white space and blank lines and then "foo" at
811071d4279SBram Moolenaar			start-of-line
812071d4279SBram Moolenaar
813071d4279SBram Moolenaar							*/$*
8143577c6faSBram Moolenaar$	At end of pattern or in front of "\|", "\)" or "\n" ('magic' on):
815071d4279SBram Moolenaar	matches end-of-line <EOL>; at other positions, matches literal '$'.
816071d4279SBram Moolenaar	|/zero-width|
817071d4279SBram Moolenaar
818071d4279SBram Moolenaar							*/\$*
8191c6737b2SBram Moolenaar\$	Matches literal '$'.  Can be used at any position in the pattern, but
8201c6737b2SBram Moolenaar	not inside [].
821071d4279SBram Moolenaar
822071d4279SBram Moolenaar							*/\_$*
823071d4279SBram Moolenaar\_$	Matches end-of-line. |/zero-width|  Can be used at any position in the
8241c6737b2SBram Moolenaar	pattern, but not inside [].  Note that "a\_$b" never matches, since
8251c6737b2SBram Moolenaar	"b" cannot match an end-of-line.  Use "a\nb" instead |/\n|.
826071d4279SBram Moolenaar	Example		matches ~
827071d4279SBram Moolenaar	foo\_$\_s*	"foo" at end-of-line and following white space and
828071d4279SBram Moolenaar			blank lines
829071d4279SBram Moolenaar
830071d4279SBram Moolenaar.	(with 'nomagic': \.)				*/.* */\.*
831071d4279SBram Moolenaar	Matches any single character, but not an end-of-line.
832071d4279SBram Moolenaar
833071d4279SBram Moolenaar							*/\_.*
834071d4279SBram Moolenaar\_.	Matches any single character or end-of-line.
835071d4279SBram Moolenaar	Careful: "\_.*" matches all text to the end of the buffer!
836071d4279SBram Moolenaar
837071d4279SBram Moolenaar							*/\<*
838071d4279SBram Moolenaar\<	Matches the beginning of a word: The next char is the first char of a
839071d4279SBram Moolenaar	word.  The 'iskeyword' option specifies what is a word character.
840071d4279SBram Moolenaar	|/zero-width|
841071d4279SBram Moolenaar
842071d4279SBram Moolenaar							*/\>*
843071d4279SBram Moolenaar\>	Matches the end of a word: The previous char is the last char of a
844071d4279SBram Moolenaar	word.  The 'iskeyword' option specifies what is a word character.
845071d4279SBram Moolenaar	|/zero-width|
846071d4279SBram Moolenaar
847071d4279SBram Moolenaar							*/\zs*
8481c6737b2SBram Moolenaar\zs	Matches at any position, but not inside [], and sets the start of the
8491c6737b2SBram Moolenaar	match there: The next char is the first char of the whole match.
8501c6737b2SBram Moolenaar	|/zero-width|
851071d4279SBram Moolenaar	Example: >
852071d4279SBram Moolenaar		/^\s*\zsif
853071d4279SBram Moolenaar<	matches an "if" at the start of a line, ignoring white space.
854071d4279SBram Moolenaar	Can be used multiple times, the last one encountered in a matching
855071d4279SBram Moolenaar	branch is used.  Example: >
856071d4279SBram Moolenaar		/\(.\{-}\zsFab\)\{3}
857071d4279SBram Moolenaar<	Finds the third occurrence of "Fab".
85834401ccaSBram Moolenaar	This cannot be followed by a multi. *E888*
85925c9c680SBram Moolenaar	{not available when compiled without the |+syntax| feature}
860071d4279SBram Moolenaar							*/\ze*
8611c6737b2SBram Moolenaar\ze	Matches at any position, but not inside [], and sets the end of the
8621c6737b2SBram Moolenaar	match there: The previous char is the last char of the whole match.
8631c6737b2SBram Moolenaar	|/zero-width|
864071d4279SBram Moolenaar	Can be used multiple times, the last one encountered in a matching
865071d4279SBram Moolenaar	branch is used.
866071d4279SBram Moolenaar	Example: "end\ze\(if\|for\)" matches the "end" in "endif" and
867071d4279SBram Moolenaar	"endfor".
8686e932461SBram Moolenaar	This cannot be followed by a multi. |E888|
86925c9c680SBram Moolenaar	{not available when compiled without the |+syntax| feature}
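
	A minimal sketch, using made-up sample strings, of how "\zs" and
	"\ze" limit what |matchstr()| returns and what |substitute()|
	replaces: >
		:echo matchstr('   if x', '^\s*\zsif')
		" Echoes: if
		:echo matchstr('endif', 'end\ze\(if\|for\)')
		" Echoes: end
		" Only the part before "\ze" is replaced.
		:echo substitute('endfor', 'end\ze\(if\|for\)', 'END', '')
		" Echoes: ENDfor
<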
870071d4279SBram Moolenaar
871071d4279SBram Moolenaar						*/\%^* *start-of-file*
872071d4279SBram Moolenaar\%^	Matches start of the file.  When matching with a string, matches the
87325c9c680SBram Moolenaar	start of the string.
874071d4279SBram Moolenaar	For example, to find the first "VIM" in a file: >
875071d4279SBram Moolenaar		/\%^\_.\{-}\zsVIM
876071d4279SBram Moolenaar<
877071d4279SBram Moolenaar						*/\%$* *end-of-file*
878071d4279SBram Moolenaar\%$	Matches end of the file.  When matching with a string, matches the
87925c9c680SBram Moolenaar	end of the string.
880071d4279SBram Moolenaar	Note that this does NOT find the last "VIM" in a file: >
881071d4279SBram Moolenaar		/VIM\_.\{-}\%$
882071d4279SBram Moolenaar<	It will find the next VIM, because the part after it will always
883071d4279SBram Moolenaar	match.  This one will find the last "VIM" in the file: >
884071d4279SBram Moolenaar		/VIM\ze\(\(VIM\)\@!\_.\)*\%$
885071d4279SBram Moolenaar<	This uses |/\@!| to ascertain that "VIM" does NOT match in any
886071d4279SBram Moolenaar	position after the first "VIM".
887071d4279SBram Moolenaar	Searching from the end of the file backwards is easier!
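
	A minimal sketch of the two "VIM" patterns above applied to a made-up
	string, where "\%^" and "\%$" match the start and end of the string: >
		" First "VIM": byte index 2.
		:echo match('x VIM y VIM', '\%^\_.\{-}\zsVIM')
		" Last "VIM": byte index 8.
		:echo match('x VIM y VIM', 'VIM\ze\(\(VIM\)\@!\_.\)*\%$')
<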
888071d4279SBram Moolenaar
88933aec765SBram Moolenaar						*/\%V*
\%V	Match inside the Visual area.  When Visual mode has already been
	stopped, match in the area that |gv| would reselect.
8928f3f58f2SBram Moolenaar	This is a |/zero-width| match.  To make sure the whole pattern is
893214641f7SBram Moolenaar	inside the Visual area put it at the start and just before the end of
894214641f7SBram Moolenaar	the pattern, e.g.: >
895214641f7SBram Moolenaar		/\%Vfoo.*ba\%Vr
896036986f1SBram Moolenaar<	This also works if only "foo bar" was Visually selected. This: >
897036986f1SBram Moolenaar		/\%Vfoo.*bar\%V
898214641f7SBram Moolenaar<	would match "foo bar" if the Visual selection continues after the "r".
899214641f7SBram Moolenaar	Only works for the current buffer.
90033aec765SBram Moolenaar
901071d4279SBram Moolenaar						*/\%#* *cursor-position*
902071d4279SBram Moolenaar\%#	Matches with the cursor position.  Only works when matching in a
90325c9c680SBram Moolenaar	buffer displayed in a window.
904071d4279SBram Moolenaar	WARNING: When the cursor is moved after the pattern was used, the
905071d4279SBram Moolenaar	result becomes invalid.  Vim doesn't automatically update the matches.
906071d4279SBram Moolenaar	This is especially relevant for syntax highlighting and 'hlsearch'.
907071d4279SBram Moolenaar	In other words: When the cursor moves the display isn't updated for
908071d4279SBram Moolenaar	this change.  An update is done for lines which are changed (the whole
909071d4279SBram Moolenaar	line is updated) or when using the |CTRL-L| command (the whole screen
910071d4279SBram Moolenaar	is updated).  Example, to highlight the word under the cursor: >
911071d4279SBram Moolenaar		/\k*\%#\k*
912071d4279SBram Moolenaar<	When 'hlsearch' is set and you move the cursor around and make changes
913071d4279SBram Moolenaar	this will clearly show when the match is updated or not.
914071d4279SBram Moolenaar
91533aec765SBram Moolenaar						*/\%'m* */\%<'m* */\%>'m*
91633aec765SBram Moolenaar\%'m	Matches with the position of mark m.
91733aec765SBram Moolenaar\%<'m	Matches before the position of mark m.
91833aec765SBram Moolenaar\%>'m	Matches after the position of mark m.
91933aec765SBram Moolenaar	Example, to highlight the text from mark 's to 'e: >
92033aec765SBram Moolenaar		/.\%>'s.*\%<'e..
92133aec765SBram Moolenaar<	Note that two dots are required to include mark 'e in the match.  That
92233aec765SBram Moolenaar	is because "\%<'e" matches at the character before the 'e mark, and
92333aec765SBram Moolenaar	since it's a |/zero-width| match it doesn't include that character.
92433aec765SBram Moolenaar	WARNING: When the mark is moved after the pattern was used, the result
92533aec765SBram Moolenaar	becomes invalid.  Vim doesn't automatically update the matches.
9261ef15e30SBram Moolenaar	Similar to moving the cursor for "\%#" |/\%#|.
92733aec765SBram Moolenaar
9287254067eSBram Moolenaar						*/\%l* */\%>l* */\%<l* *E951*
929071d4279SBram Moolenaar\%23l	Matches in a specific line.
9304770d09aSBram Moolenaar\%<23l	Matches above a specific line (lower line number).
9314770d09aSBram Moolenaar\%>23l	Matches below a specific line (higher line number).
93204db26b3SBram Moolenaar\%.l    Matches at the cursor line.
93304db26b3SBram Moolenaar\%<.l   Matches above the cursor line.
93404db26b3SBram Moolenaar\%>.l   Matches below the cursor line.
935*2286304cSBram Moolenaar	These six can be used to match specific lines in a buffer.  The "23"
93625c9c680SBram Moolenaar	can be any line number.  The first line is 1.
937071d4279SBram Moolenaar	WARNING: When inserting or deleting lines Vim does not automatically
938071d4279SBram Moolenaar	update the matches.  This means Syntax highlighting quickly becomes
	wrong.  Also, when the pattern refers to the cursor position (".")
	and the cursor moves, the display isn't updated for this change.  An
	update is done when using the |CTRL-L| command (the whole screen is
	updated).
942071d4279SBram Moolenaar	Example, to highlight the line where the cursor currently is: >
94304db26b3SBram Moolenaar		:exe '/\%' . line(".") . 'l'
94404db26b3SBram Moolenaar<	Alternatively use: >
94504db26b3SBram Moolenaar		/\%.l
946071d4279SBram Moolenaar<	When 'hlsearch' is set and you move the cursor around and make changes
947071d4279SBram Moolenaar	this will clearly show when the match is updated or not.
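
	A minimal sketch, assuming a scratch buffer with made-up contents,
	that restricts a search to lines 3 and 4 with |search()|: >
		:new
		:call setline(1, ['foo', 'foo', 'foo', 'foo', 'foo'])
		:call cursor(1, 1)
		" "\%>2l" and "\%<5l" together allow lines 3 and 4 only.
		:echo search('\%>2l\%<5lfoo', 'nW')
		" Echoes: 3
		:bwipe!
<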

						*/\%c* */\%>c* */\%<c*
\%23c	Matches in a specific column.
\%<23c	Matches before a specific column.
\%>23c	Matches after a specific column.
\%.c    Matches at the cursor column.
\%<.c   Matches before the cursor column.
\%>.c   Matches after the cursor column.
	These six can be used to match specific columns in a buffer or string.
	The "23" can be any column number.  The first column is 1.  Actually,
	the column is the byte number (thus it's not exactly right for
	multibyte characters).
	WARNING: When inserting or deleting text Vim does not automatically
	update the matches.  This means syntax highlighting quickly becomes
	wrong.  Also, when the pattern refers to the cursor position (".") and
	the cursor moves, the display is not updated for this change.  An
	update is done when using the |CTRL-L| command (the whole screen is
	updated).
	Example, to highlight the column where the cursor currently is: >
		:exe '/\%' . col(".") . 'c'
<	Alternatively use: >
		/\%.c
<	When 'hlsearch' is set and you move the cursor around and make changes
	this will clearly show when the match is updated or not.
	Example for matching a single byte in column 44: >
		/\%>43c.\%<46c
<	Note that "\%<46c" matches in column 45 when the "." matches a byte in
	column 44.
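	The byte-based counting can be tried out in a string.  A small sketch,
	assuming 'encoding' is "utf-8" so that "ä" occupies two bytes (the
	input strings are made up): >
		" In plain ASCII text byte column 3 holds "c".
		echo matchstr('abcdef', '\%3c.')
		" "ä" fills byte columns 1 and 2, thus "b" is in column 3.
		echo matchstr('äbc', '\%3c.')
<	Under that assumption the first command should print "c" and the
	second "b".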
						*/\%v* */\%>v* */\%<v*
\%23v	Matches in a specific virtual column.
\%<23v	Matches before a specific virtual column.
\%>23v	Matches after a specific virtual column.
\%.v    Matches at the current virtual column.
\%<.v   Matches before the current virtual column.
\%>.v   Matches after the current virtual column.
	These six can be used to match specific virtual columns in a buffer or
	string.  When not matching with a buffer in a window, the option
	values of the current window are used (e.g., 'tabstop').
	The "23" can be any column number.  The first column is 1.
	Note that some virtual column positions will never match, because they
	are halfway through a tab or other character that occupies more than
	one screen character.
	WARNING: When inserting or deleting text Vim does not automatically
	update highlighted matches.  This means syntax highlighting quickly
	becomes wrong.  Also, when the pattern refers to the cursor position
	(".") and the cursor moves, the display is not updated for this
	change.  An update is done when using the |CTRL-L| command (the whole
	screen is updated).
	Example, to highlight all the characters after virtual column 72: >
		/\%>72v.*
<	When 'hlsearch' is set and you move the cursor around and make changes
	this will clearly show when the match is updated or not.
	To match all characters after the current virtual column (where the
	cursor is): >
		/\%>.v.*
<	To match the text up to column 17: >
		/^.*\%17v
<	Column 17 is not included, because this is a |/zero-width| match.  To
	include the column use: >
		/^.*\%17v.
<	This command does the same thing, but also matches when there is no
	character in column 17: >
		/^.*\%<18v.
<	Note that without the "^" to anchor the match in the first column,
	this will also highlight column 17: >
		/.*\%17v
<	Column 17 is highlighted by 'hlsearch' because there is another match
	where ".*" matches zero characters.
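	Virtual columns can also be tried out in a string.  A small sketch,
	assuming 'tabstop' is 8 and 'list' is off in the current window: >
		" The <Tab> covers virtual columns 1 to 8, "x" is in column 9.
		echo matchstr("\tx", '\%9v.')
		" No character starts at virtual column 5.
		echo matchstr("\tx", '\%5v.')
<	Under those assumptions the first command should print "x" and the
	second an empty string, since column 5 is halfway through the tab.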


Character classes:
\i	identifier character (see 'isident' option)	*/\i*
\I	like "\i", but excluding digits			*/\I*
\k	keyword character (see 'iskeyword' option)	*/\k*
\K	like "\k", but excluding digits			*/\K*
\f	file name character (see 'isfname' option)	*/\f*
\F	like "\f", but excluding digits			*/\F*
\p	printable character (see 'isprint' option)	*/\p*
\P	like "\p", but excluding digits			*/\P*

NOTE: the above also work for multibyte characters.  The ones below only
match ASCII characters, as indicated by the range.

						*whitespace* *white-space*
\s	whitespace character: <Space> and <Tab>		*/\s*
\S	non-whitespace character; opposite of \s	*/\S*
\d	digit:				[0-9]		*/\d*
\D	non-digit:			[^0-9]		*/\D*
\x	hex digit:			[0-9A-Fa-f]	*/\x*
\X	non-hex digit:			[^0-9A-Fa-f]	*/\X*
\o	octal digit:			[0-7]		*/\o*
\O	non-octal digit:		[^0-7]		*/\O*
\w	word character:			[0-9A-Za-z_]	*/\w*
\W	non-word character:		[^0-9A-Za-z_]	*/\W*
\h	head of word character:		[A-Za-z_]	*/\h*
\H	non-head of word character:	[^A-Za-z_]	*/\H*
\a	alphabetic character:		[A-Za-z]	*/\a*
\A	non-alphabetic character:	[^A-Za-z]	*/\A*
\l	lowercase character:		[a-z]		*/\l*
\L	non-lowercase character:	[^a-z]		*/\L*
\u	uppercase character:		[A-Z]		*/\u*
\U	non-uppercase character:	[^A-Z]		*/\U*

	NOTE: Using the atom is faster than the [] form.

	NOTE: 'ignorecase', "\c" and "\C" are not used by character classes.
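	A short sketch of these atoms used in an expression; the input strings
	are made up for illustration: >
		" A head-of-word character followed by word characters.
		echo matchstr('  foo_bar1 = 2', '\h\w*')
		" "\u" still matches uppercase only when "\c" is used.
		echo 'a' =~ '\c\u'
<	The first command should print "foo_bar1", the second 0.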

			*/\_* *E63* */\_i* */\_I* */\_k* */\_K* */\_f* */\_F*
			*/\_p* */\_P* */\_s* */\_S* */\_d* */\_D* */\_x* */\_X*
			*/\_o* */\_O* */\_w* */\_W* */\_h* */\_H* */\_a* */\_A*
			*/\_l* */\_L* */\_u* */\_U*
\_x	Where "x" is any of the characters above: The character class with
	end-of-line added.
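	For example, to find "foo" and "bar" separated by white space that may
	include line breaks (the two words are placeholders): >
		/foo\_s\+bar
<	The same pattern with "\s" would only match when both words are in
	the same line.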
(end of character classes)

\e	matches <Esc>					*/\e*
\t	matches <Tab>					*/\t*
\r	matches <CR>					*/\r*
\b	matches <BS>					*/\b*
\n	matches an end-of-line				*/\n*
	When matching in a string instead of buffer text a literal newline
	character is matched.

~	matches the last given substitute string	*/~* */\~*

\(\)	A pattern enclosed by escaped parentheses.	*/\(* */\(\)* */\)*
	E.g., "\(^a\)" matches 'a' at the start of a line.
	*E51* *E54* *E55* *E872* *E873*

\1      Matches the same string that was matched by	*/\1* *E65*
	the first sub-expression in \( and \).
	Example: "\([a-z]\).\1" matches "ata", "ehe", "tot", etc.
\2      Like "\1", but uses second sub-expression,	*/\2*
   ...							*/\3*
\9      Like "\1", but uses ninth sub-expression.	*/\9*
	Note: The numbering of groups is done based on which "\(" comes first
	in the pattern (going left to right), NOT based on what is matched
	first.
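	A brief sketch of a back reference in an expression (the input text is
	made up): >
		" Find the first doubled letter.
		echo matchstr('hello wood', '\(\a\)\1')
<	This should print "ll": "\1" must repeat the letter that "\(\a\)"
	matched.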

\%(\)	A pattern enclosed by escaped parentheses.	*/\%(\)* */\%(* *E53*
	Just like \(\), but without counting it as a sub-expression.  This
	allows using more groups and it's a little bit faster.
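	The difference shows up in |matchlist()|.  A small sketch with made-up
	input: >
		" Only "\(bar\)" is counted, so "bar" is submatch 1.
		echo matchlist('foobar', '\%(foo\)\(bar\)')[1]
		" Here "foo" is submatch 1 and "bar" is submatch 2.
		echo matchlist('foobar', '\(foo\)\(bar\)')[1:2]
<	The first command should print "bar", the second a list holding "foo"
	and "bar".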

x	A single character, with no special meaning, matches itself

							*/\* */\\*
\x	A backslash followed by a single character, with no special meaning,
	is reserved for future expansions.

[]	(with 'nomagic': \[])		*/[]* */\[]* */\_[]* */collection*
\_[]
	A collection.  This is a sequence of characters enclosed in square
	brackets.  It matches any single character in the collection.
	Example		matches ~
	[xyz]		any 'x', 'y' or 'z'
	[a-zA-Z]$	any alphabetic character at the end of a line
	\c[a-z]$	same
	[А-яЁё]		Russian alphabet (with utf-8 and cp1251)

								*/[\n]*
	With "\_" prepended the collection also includes the end-of-line.
	The same can be done by including "\n" in the collection.  The
	end-of-line is also matched when the collection starts with "^"!  Thus
	"\_[^ab]" matches the end-of-line and any character but "a" and "b".
	This makes it Vi compatible: Without the "\_" or "\n" the collection
	does not match an end-of-line.
								*E769*
	When the ']' is not there Vim will not give an error message but
	assume no collection is used.  Useful to search for '['.  However, you
	do get E769 for internal searching.  And be aware that in a
	`:substitute` command the whole command becomes the pattern.  E.g.
	":s/[/x/" searches for "[/x" and replaces it with nothing.  It does
	not search for "[" and replace it with "x"!

								*E944* *E945*
	If the sequence begins with "^", it matches any single character NOT
	in the collection: "[^xyz]" matches anything but 'x', 'y' and 'z'.
	- If two characters in the sequence are separated by '-', this is
	  shorthand for the full list of ASCII characters between them.  E.g.,
	  "[0-9]" matches any decimal digit.  If the starting character exceeds
	  the ending character, e.g. [c-a], E944 occurs.  Non-ASCII characters
	  can be used, but the character values must not be more than 256 apart
	  in the old regexp engine.  For example, searching for
	  [\u3000-\u4000] after setting re=1 gives an E945 error.  Prepending
	  \%#=2 will fix it.
	- A character class expression is evaluated to the set of characters
	  belonging to that character class.  The following character classes
	  are supported:
		  Name	      Func	Contents ~
*[:alnum:]*	  [:alnum:]   isalnum	ASCII letters and digits
*[:alpha:]*	  [:alpha:]   isalpha  	ASCII letters
*[:blank:]*	  [:blank:]     	space and tab
*[:cntrl:]*	  [:cntrl:]   iscntrl 	ASCII control characters
*[:digit:]*	  [:digit:]     	decimal digits '0' to '9'
*[:graph:]*	  [:graph:]   isgraph	ASCII printable characters excluding
					space
*[:lower:]*	  [:lower:]   (1)	lowercase letters (all letters when
					'ignorecase' is used)
*[:print:]*	  [:print:]   (2) 	printable characters including space
*[:punct:]*	  [:punct:]   ispunct	ASCII punctuation characters
*[:space:]*	  [:space:]     	whitespace characters: space, tab, CR,
					NL, vertical tab, form feed
*[:upper:]*	  [:upper:]   (3)	uppercase letters (all letters when
					'ignorecase' is used)
*[:xdigit:]*	  [:xdigit:]    	hexadecimal digits: 0-9, a-f, A-F
*[:return:]*	  [:return:]		the <CR> character
*[:tab:]*	  [:tab:]		the <Tab> character
*[:escape:]*	  [:escape:]		the <Esc> character
*[:backspace:]*	  [:backspace:]		the <BS> character
*[:ident:]*	  [:ident:]		identifier character (same as "\i")
*[:keyword:]*	  [:keyword:]		keyword character (same as "\k")
*[:fname:]*	  [:fname:]		file name character (same as "\f")
	  The square brackets in character class expressions are additional to
	  the square brackets delimiting a collection.  For example, the
	  following is a plausible pattern for a UNIX filename:
	  "[-./[:alnum:]_~]\+".  That is, a list of at least one character,
	  each of which is either '-', '.', '/', alphabetic, numeric, '_' or
	  '~'.
	  These items only work for 8-bit characters, except [:lower:] and
	  [:upper:] also work for multibyte characters when using the new
	  regexp engine.  See |two-engines|.  In the future these items may
	  work for multibyte characters.  For now, to get all "alpha"
	  characters you can use: [[:lower:][:upper:]].

	  The "Func" column shows what library function is used.  The
	  implementation depends on the system.  Otherwise:
	  (1) Uses islower() for ASCII and Vim builtin rules for other
	  characters.
	  (2) Uses Vim builtin rules
	  (3) As with (1) but using isupper()
							*/[[=* *[==]*
	- An equivalence class.  This means that characters are matched that
	  have almost the same meaning, e.g., when ignoring accents.  This
	  only works for Unicode, latin1 and latin9.  The form is:
		[=a=]
							*/[[.* *[..]*
	- A collation element.  This currently simply accepts a single
	  character in the form:
		[.a.]
							  */\]*
	- To include a literal ']', '^', '-' or '\' in the collection, put a
	  backslash before it: "[xyz\]]", "[\^xyz]", "[xy\-z]" and "[xyz\\]".
	  (Note: POSIX does not support the use of a backslash this way).  For
	  ']' you can also make it the first character (following a possible
	  "^"):  "[]xyz]" or "[^]xyz]".
	  For '-' you can also make it the first or last character: "[-xyz]",
	  "[^-xyz]" or "[xyz-]".  For '\' you can also let it be followed by
	  any character that's not in "^]-\bdertnoUux".  "[\xyz]" matches '\',
	  'x', 'y' and 'z'.  It's better to use "\\" though, future expansions
	  may use other characters after '\'.
	- Omitting the trailing ] is not considered an error. "[]" works like
	  "[]]", it matches the ']' character.
	- The following translations are accepted when the 'l' flag is not
	  included in 'cpoptions':
		\e	<Esc>
		\t	<Tab>
		\r	<CR>	(NOT end-of-line!)
		\b	<BS>
		\n	line break, see above |/[\n]|
		\d123	decimal number of character
		\o40	octal number of character up to 0o377
		\x20	hexadecimal number of character up to 0xff
		\u20AC	hex. number of multibyte character up to 0xffff
		\U1234	hex. number of multibyte character up to 0xffffffff
	  NOTE: The other backslash codes mentioned above do not work inside
	  []!
	- Matching with a collection can be slow, because each character in
	  the text has to be compared with each character in the collection.
	  Use one of the other atoms above when possible.  Example: "\d" is
	  much faster than "[0-9]" and matches the same characters.  However,
	  the new |NFA| regexp engine deals with this better than the old one.
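	A few collections used in an expression, as a sketch with made-up
	input strings: >
		" Replace every '-' and '_' by a space.
		echo substitute('a-b_c', '[-_]', ' ', 'g')
		" A character class expression inside a collection.
		echo matchstr('  abc123!', '[[:alnum:]]\+')
		" A negated collection: everything up to the first comma.
		echo matchstr('one,two', '[^,]*')
<	These should print "a b c", "abc123" and "one".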

						*/\%[]* *E69* *E70* *E369*
\%[]	A sequence of optionally matched atoms.  This always matches.
	It matches as much of the list of atoms it contains as possible.  Thus
	it stops at the first atom that doesn't match.  For example: >
		/r\%[ead]
<	matches "r", "re", "rea" or "read".  The longest that matches is used.
	To match the Ex command "function", where "fu" is required and
	"nction" is optional, this would work: >
		/\<fu\%[nction]\>
<	The end-of-word atom "\>" is used to avoid matching "fu" in "full".
	It gets more complicated when the atoms are not ordinary characters.
	You don't often have to use it, but it is possible.  Example: >
		/\<r\%[[eo]ad]\>
<	Matches the words "r", "re", "ro", "rea", "roa", "read" and "road".
	There can be no \(\), \%(\) or \z(\) items inside the [] and \%[] does
	not nest.
	To include a "[" use "[[]" and for "]" use "[]]", e.g.: >
		/index\%[[[]0[]]]
<	matches "index", "index[", "index[0" and "index[0]".
	{not available when compiled without the |+syntax| feature}
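	A small sketch to try this in an expression, assuming Vim was compiled
	with the |+syntax| feature: >
		" The abbreviation "fun" is accepted.
		echo matchstr('fun', '\<fu\%[nction]\>')
		" "full" is not: after "fu" the "\>" does not match.
		echo matchstr('full', '\<fu\%[nction]\>')
<	The first command should print "fun", the second an empty string.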

				*/\%d* */\%x* */\%o* */\%u* */\%U* *E678*

\%d123	Matches the character specified with a decimal number.  Must be
	followed by a non-digit.
\%o40	Matches the character specified with an octal number up to 0o377.
	Numbers below 0o40 must be followed by a non-octal digit or a
	non-digit.
\%x2a	Matches the character specified with up to two hexadecimal characters.
\%u20AC	Matches the character specified with up to four hexadecimal
	characters.
\%U1234abcd	Matches the character specified with up to eight hexadecimal
	characters, up to 0x7fffffff.
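	A brief sketch, using ASCII values so that the result does not depend
	on 'encoding' (the input strings are made up): >
		" 0x2a is "*"; no backslash is needed to match it literally.
		echo matchstr('a*b', '\%x2a')
		" Decimal 32 is a space.
		echo split('one two', '\%d32')
<	The first command should print "*", the second a list holding "one"
	and "two".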

==============================================================================
7. Ignoring case in a pattern					*/ignorecase*

If the 'ignorecase' option is on, the case of normal letters is ignored.
'smartcase' can be set to ignore case when the pattern contains lowercase
letters only.
							*/\c* */\C*
When "\c" appears anywhere in the pattern, the whole pattern is handled as if
'ignorecase' is on.  The actual values of 'ignorecase' and 'smartcase' are
ignored.  "\C" does the opposite: Force matching case for the whole pattern.
{only Vim supports \c and \C}
Note that 'ignorecase', "\c" and "\C" are not used for the character classes.

Examples:
      pattern	'ignorecase'  'smartcase'	matches ~
	foo	  off		-		foo
	foo	  on		-		foo Foo FOO
	Foo	  on		off		foo Foo FOO
	Foo	  on		on		    Foo
	\cfoo	  -		-		foo Foo FOO
	foo\C	  -		-		foo
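
The "\c" and "\C" rows of the table can be checked in an expression.  A small
sketch with a made-up input string: >
	" Matches, whatever 'ignorecase' and 'smartcase' are set to.
	echo 'FOO' =~ '\cfoo'
	" Does not match, whatever the option values are.
	echo 'FOO' =~ 'foo\C'
The first command should print 1, the second 0.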

Technical detail:				*NL-used-for-Nul*
<Nul> characters in the file are stored as <NL> in memory.  In the display
they are shown as "^@".  The translation is done when reading and writing
files.  To match a <Nul> with a search pattern you can just enter CTRL-@ or
"CTRL-V 000".  This is probably just what you expect.  Internally the
character is replaced with a <NL> in the search pattern.  What is unusual is
that typing CTRL-V CTRL-J also inserts a <NL>, thus also searches for a <Nul>
in the file.

						*CR-used-for-NL*
When 'fileformat' is "mac", <NL> characters in the file are stored as <CR>
characters internally.  In the text they are shown as "^J".  Otherwise this
works similarly to the usage of <NL> for a <Nul>.

When working with expression evaluation, a <NL> character in the pattern
matches a <NL> in the string.  The use of "\n" (backslash n) to match a <NL>
doesn't work there, it only works to match text in the buffer.

				*pattern-multi-byte* *pattern-multibyte*
Patterns will also work with multibyte characters, mostly as you would
expect.  But invalid bytes may cause trouble: a pattern with an invalid byte
will probably never match.

==============================================================================
8. Composing characters					*patterns-composing*

							*/\Z*
When "\Z" appears anywhere in the pattern, all composing characters are
ignored.  Thus only the base characters need to match, the composing
characters may be different and the number of composing characters may differ.
Only relevant when 'encoding' is "utf-8".
Exception: If the pattern starts with one or more composing characters, these
must match.
							*/\%C*
13078df5acfdSBram MoolenaarUse "\%C" to skip any composing characters.  For example, the pattern "a" does
13088df5acfdSBram Moolenaarnot match in "càt" (where the a has the composing character 0x0300), but
13098df5acfdSBram Moolenaar"a\%C" does.  Note that this does not match "cát" (where the á is character
0xe1, it does not have a composing character).  It does match "cat" (where
the a is just an a).

When a composing character appears at the start of the pattern or after an
item that doesn't include the composing character, a match is found at any
character that includes this composing character.

When using a dot and a composing character, this works the same as the
composing character by itself, except that it doesn't matter what comes before
this.

The order of composing characters does not matter.  Also, the text may have
more composing characters than the pattern and still match.  But all
composing characters in the pattern must be found in the text.

Suppose B is a base character and x and y are composing characters:
	pattern		text		match ~
	Bxy		Bxy		yes (perfect match)
	Bxy		Byx		yes (order ignored)
	Bxy		By		no (x missing)
	Bxy		Bx		no (y missing)
	Bx		Bx		yes (perfect match)
	Bx		By		no (x missing)
	Bx		Bxy		yes (extra y ignored)
	Bx		Byx		yes (extra y ignored)

==============================================================================
9. Compare with Perl patterns				*perl-patterns*

Vim's regexes are most similar to Perl's, in terms of what you can do.  The
difference between them is mostly just notation;  here's a summary of where
they differ:

Capability			in Vimspeak	in Perlspeak ~
----------------------------------------------------------------
force case insensitivity	\c		(?i)
force case sensitivity		\C		(?-i)
backref-less grouping		\%(atom\)	(?:atom)
conservative quantifiers	\{-n,m}		*?, +?, ??, {}?
0-width match			atom\@=		(?=atom)
0-width non-match		atom\@!		(?!atom)
0-width preceding match		atom\@<=	(?<=atom)
0-width preceding non-match	atom\@<!	(?<!atom)
match without retry		atom\@>		(?>atom)
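
A few rows of the table tried out in an expression, as a sketch with made-up
input; the Perl form is given in each comment: >
	" Non-greedy match, Perl: <.*?>
	echo matchstr('<a><b>', '<.\{-}>')
	" Lookahead, Perl: foo(?=bar)
	echo matchstr('foobar', 'foo\%(bar\)\@=')
	" Lookbehind, Perl: (?<=foo)bar
	echo matchstr('foobar', '\%(foo\)\@<=bar')
These should print "<a>", "foo" and "bar".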

Vim and Perl handle newline characters inside a string a bit differently:

In Perl, ^ and $ only match at the very beginning and end of the text,
by default, but you can set the 'm' flag, which lets them match at
embedded newlines as well.  You can also set the 's' flag, which causes
a . to match newlines as well.  (Both these flags can be changed inside
a pattern using the same syntax used for the i flag above, BTW.)

On the other hand, Vim's ^ and $ always match at embedded newlines, and
you get two separate atoms, \%^ and \%$, which only match at the very
start and end of the text, respectively.  Vim solves the second problem
by giving you the \_ "modifier":  put it in front of a . or a character
class, and they will match newlines as well.

Finally, these constructs are unique to Perl:
- execution of arbitrary code in the regex:  (?{perl code})
- conditional expressions:  (?(condition)true-expr|false-expr)

...and these are unique to Vim:
- changing the magic-ness of a pattern:  \v \V \m \M
   (very useful for avoiding backslashitis)
- sequence of optionally matching atoms:  \%[atoms]
- \& (which is to \| what "and" is to "or";  it forces several branches
   to match at one spot)
- matching lines/columns by number:  \%5l \%5c \%5v
- setting the start and end of the match:  \zs \ze
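
Two of the Vim-only items tried out in an expression, as a sketch with made-up
input: >
	" "\zs" sets the start of the match after "foo".
	echo matchstr('foobar', 'foo\zsbar')
	" "\&" requires both branches to match at the same position.
	echo 'Bob and Peter' =~ '.*Peter\&.*Bob'
These should print "bar" and 1.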
1381071d4279SBram Moolenaar
1382071d4279SBram Moolenaar==============================================================================
1383362e1a30SBram Moolenaar10. Highlighting matches				*match-highlight*
1384071d4279SBram Moolenaar
1385071d4279SBram Moolenaar							*:mat* *:match*
1386071d4279SBram Moolenaar:mat[ch] {group} /{pattern}/
1387071d4279SBram Moolenaar		Define a pattern to highlight in the current window.  It will
1388071d4279SBram Moolenaar		be highlighted with {group}.  Example: >
1389071d4279SBram Moolenaar			:highlight MyGroup ctermbg=green guibg=green
1390071d4279SBram Moolenaar			:match MyGroup /TODO/
1391071d4279SBram Moolenaar<		Instead of // any character can be used to mark the start and
1392071d4279SBram Moolenaar		end of the {pattern}.  Watch out for using special characters,
1393071d4279SBram Moolenaar		such as '"' and '|'.
1394fd2ac767SBram Moolenaar
1395071d4279SBram Moolenaar		{group} must exist at the moment this command is executed.
1396fd2ac767SBram Moolenaar
		The {group} highlighting still applies when a character is
		to be highlighted for 'hlsearch', as the highlighting for
		matches is given higher priority than that of 'hlsearch'.
		Syntax highlighting (see 'syntax') is also overruled by
		matches.

		Note that highlighting the last used search pattern with
		'hlsearch' is used in all windows, while the pattern defined
		with ":match" only exists in the current window.  It is kept
		when switching to another buffer.

		'ignorecase' does not apply; use |/\c| in the pattern to
		ignore case.  Otherwise case is not ignored.

		'redrawtime' defines the maximum time spent searching for
		pattern matches.

		When matching end-of-line and Vim redraws only part of the
		display you may get unexpected results.  That is because Vim
		looks for a match in the line where redrawing starts.

		Also see |matcharg()| and |getmatches()|.  The former returns
		the highlight group and pattern of a previous |:match|
		command.  The latter returns a list with highlight groups and
		patterns defined by both |matchadd()| and |:match|.

		Highlighting with |:match| is limited to three matches
		(|:match|, |:2match| and |:3match|).  |matchadd()| does not
		have this limitation and in addition makes it possible to
		prioritize matches.

		Another example, which highlights all characters in virtual
		column 72 and more: >
			:highlight rightMargin term=bold ctermfg=blue guifg=blue
			:match rightMargin /.\%>72v/
<		To highlight all characters that are in virtual column 7: >
			:highlight col8 ctermbg=grey guibg=grey
			:match col8 /\%<8v.\%>7v/
<		Note the use of two items to also match a character that
		occupies more than one virtual column, such as a TAB.
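
		As a minimal sketch of the functions mentioned above (the
		group name "MyGroup" and the patterns are only
		illustrative, and the group is assumed to have been
		defined with |:highlight| already): >
			:match MyGroup /TODO/
			:echo matcharg(1)
			:let id = matchadd('MyGroup', 'FIXME', 20)
			:echo getmatches()
			:call matchdelete(id)
<		Here matcharg(1) should give the group and pattern set by
		":match".  The third argument of matchadd() is the
		priority; 20 is an arbitrary value chosen for this
		example.  The returned ID is what |matchdelete()| needs to
		remove that match again.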

:mat[ch]
:mat[ch] none
		Clear a previously defined match pattern.


:2mat[ch] {group} /{pattern}/					*:2match*
:2mat[ch]
:2mat[ch] none
:3mat[ch] {group} /{pattern}/					*:3match*
:3mat[ch]
:3mat[ch] none
		Just like |:match| above, but set a separate match.  Thus
		there can be three matches active at the same time.  The match
		with the lowest number has priority if several match at the
		same position.
		The ":3match" command is used by the |matchparen| plugin.  It
		is suggested that you use ":match" for manual matching and
		":2match" for another plugin.
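
		A sketch of two matches active at once; the group names
		"MyTodo" and "MyNote" are made up for this example: >
			:highlight MyTodo ctermbg=yellow guibg=yellow
			:highlight MyNote ctermbg=cyan guibg=cyan
			:match MyTodo /TODO/
			:2match MyNote /TODO\|NOTE/
<		Both patterns match "TODO".  Following the rule above it
		should be shown with MyTodo, since ":match" has the lowest
		number.  ":2match none" removes only the second match.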

==============================================================================
11. Fuzzy matching					*fuzzy-match*

Fuzzy matching refers to matching strings using a non-exact search string.
Fuzzy matching will match a string if all the characters in the search string
are present anywhere in the string in the same order.  Case is ignored.  In a
matched string, other characters can be present between two consecutive
characters in the search string.  If the search string has multiple words, then
each word is matched separately.  So the words in the search string can be
present in any order in a string.

Fuzzy matching assigns a score for each matched string based on the following
criteria:
    - The number of sequentially matching characters.
    - The number of characters (distance) between two consecutive matching
      characters.
    - Matches at the beginning of a word.
    - Matches at a camel case character (e.g. Case in CamelCase).
    - Matches after a path separator or a hyphen.
    - The number of unmatched characters in a string.
The matching string with the highest score is returned first.

For example, when you search for the "get pat" string using fuzzy matching, it
will match the strings "GetPattern", "PatternGet", "getPattern", "patGetter",
"getSomePattern", "MatchpatternGet" etc.

The functions |matchfuzzy()| and |matchfuzzypos()| can be used to fuzzy search
a string in a List of strings.  The matchfuzzy() function returns a List of
matching strings.  The matchfuzzypos() function returns the List of matches,
the matching positions and the fuzzy match scores.
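
A minimal sketch using some of the strings from the example above; the
variable names and the extra string "setLine" are only illustrative: >
	:let names = ['GetPattern', 'PatternGet', 'patGetter', 'setLine']
	:echo matchfuzzy(names, 'get pat')
	:let [matches, positions, scores] = matchfuzzypos(names, 'get pat')
	:echo matches
	:echo scores
<
"setLine" is expected to be left out, as it contains neither "g" nor "p".
The three Lists returned by matchfuzzypos() are parallel: the positions and
score at a given index belong to the match at that index.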

The "f" flag of `:vimgrep` enables fuzzy matching.
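
For instance, assuming the current directory contains C source files (the
search string and the file pattern are only illustrative): >
	:vimgrep /getpat/f *.c
	:copen
<
With the "f" flag the text between the slashes should be taken as a fuzzy
search string rather than as a regular expression; see |:vimgrep| for the
details.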



 vim:tw=78:ts=8:noet:ft=help:norl:
