Supported regular expression features

The listings are taken from the Perl regex docs. Regular expressions are applied via the regex metafunction.

Currently supported or planned features

Modifiers

| Modifier | Notes | Status |
| --- | --- | --- |
| `i` | Do case-insensitive pattern matching. For example, "A" will match "a" under /i. | <span style="color:green">Supported</span> |
| `m` | Treat the string being matched against as multiple lines. That is, change ^ and $ from matching the start of the string's first line and the end of its last line to matching the start and end of each line within the string. | <span style="color:green">Supported</span> |
| `s` | Treat the string as single line. That is, change . to match any character whatsoever, even a newline, which normally it would not match. | <span style="color:green">Supported</span> |
| `x` and `xx` | Extend your pattern's legibility by permitting whitespace and comments. For details see: Perl regex docs: /x and /xx. | <span style="color:green">Supported</span> |
| `n` | Prevent the grouping metacharacters ( and ) from capturing. This modifier will stop $1, $2, etc. from being filled in. | <span style="color:green">Supported</span> |
| `c` | Keep the current position during repeated matching. | <span style="color:green">Supported</span> |
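
As a rough illustration of what the `i`, `m`, `s`, and `x` modifiers change, the sketch below uses Python's `re` module as a stand-in: its `re.I`, `re.M`, `re.S`, and `re.X` flags behave like the Perl modifiers of the same letters for these simple cases. This is not the cppfront regex metafunction API, and `modifier_demo` is a hypothetical helper name.

```python
import re

TEXT = "Alpha\nbeta"

def modifier_demo(text=TEXT):
    """Return what each modifier-like flag changes for a small two-line input."""
    return {
        # i: "alpha" matches "Alpha" only when case is ignored.
        "i_off": bool(re.search(r"alpha", text)),
        "i_on": bool(re.search(r"alpha", text, re.I)),
        # m: ^ and $ match at every line, not just around the whole string.
        "m_off": re.findall(r"^\w+$", text),
        "m_on": re.findall(r"^\w+$", text, re.M),
        # s: . also matches the newline between the two lines.
        "s_off": bool(re.search(r"Alpha.beta", text)),
        "s_on": bool(re.search(r"Alpha.beta", text, re.S)),
        # x: whitespace and # comments inside the pattern are ignored.
        "x_on": bool(re.search(r"""
            Alpha   # first line
            \n      # the line break
            beta    # second line
        """, text, re.X)),
    }
```

Each `_off`/`_on` pair runs the same pattern on the same input, so any difference in the result comes from the modifier alone.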

Escape sequences (Complete)

| Escape sequence | Notes | Status |
| --- | --- | --- |
| `\t` | Tab (HT, TAB) | <span style="color:green">Supported</span> |
| `\n` | Newline (LF, NL) | <span style="color:green">Supported</span> |
| `\r` | Return (CR) | <span style="color:green">Supported</span> |
| `\f` | Form feed (FF) | <span style="color:green">Supported</span> |
| `\a` | Alarm (bell) (BEL) | <span style="color:green">Supported</span> |
| `\e` | Escape (think troff) (ESC) | <span style="color:green">Supported</span> |
| `\x{}`, `\x00` | Character whose ordinal is the given hexadecimal number | <span style="color:green">Supported</span> |
| `\o{}`, `\000` | Character whose ordinal is the given octal number | <span style="color:green">Supported</span> |

Quantifiers (Complete)

| Quantifier | Notes | Status |
| --- | --- | --- |
| `*` | Match 0 or more times | <span style="color:green">Supported</span> |
| `+` | Match 1 or more times | <span style="color:green">Supported</span> |
| `?` | Match 1 or 0 times | <span style="color:green">Supported</span> |
| `{n}` | Match exactly n times | <span style="color:green">Supported</span> |
| `{n,}` | Match at least n times | <span style="color:green">Supported</span> |
| `{,n}` | Match at most n times | <span style="color:green">Supported</span> |
| `{n,m}` | Match at least n but not more than m times | <span style="color:green">Supported</span> |
| `*?` | Match 0 or more times, not greedily | <span style="color:green">Supported</span> |
| `+?` | Match 1 or more times, not greedily | <span style="color:green">Supported</span> |
| `??` | Match 0 or 1 time, not greedily | <span style="color:green">Supported</span> |
| `{n}?` | Match exactly n times, not greedily (redundant) | <span style="color:green">Supported</span> |
| `{n,}?` | Match at least n times, not greedily | <span style="color:green">Supported</span> |
| `{,n}?` | Match at most n times, not greedily | <span style="color:green">Supported</span> |
| `{n,m}?` | Match at least n but not more than m times, not greedily | <span style="color:green">Supported</span> |
| `*+` | Match 0 or more times and give nothing back | <span style="color:green">Supported</span> |
| `++` | Match 1 or more times and give nothing back | <span style="color:green">Supported</span> |
| `?+` | Match 0 or 1 time and give nothing back | <span style="color:green">Supported</span> |
| `{n}+` | Match exactly n times and give nothing back (redundant) | <span style="color:green">Supported</span> |
| `{n,}+` | Match at least n times and give nothing back | <span style="color:green">Supported</span> |
| `{,n}+` | Match at most n times and give nothing back | <span style="color:green">Supported</span> |
| `{n,m}+` | Match at least n but not more than m times and give nothing back | <span style="color:green">Supported</span> |
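
The difference between the greedy and the non-greedy rows is easiest to see on a concrete input. The sketch below again uses Python's `re` module as a stand-in for the Perl semantics (it is not cppfront code, and `first_match` is a hypothetical helper). The possessive "give nothing back" forms are left out because Python only accepts them from version 3.11 on.

```python
import re

def first_match(pattern, text):
    """Return the text of the first match, or None if nothing matches."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

# Greedy: .+ takes as much as possible, so the match runs to the last ">".
greedy = first_match(r"<.+>", "<a><b>")
# Non-greedy: .+? takes as little as possible, so the match stops at the first ">".
lazy = first_match(r"<.+?>", "<a><b>")
# Bounded repetition prefers the upper bound when greedy ...
bounded_greedy = first_match(r"a{2,3}", "aaaa")
# ... and the lower bound when not greedy.
bounded_lazy = first_match(r"a{2,3}?", "aaaa")
```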

Character Classes and other Special Escapes (Complete)

| Feature | Notes | Status |
| --- | --- | --- |
| `[...]` | Match a character according to the rules of the bracketed character class defined by the "...". Example: `[a-z]` matches "a" or "b" or "c" ... or "z" | <span style="color:green">Supported</span> |
| `[[:...:]]` | Match a character according to the rules of the POSIX character class "..." within the outer bracketed character class. Example: `[[:upper:]]` matches any uppercase character. | <span style="color:green">Supported</span> |
| `\g1` or `\g{-1}` | Backreference to a specific or previous group. The number may be negative indicating a relative previous group and may optionally be wrapped in curly brackets for safer parsing. | <span style="color:green">Supported</span> |
| `\g{name}` | Named backreference | <span style="color:green">Supported</span> |
| `\k<name>` | Named backreference | <span style="color:green">Supported</span> |
| `\k'name'` | Named backreference | <span style="color:green">Supported</span> |
| `\k{name}` | Named backreference | <span style="color:green">Supported</span> |
| `\w` | Match a "word" character (alphanumeric plus "_", plus other connector punctuation chars plus Unicode marks) | <span style="color:green">Supported</span> |
| `\W` | Match a non-"word" character | <span style="color:green">Supported</span> |
| `\s` | Match a whitespace character | <span style="color:green">Supported</span> |
| `\S` | Match a non-whitespace character | <span style="color:green">Supported</span> |
| `\d` | Match a decimal digit character | <span style="color:green">Supported</span> |
| `\D` | Match a non-digit character | <span style="color:green">Supported</span> |
| `\v` | Vertical whitespace | <span style="color:green">Supported</span> |
| `\V` | Not vertical whitespace | <span style="color:green">Supported</span> |
| `\h` | Horizontal whitespace | <span style="color:green">Supported</span> |
| `\H` | Not horizontal whitespace | <span style="color:green">Supported</span> |
| `\1` | Backreference to a specific capture group or buffer. '1' may actually be any positive integer. | <span style="color:green">Supported</span> |
| `\N` | Any character but `\n`. Not affected by /s modifier | <span style="color:green">Supported</span> |
| `\K` | Keep the stuff left of the `\K`, don't include it in `$&` | <span style="color:green">Supported</span> |
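
A short sketch of a bracketed class, the `\w`, `\s`, `\d` escapes, and a numbered backreference working together. It uses Python's `re` module as a stand-in for the shared Perl semantics and is not cppfront code. Python's dialect does not accept every form in the table (for example `\g1` inside a pattern or POSIX classes such as `[[:upper:]]`), so only the common subset is shown. The function names are hypothetical.

```python
import re

def doubled_words(text):
    """Find words that are immediately repeated, via a backreference to group 1."""
    # \b keeps the match on whole words; \1 must repeat exactly what (\w+) captured.
    return re.findall(r"\b(\w+)\s+\1\b", text)

def lowercase_then_digit(text):
    """Find a lowercase ASCII letter followed directly by a decimal digit."""
    return re.findall(r"[a-z]\d", text)
```

For example, `doubled_words("this is is a test test ok")` picks out `is` and `test`, while the `is` inside `this` is rejected by the leading `\b`.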

Assertions

| Assertion | Notes | Status |
| --- | --- | --- |
| `\b` | Match a `\w\W` or `\W\w` boundary | <span style="color:green">Supported</span> |
| `\B` | Match except at a `\w\W` or `\W\w` boundary | <span style="color:green">Supported</span> |
| `\A` | Match only at beginning of string | <span style="color:green">Supported</span> |
| `\Z` | Match only at end of string, or before newline at the end | <span style="color:green">Supported</span> |
| `\z` | Match only at end of string | <span style="color:green">Supported</span> |
| `\G` | Match only at pos() (e.g. at the end-of-match position of prior m//g) | <span style="color:green">Supported</span> |
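
The boundary assertions consume no characters; they only constrain where a match may start or end. The sketch below shows `\b`, `\B`, and `\A` using Python's `re` module as a stand-in (not cppfront code; `match_starts` is a hypothetical helper). `\Z` and `\z` are left out because Python's `\Z` does not have the same meaning as Perl's.

```python
import re

def match_starts(pattern, text):
    """Return the start offset of every non-overlapping match."""
    return [m.start() for m in re.finditer(pattern, text)]

TEXT = "cat concat cats"

# \b on both sides: only the standalone word "cat" at offset 0 qualifies.
whole_word = match_starts(r"\bcat\b", TEXT)
# \B in front: only the "cat" inside "concat" is preceded by a word character.
inside_word = match_starts(r"\Bcat", TEXT)
# \A anchors to the very beginning of the string.
at_string_start = match_starts(r"\Acat", TEXT)
```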

Capture groups (Complete)

| Feature | Status |
| --- | --- |
| `(...)` | <span style="color:green">Supported</span> |

Quoting metacharacters (Complete)

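| Feature | Status |
| --- | --- |
| Backslash quoting of metacharacters, for `^.[]$()*{}?+` | <span style="color:green">Supported</span> |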

Extended Patterns

| Extended pattern | Notes | Status |
| --- | --- | --- |
| `(?<NAME>pattern)` | Named capture group | <span style="color:green">Supported</span> |
| `(?#text)` | Comments | <span style="color:green">Supported</span> |
| `(?adlupimnsx-imnsx)` | Modification for surrounding context | <span style="color:green">Supported</span> |
| `(?^alupimnsx)` | Modification for surrounding context | <span style="color:green">Supported</span> |
| `(?:pattern)` | Clustering, does not generate a group index. | <span style="color:green">Supported</span> |
| `(?adluimnsx-imnsx:pattern)` | Clustering, does not generate a group index and modifications for the cluster. | <span style="color:green">Supported</span> |
| `(?^aluimnsx:pattern)` | Clustering, does not generate a group index and modifications for the cluster. | <span style="color:green">Supported</span> |
| <code>(?&#124;pattern)</code> | Branch reset | <span style="color:green">Supported</span> |
| `(?'NAME'pattern)` | Named capture group | <span style="color:green">Supported</span> |
| <code>(?(condition)yes-pattern&#124;no-pattern)</code> | Conditional patterns. | <span style="color:gray">Planned</span> |
| `(?(condition)yes-pattern)` | Conditional patterns. | <span style="color:gray">Planned</span> |
| `(?>pattern)` | Atomic patterns. (Disable backtrack.) | <span style="color:green">Supported</span> |
| `(*atomic:pattern)` | Atomic patterns. (Disable backtrack.) | <span style="color:green">Supported</span> |
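
Clustering, inline comments, and cluster-scoped modifiers can be sketched as follows. Python's `re` module is used as a stand-in because it accepts these three forms with the same spelling as Perl; this is not cppfront code. Named groups are left out since Python spells them `(?P<NAME>pattern)` rather than `(?<NAME>pattern)`.

```python
import re

# (?:...) clusters without creating a group index; (?#...) is an inline comment.
# Only the parenthesized "c" is captured, so there is exactly one group.
m = re.search(r"(?:ab)+(?#trailing letter)(c)", "ababc")
cluster_groups = m.groups()

# (?i:...) applies the case-insensitive modifier to the cluster only,
# so "hello" may be any case while " World" must match exactly.
scoped_hit = re.search(r"(?i:hello) World", "HELLO World") is not None
scoped_miss = re.search(r"(?i:hello) World", "HELLO world") is not None
```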

Lookaround Assertions

| Lookaround assertion | Notes | Status |
| --- | --- | --- |
| `(?=pattern)` | Positive look ahead. | <span style="color:green">Supported</span> |
| `(*pla:pattern)` | Positive look ahead. | <span style="color:green">Supported</span> |
| `(*positive_lookahead:pattern)` | Positive look ahead. | <span style="color:green">Supported</span> |
| `(?!pattern)` | Negative look ahead. | <span style="color:green">Supported</span> |
| `(*nla:pattern)` | Negative look ahead. | <span style="color:green">Supported</span> |
| `(*negative_lookahead:pattern)` | Negative look ahead. | <span style="color:green">Supported</span> |
| `(?<=pattern)` | Positive look behind. | <span style="color:green">Supported</span> |
| `(*plb:pattern)` | Positive look behind. | <span style="color:green">Supported</span> |
| `(*positive_lookbehind:pattern)` | Positive look behind. | <span style="color:green">Supported</span> |
| `(?<!pattern)` | Negative look behind. | <span style="color:green">Supported</span> |
| `(*nlb:pattern)` | Negative look behind. | <span style="color:green">Supported</span> |
| `(*negative_lookbehind:pattern)` | Negative look behind. | <span style="color:green">Supported</span> |
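
The four short lookaround forms can be compared side by side. The sketch uses Python's `re` module as a stand-in for the Perl semantics (not cppfront code); the alphabetic spellings such as `(*pla:pattern)` are Perl-specific and not accepted by Python, so only the `(?=`, `(?!`, `(?<=`, and `(?<!` forms appear. The sample strings are made up for illustration.

```python
import re

PRICES = "10 USD, 20 EUR, 30 USD"

# Positive look ahead: digits that are followed by " USD" (which is not consumed).
followed_by_usd = re.findall(r"\d+(?= USD)", PRICES)
# Negative look ahead: whole numbers that are not followed by " USD".
# The \b...\b pair stops the engine from backtracking to a partial number like "1".
not_followed_by_usd = re.findall(r"\b\d+\b(?! USD)", PRICES)

AMOUNTS = "$5 and 7 and $12"

# Positive look behind: digits directly preceded by a dollar sign.
after_dollar = re.findall(r"(?<=\$)\d+", AMOUNTS)
# Negative look behind: numbers not preceded by a dollar sign.
# \b prevents a match from starting in the middle of "12".
not_after_dollar = re.findall(r"(?<!\$)\b\d+", AMOUNTS)
```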

Special Backtracking Control Verbs

| Backtracking control verb | Notes | Status |
| --- | --- | --- |
| `(*SKIP)` `(*SKIP:NAME)` | Start next search here. | <span style="color:gray">Planned</span> |
| `(*PRUNE)` `(*PRUNE:NAME)` | No backtracking over this point. | <span style="color:gray">Planned</span> |
| `(*MARK:NAME)` `(*:NAME)` | Place a named mark. | <span style="color:gray">Planned</span> |
| `(*THEN)` `(*THEN:NAME)` | Like PRUNE. | <span style="color:gray">Planned</span> |
| `(*COMMIT)` `(*COMMIT:arg)` | Stop searching. | <span style="color:gray">Planned</span> |
| `(*FAIL)` `(*F)` `(*FAIL:arg)` | Fail the pattern/branch. | <span style="color:gray">Planned</span> |
| `(*ACCEPT)` `(*ACCEPT:arg)` | Accept the pattern/subpattern. | <span style="color:gray">Planned</span> |

Not planned (mainly because of Unicode or Perl specifics)

Modifiers

| Modifier | Notes | Status |
| --- | --- | --- |
| `p` | Preserve the string matched such that ${^PREMATCH}, ${^MATCH}, and ${^POSTMATCH} are available for use after matching. | <span style="color:darkred">Not planned</span> |
| `a`, `d`, `l`, and `u` | These modifiers affect which character-set rules (Unicode, etc.) are used, as described below in "Character set modifiers". | <span style="color:darkred">Not planned</span> |
| `g` | globally match the pattern repeatedly in the string | <span style="color:darkred">Not planned</span> |
| `e` | evaluate the right-hand side as an expression | <span style="color:darkred">Not planned</span> |
| `ee` | evaluate the right side as a string then eval the result | <span style="color:darkred">Not planned</span> |
| `o` | pretend to optimize your code, but actually introduce bugs | <span style="color:darkred">Not planned</span> |
| `r` | perform non-destructive substitution and return the new value | <span style="color:darkred">Not planned</span> |

Escape sequences

| Escape sequence | Notes | Status |
| --- | --- | --- |
| `\cK` | control char (example: VT) | <span style="color:darkred">Not planned</span> |
| `\N{name}` | named Unicode character or character sequence | <span style="color:darkred">Not planned</span> |
| `\N{U+263D}` | Unicode character (example: FIRST QUARTER MOON) | <span style="color:darkred">Not planned</span> |
| `\l` | lowercase next char (think vi) | <span style="color:darkred">Not planned</span> |
| `\u` | uppercase next char (think vi) | <span style="color:darkred">Not planned</span> |
| `\L` | lowercase until `\E` (think vi) | <span style="color:darkred">Not planned</span> |
| `\U` | uppercase until `\E` (think vi) | <span style="color:darkred">Not planned</span> |
| `\Q` | quote (disable) pattern metacharacters until `\E` | <span style="color:darkred">Not planned</span> |
| `\E` | end either case modification or quoted section, think vi | <span style="color:darkred">Not planned</span> |

Character Classes and other Special Escapes

| Character class or escape | Notes | Status |
| --- | --- | --- |
| `(?[...])` | Extended bracketed character class | <span style="color:darkred">Not planned</span> |
| `\pP` | Match P, named property. Use `\p{Prop}` for longer names | <span style="color:darkred">Not planned</span> |
| `\PP` | Match non-P | <span style="color:darkred">Not planned</span> |
| `\X` | Match Unicode "eXtended grapheme cluster" | <span style="color:darkred">Not planned</span> |
| `\R` | Linebreak | <span style="color:darkred">Not planned</span> |

Assertions

| Assertion | Notes | Status |
| --- | --- | --- |
| `\b{}` | Match at Unicode boundary of specified type | <span style="color:darkred">Not planned</span> |
| `\B{}` | Match where corresponding `\b{}` doesn't match | <span style="color:darkred">Not planned</span> |

Extended Patterns

| Extended pattern | Notes | Status |
| --- | --- | --- |
| `(?{ code })` | Perl code execution. | <span style="color:darkred">Not planned</span> |
| `(*{ code })` | Perl code execution. | <span style="color:darkred">Not planned</span> |
| `(??{ code })` | Perl code execution. | <span style="color:darkred">Not planned</span> |
| `(?PARNO)` `(?-PARNO)` `(?+PARNO)` `(?R)` `(?0)` | Recursive subpattern. | <span style="color:darkred">Not planned</span> |
| `(?&NAME)` | Recursive subpattern. | <span style="color:darkred">Not planned</span> |

Script runs

| Script runs | Notes | Status |
| --- | --- | --- |
| `(*script_run:pattern)` | All chars in pattern need to be of the same script. | <span style="color:darkred">Not planned</span> |
| `(*sr:pattern)` | All chars in pattern need to be of the same script. | <span style="color:darkred">Not planned</span> |
| `(*atomic_script_run:pattern)` | Without backtracking. | <span style="color:darkred">Not planned</span> |
| `(*asr:pattern)` | Without backtracking. | <span style="color:darkred">Not planned</span> |