If you’ve done any software or web development, chances are you’ve encountered regular expressions (RegEx), which are powerful tools for pattern matching in text.

They might seem intimidating at first, but in many cases, they’re surprisingly straightforward. Regular expressions are also remarkably flexible, capable of handling complex matching tasks, especially when using some of the lesser-known tokens and sequences.
One reason RegEx can be confusing is that syntax varies across implementations, or engines. For instance, JavaScript uses the ECMAScript flavor, which closely resembles .NET syntax but isn’t identical. Some of the tools I use, like Arq Backup, rely on ICU (International Components for Unicode), which brings its own quirks to the table.
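To make the point about flavors concrete, here is a minimal sketch in Python. It assumes the standard-library `re` module, which, at the time of writing, rejects the `\p{...}` Unicode property escape that many other engines accept. The sample string and variable names are only illustrative.

```python
import re

# \p{L} ("any letter") is valid in many flavors, but Python's built-in
# re module treats \p as an unknown escape and refuses to compile it.
try:
    re.compile(r"\p{L}+")
    supports_unicode_property = True
except re.error:
    supports_unicode_property = False

# A common workaround in Python: "word characters that are neither
# digits nor underscore", which leaves Unicode letters.
letters = re.findall(r"[^\W\d_]+", "héllo wörld 42")

print(supports_unicode_property)
print(letters)
```

The same pattern pasted into a different engine may behave differently, which is why it helps to know the flavor before reaching for a token.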
I frequently turn to regex101 when writing or testing complex expressions. It supports multiple flavors, making it invaluable for comparison. For ICU, I reference the official ICU Documentation.
If you’re unsure which flavor applies to your environment, here’s a quick guide:
- .NET: Used in C# and other .NET languages
- ECMAScript: Covers JavaScript, JScript, TypeScript, and ActionScript
- ICU: Used in Arq Backup
- PCRE (Perl Compatible Regular Expressions): Used in PHP versions prior to 7.3
- PCRE2: Used in PHP 7.3 and newer
Below is a table showcasing some of the most widely used RegEx flavors for quick reference.
Token | .NET (C#) | ECMAScript | Golang | ICU | Java 8 | PCRE | PCRE2 | Python | Rust |
---|---|---|---|---|---|---|---|---|---|
General | |||||||||
Newline | \n | \n | \n | \n | \n | \n | \n | \n | \n |
Carriage return | \r | \r | \r | \r | \r | \r | \r | \r | \r |
Tab | \t | \t | \t | \t | \t | \t | \t | \t | \t |
Null character | \0 | \0 | \0 | \0 | \0 | ||||
Escape | \e | ||||||||
Character Classes | |||||||||
A single character of: a, b or c | [abc] | [abc] | [abc] | [abc] | [abc] | [abc] | [abc] | [abc] | [abc] |
A single character of: a, b, c or d | [[ab][cd]] | [[ab][cd]] | |||||||
A character except: a, b or c | [^abc] | [^abc] | [^abc] | [^abc] | [^abc] | [^abc] | [^abc] | [^abc] | [^abc] |
A character in the range: a-z | [a-z] | [a-z] | [a-z] | [a-z] | [a-z] | [a-z] | [a-z] | [a-z] | [a-z] |
A character not in the range: a-z | [^a-z] | [^a-z] | [^a-z] | [^a-z] | [^a-z] | [^a-z] | [^a-z] | [^a-z] | |
A character in the range: a-z or A-Z | [a-zA-Z] | [a-zA-Z] | [a-zA-Z] | [a-zA-Z] | [a-zA-Z] | [a-zA-Z] | [a-zA-Z] | [a-zA-Z] | [a-zA-Z] |
A union of sets | [[a-z][A-Z][0-9]] | ||||||||
A character in the Unicode range | [\u0000-\U0010ffff] | ||||||||
Character class intersection | [\w&&[^\d]] | [\w&&[^\d]] | [\w&&[^\d]] | ||||||
Character class subtraction | [a-z-[bcde]] | [a-z--[bcde]] | [a-z--[bcde]] | ||||||
Character class relative complement | [[a-f]~~[e-h]] | ||||||||
Letters and digits | [[:alnum:]] | [[:alnum:]] | [[:alnum:]] | [[:alnum:]] | |||||
Letters | [[:alpha:]] | [\p{Letter}] | [[:alpha:]] | [[:alpha:]] | [[:alpha:]] | ||||
Non-letters | [\P{Letter}] | ||||||||
Intersection of Letters and Cyrillic (or all Cyrillic letters) | [\p{Letter}&&\p{script=cyrillic}] | ||||||||
Subtraction of Letters and Latin (or all non-Latin letters) | [\p{Letter}--\p{script=latin}] | ||||||||
ASCII codes 0-127 | [[:ascii:]] | [[:ascii:]] | [[:ascii:]] | [[:ascii:]] | |||||
Space or tab only | [[:blank:]] | [[:blank:]] | [[:blank:]] | [[:blank:]] | |||||
Control characters | [[:cntrl:]] | [[:cntrl:]] | [[:cntrl:]] | [[:cntrl:]] | |||||
Decimal digits | [[:digit:]] | [[:digit:]] | [[:digit:]] | [[:digit:]] | |||||
Visible characters (not space) | [[:graph:]] | [[:graph:]] | [[:graph:]] | [[:graph:]] | |||||
Lowercase letters | [[:lower:]] | [[:lower:]] | [[:lower:]] | [[:lower:]] | |||||
Visible characters | [[:print:]] | [[:print:]] | [[:print:]] | [[:print:]] | |||||
Visible punctuation characters | [[:punct:]] | [[:punct:]] | [[:punct:]] | [[:punct:]] | |||||
Whitespace | [[:space:]] | [[:space:]] | [[:space:]] | [[:space:]] | |||||
Uppercase letters | [[:upper:]] | [[:upper:]] | [[:upper:]] | [[:upper:]] | |||||
Word characters | [[:word:]] | [[:word:]] | [[:word:]] | [[:word:]] | |||||
Hexadecimal digits | [[:xdigit:]] | [[:xdigit:]] | [[:xdigit:]] | [[:xdigit:]] | |||||
Start of word | [[:<:]] | [[:<:]] | |||||||
End of word | [[:>:]] | [[:>:]] | |||||||
Meta Sequence | |||||||||
Any single character | . | . | . | . | . | . | . | . | |
Alternate - match either a or b | a|b | a|b | a|b | a|b | a|b | a|b | a|b | a|b | a|b |
Any whitespace character | \s | \s | \s | \s | \s | \s | \s | \s | \s |
Any non-whitespace character | \S | \S | \S | \S | \S | \S | \S | \S | \S |
Any digit | \d | \d | \d | \d | \d | \d | \d | \d | \d |
Any non-digit | \D | \D | \D | \D | \D | \D | \D | \D | \D |
Any word character | \w | \w | \w | \w | \w | \w | \w | \w | \w |
Any non-word character | \W | \W | \W | \W | \W | \W | \W | \W | \W |
Any Unicode sequences, linebreaks included | \X | \X | |||||||
Match one data unit | \C | \C | |||||||
Unicode newlines | \R | \R | \R | \R | |||||
Match anything but a newline | \N | \N | |||||||
Vertical whitespace character | \v | \v | \v | \v | \v | \v | \v | \v | \v |
Negation of \v | \V | \V | \V | \V | |||||
Horizontal whitespace character | \h | \h | \h | \h | |||||
Negation of \h | \H | \H | \H | \H | |||||
Match a Grapheme Cluster | \X | ||||||||
Reset match | \K | \K | |||||||
Match subpattern number # | \# | \# | \# | \# | \# | \# | |||
Unicode character name | \N{...} | ||||||||
Unicode property X | \pX | \pX | \pX | \pX | |||||
Unicode property or script category | \p{...} | \p{...} | \p{...} | \p{...} | \p{...} | \p{...} | \p{...} | \p{...} | |
Unicode property or script category (alternate POSIX-like syntax) | [:script=...:] | ||||||||
Negation of \pX | \PX | \PX | \PX | \PX | |||||
Negation of \p | \P{...} | \P{...} | \P{...} | \P{...} | \P{...} | \P{...} | \P{...} | \P{...} | |
Quote; treat as literals | \Q...\E | \Q...\E | \Q...\E | \Q...\E | |||||
Match subpattern name | \k{name} | \k{name} | \k{name} | ||||||
Match subpattern name | \k<name> | \k<name> | \k<name> | \k<name> | \k<name> | \k<name> | |||
Match subpattern name | \k'name' | \k'name' | |||||||
Match nth subpattern | \gn | \gn | |||||||
Match nth subpattern | \g{n} | \g{n} | |||||||
Match text the nth relative previous subpattern matched | \g{-n} | \g{-n} | |||||||
Match expression defined in the nth capture group | \g<n> | \g<n> | |||||||
Match expression defined in the nth relative upcoming capture group | \g<+n> | \g<+n> | |||||||
Match expression defined in the nth capture group. | \g'n' | \g'n' | |||||||
Match expression defined in the nth relative upcoming subpattern | \g'+n' | \g'+n' | |||||||
Match previously-named capture group letter | \g{letter} | \g{letter} | |||||||
Match expression defined in the capture group named “letter” | \g<letter> | \g<letter> | |||||||
Match expression defined in the named capture group letter | \g'letter' | \g'letter' | |||||||
Hex character YY | \xYY | \xYY | \xYY | \xYY | \xYY | \xYY | \xYY | \xYY | \xYY |
Hex character YY | \U{YY} | ||||||||
Hex character YYYY | \x{YYYY} | \x{YYYY} | \x{YYYY} | \x{YYYY} | \x{YYYY} | \x{YYYY} | |||
Hex character YYYY | \uYYYY | \uYYYY | \uYYYY | \uYYYY | |||||
Hex character YYYY | \u{YYYY} | ||||||||
Hex character YYYYYYYY | \UYYYYYYYY | \UYYYYYYYY | |||||||
Octal character ddd | \ddd | \ddd | \0ddd | \ddd | \ddd | \ddd | \ddd | ||
Control character Y | \cY | \cY | \cY | \cY | \cY | \cY | |||
Backspace character | [\b] | [\b] | [\b] | [\b] | [\b] | ||||
Makes any character literal | \ | \ | \ | \ | \ | \ | \ | \ | |
Group Constructs | |||||||||
Match everything enclosed (non-capturing) | (?:...) | (?:...) | (?:...) | (?:...) | (?:...) | (?:...) | (?:...) | (?:...) | (?:...) |
Capture everything enclosed | (...) | (...) | (...) | (...) | (...) | (...) | (...) | (...) | (...) |
Atomic group (non-capturing) | (?>...) | (?>...) | (?>...) | (?>...) | (?>...) | ||||
Duplicate/reset subpattern group number | (?|...) | (?|...) | |||||||
Comment group | (?#...) | (?#...) | (?#...) | (?#...) | (?#...) | ||||
Named Capturing Group | (?'name'...) | (?'name'...) | (?'name'...) | ||||||
Named Capturing Group | (?<name>...) | (?<name>...) | (?<name>...) | (?<name>...) | (?<name>...) | (?<name>...) | |||
Named Capturing Group | (?P<name>...) | (?P<name>...) | (?P<name>...) | (?P<name>...) | (?P<name>...) | ||||
Balancing Named Capturing Group | (?<name1-name2>...) | ||||||||
Balancing Non-capturing Group | (?<-name2>...) | ||||||||
Balancing Named Capturing Group | (?'name1-name2'...) | ||||||||
Balancing Non-capturing Group | (?'-name2'...) | ||||||||
Inline modifiers | (?imsxn) | (?ismwx-ismwx) | (?imsxu) | (?imsxXU) | (?imsxUJnxx) | (?imsxXU) | (?imsUx) | ||
Localized inline modifiers | (?imsxn:...) | (?ismwx-ismwx:...) | (?imsxu:...) | (?imsxXU:...) | (?imsxUJnxx:...) | (?imsxXU:...) | (?imsUx:...) | ||
Conditional statement | (?(1)yes|no) | (?(1)yes|no) | (?(1)yes|no) | (?(1)yes|no) | |||||
Conditional statement | (?(R)yes|no) | (?(R)yes|no) | |||||||
Recursive Conditional statement | (?(R#)yes|no) | (?(R#)yes|no) | |||||||
Conditional statement | (?(R&name)yes|no) | (?(R&name)yes|no) | |||||||
Lookahead conditional | (?(?=...)yes|no) | (?(?=...)yes|no) | (?(?=...)yes|no) | ||||||
Lookbehind conditional | (?(?<=...)yes|no) | (?(?<=...)yes|no) | (?(?<=...)yes|no) | ||||||
Recurse entire pattern | (?R) | (?R) | |||||||
Match expression defined in capture group 1 | (?1) | (?1) | |||||||
Match expression defined in the first relative capture group | (?+1) | (?+1) | |||||||
Match expression defined in capture group name | (?&name) | (?&name) | |||||||
Match subpattern name | (?P=name) | (?P=name) | (?P=name) | ||||||
Match expression defined in the capture group “{name}” | (?P>name) | (?P>name) | |||||||
Pre-define patterns before using them | (?(DEFINE)...) | (?(DEFINE)...) | |||||||
Positive Lookahead | (?=...) | (?=...) | (?=...) | (?=...) | (?=...) | (?=...) | (?=...) | ||
Negative Lookahead | (?!...) | (?!...) | (?!...) | (?!...) | (?!...) | (?!...) | (?!...) | ||
Positive Lookbehind | (?<=...) | (?<=...) | (?<=...) | (?<=...) | (?<=...) | (?<=...) | (?<=...) | ||
Negative Lookbehind | (?<!...) | (?<!...) | (?<!...) | (?<!...) | (?<!...) | (?<!...) | (?<!...) | ||
Control verb | (*ACCEPT) | (*ACCEPT) | |||||||
Control verb | (*FAIL) | (*FAIL) | |||||||
Control verb | (*MARK:NAME) | (*MARK:NAME) | |||||||
Control verb | (*COMMIT) | (*COMMIT) | |||||||
Control verb | (*PRUNE) | (*PRUNE) | |||||||
Control verb | (*SKIP) | (*SKIP) | |||||||
Control verb | (*THEN) | (*THEN) | |||||||
Pattern modifier | (*UTF) | (*UTF) | |||||||
Pattern modifier | (*UTF8) | (*UTF8) | |||||||
Pattern modifier | (*UTF16) | (*UTF16) | |||||||
Pattern modifier | (*UTF32) | (*UTF32) | |||||||
Pattern modifier | (*UCP) | (*UCP) | |||||||
Line break modifier | (*CR) | (*CR) | |||||||
Line break modifier | (*LF) | (*LF) | |||||||
Line break modifier | (*CRLF) | (*CRLF) | |||||||
Line break modifier | (*ANYCRLF) | (*ANYCRLF) | |||||||
Line break modifier | (*ANY) | (*ANY) | |||||||
Empty match modifier | (*NOTEMPTY) | ||||||||
Empty match modifier | (*NOTEMPTY_ATSTART) | ||||||||
JIT Modifier | (*NO_JIT) | ||||||||
Line break modifier | \R | \R | \R | ||||||
Line break modifier | (*BSR_ANYCRLF) | (*BSR_ANYCRLF) | |||||||
Line break modifier | (*BSR_UNICODE) | (*BSR_UNICODE) | |||||||
Regex engine modifier | (*LIMIT_MATCH=x) | (*LIMIT_MATCH=x) | |||||||
Regex engine modifier | (*LIMIT_RECURSION=d) | (*LIMIT_RECURSION=d) | |||||||
Regex engine modifier | (*NO_AUTO_POSSESS) | (*NO_AUTO_POSSESS) | |||||||
Regex engine modifier | (*NO_START_OPT) | (*NO_START_OPT) | |||||||
Quantifiers | |||||||||
Zero or one of a | a? | a? | a? | a? | a? | a? | a? | a? | a? |
Zero or more of a | a* | a* | a* | a* | a* | a* | a* | a* | a* |
One or more of a | a+ | a+ | a+ | a+ | a+ | a+ | a+ | a+ | a+ |
Exactly 3 of a | a{3} | a{3} | a{3} | a{3} | a{3} | a{3} | a{3} | a{3} | a{3} |
3 or more of a | a{3,} | a{3,} | a{3,} | a{3,} | a{3,} | a{3,} | a{3,} | a{3,} | a{3,} |
Between 3 and 6 of a | a{3,6} | a{3,6} | a{3,6} | a{3,6} | a{3,6} | a{3,6} | a{3,6} | a{3,6} | a{3,6} |
Greedy quantifier | a* | a* | a* | a* | a* | a* | a* | a* | a* |
Lazy quantifier | a*? | a*? | a*? | a*? | a*? | a*? | a*? | a*? | |
Possessive quantifier | a*+ | a*+ | a*+ | a*+ | |||||
Anchor | |||||||||
Start of match | \G | \G | \G | \G | \G | ||||
Start of string | ^ | ^ | ^ | ^ | ^ | ^ | ^ | ^ | ^ |
End of string | $ | $ | $ | $ | $ | $ | $ | $ | $ |
Start of string | \A | \A | \A | \A | \A | \A | \A | \A | |
End of string | \Z | \Z | \Z | \Z | \Z | ||||
Absolute end of string | \z | \z | \z | \z | \z | \z | \z | ||
A word boundary | \b | \b | \b | \b | \b | \b | \b | \b | \b |
Non-word boundary | \B | \B | \B | \B | \B | \B | \B | \B | \B |
Flags | |||||||||
Global | g | g | g | g | g | g | g | g | |
Multiline | m | m | m | m | m | m | m | m | m |
Case insensitive | i | i | i | i | i | i | i | i | i |
Ignore whitespace / verbose | x | x | x | x | x | x | |||
Sticky - searches in strings only from the index of the last match | y | ||||||||
Single line | s | s | s | s | s | s | s | s | |
Unicode | u | U | u | u | u | ||||
Unicode case insensitive | u | ||||||||
Detect word boundaries using Unicode UAX 29, Text Boundaries | w | ||||||||
Restrict matches to ASCII only | a | ||||||||
eXtra | X | X | |||||||
Ungreedy | U | U | U | U | |||||
Anchor | A | A | |||||||
Duplicate group names | J | J | |||||||
Non-capturing groups | n | n | |||||||
Ignore all whitespace / verbose | xx | xx | |||||||
Substitution | |||||||||
Contents before match | $` | $` | |||||||
Contents after match | $' | $' | |||||||
Complete match contents | $& | $& | \0 | $0 | $0 | \0 | $0 | \g<0> | $0 |
Contents in capture group 1 | \1 | \1 | |||||||
Contents in capture group 1 | $1 | $1 | $1 | $1 | $1 | $1 | \g<1> | $1 | |
Insert last captured group | $+ | ||||||||
Insert entire input string | $_ | ||||||||
Contents in capture group foo | ${foo} | $<foo> | ${foo} | ${foo} | ${foo} | ${foo} | \g<foo> | ${foo} | |
Hexadecimal replacement values | \x20 | \x20 | \x20 | \x20 | \x20 | \x20 | |||
Hexadecimal replacement values | \u06fa | \u06fa | \u06fa | \u06fa | |||||
Hexadecimal replacement values | \x{06fa} | \x{06fa} | \x{06fa} | \x{06fa} | \x{06fa} | \u{06fa} | |||
Insert a tab | \t | \t | \t | \t | \t | \t | \t | \t | \t |
Insert a carriage return | \r | \r | \r | \r | \r | \r | \r | \r | \r |
Insert a newline | \n | \n | \n | \n | \n | \n | \n | \n | \n |
Insert a form-feed | \f | \f | \f | \f | \f | \f | \f | \f | |
Insert a bell | \a | \a | |||||||
Insert a null | \0 | \0 | |||||||
Insert a backspace | \b | ||||||||
Insert a vertical tab | \v | ||||||||
Insert a dollar sign | $$ | $$ | $$ | $$ | |||||
Insert the escaped literal | \[ | \[ | \[ | \[ | \[ | \[ | |||
Uppercase Transformation | \U | \U | |||||||
Lowercase Transformation | \L | \L | |||||||
Terminate any Transformation | \E | \E | |||||||
Conditional replacement | ${1:+foo:bar} |
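
A few of the Python entries from the table can be tried directly. The following is a minimal sketch using Python's standard `re` module. The sample strings and variable names are made up for illustration. It shows greedy versus lazy quantifiers, the `(?P<name>...)` named group with its `(?P=name)` backreference, and the `\g<0>` and `\g<name>` substitution tokens.

```python
import re

text = "<b>bold</b> and <i>italic</i>"

# Greedy quantifier: .* consumes as much as it can, so one long match.
greedy = re.findall(r"<.*>", text)
# Lazy quantifier: .*? stops at the first closing bracket.
lazy = re.findall(r"<.*?>", text)

# Named capturing groups, reused in the replacement with \g<name>.
date = re.sub(
    r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})",
    r"\g<day>/\g<month>/\g<year>",
    "2024-05-17",
)

# \g<0> inserts the complete match.
wrapped = re.sub(r"\d+", r"[\g<0>]", "a1b22")

# (?P=word) matches the same text the group named "word" captured.
repeated = re.search(r"(?P<word>\w+) (?P=word)", "it is is fine").group(0)

print(greedy, lazy, date, wrapped, repeated, sep="\n")
```

Other flavors spell the same ideas differently (for example `$1` or `${foo}` in replacements), so the table above is the place to check before porting a pattern.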