Regex Cheat Sheet: Complete Regular Expressions Reference
Regular expressions (regex) are one of the most powerful — and frustrating — tools in a developer's
toolkit. This cheat sheet covers every essential syntax element with examples, from basic character
matching to advanced lookaheads and practical validation patterns.
Use our interactive Regex Tester to experiment with any pattern from this guide in real time.
1. Character Classes
Character classes match one character from a defined set.
| Pattern | Description | Example match |
| [abc] | One of: a, b, or c | a, b, c |
| [^abc] | Any character except a, b, c | d, 1, @ |
| [a-z] | Any lowercase letter | a, m, z |
| [A-Z] | Any uppercase letter | A, M, Z |
| [0-9] | Any digit | 0, 5, 9 |
| [a-zA-Z0-9] | Any alphanumeric character | a, Z, 7 |
| \d | Any digit (same as [0-9]) | 0–9 |
| \D | Any non-digit | a, !, space |
| \w | Word character [a-zA-Z0-9_] | a, 9, _ |
| \W | Non-word character | !, space, @ |
| \s | Whitespace (space, tab, newline) | space, \t, \n |
| \S | Non-whitespace | a, 1, @ |
| . | Any character except newline | a, !, 5 |
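A minimal Python sketch of a few of these classes, using the standard `re` module. The sample string is invented for illustration.

```python
import re

# Hypothetical sample text, chosen only to exercise the classes.
sample = "Order #42: 3 apples, 7 pears_ok!"

# \d matches one digit at a time.
digits = re.findall(r"\d", sample)

# [^a-z ] matches anything that is neither a lowercase letter nor a space.
not_lower = re.findall(r"[^a-z ]", sample)

# \w+ collects runs of letters, digits, and underscores.
words = re.findall(r"\w+", sample)

print(digits)     # ['4', '2', '3', '7']
print(not_lower)  # ['O', '#', '4', '2', ':', '3', ',', '7', '_', '!']
print(words)      # ['Order', '42', '3', 'apples', '7', 'pears_ok']
```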
2. Anchors
Anchors do not match characters — they match positions.
| Pattern | Description |
| ^ | Start of string (or line with m flag) |
| $ | End of string (or line with m flag) |
| \b | Word boundary: between a \w and a \W character, or between a \w character and the start or end of the string |
| \B | Non-word boundary |
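| \A | Start of string, unaffected by the m flag (Python, PCRE, Java, .NET; not JavaScript) |
| \Z | End of string, unaffected by the m flag. Python matches only at the very end; PCRE, Java, and .NET also match before a final newline |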
// Match "cat" only as a whole word
\bcat\b → matches "my cat sat" but NOT "catfish" or "concat"
// Entire string must be digits
^\d+$ → matches "12345" but NOT "123abc"
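The same anchors can be tried in Python. This is a small sketch with a made-up three-line string. It also shows one Python detail worth knowing: `$` matches before a trailing newline, while `\Z` does not.

```python
import re

# Hypothetical multi-line input.
text = "my cat sat\nconcat catfish\ncat"

# \b keeps "cat" from matching inside "concat" or "catfish".
whole_word_positions = [m.start() for m in re.finditer(r"\bcat\b", text)]

# Without re.MULTILINE, ^ matches only at the start of the whole string.
first_words_default = re.findall(r"^\w+", text)

# With re.MULTILINE, ^ also matches right after each newline.
first_words_multiline = re.findall(r"^\w+", text, re.MULTILINE)

# In Python, $ matches before a trailing newline, but \Z does not.
dollar_match = re.search(r"\d+$", "123\n")
strict_match = re.search(r"\d+\Z", "123\n")

print(whole_word_positions)     # [3, 26]
print(first_words_default)      # ['my']
print(first_words_multiline)    # ['my', 'concat', 'cat']
print(dollar_match is not None) # True
print(strict_match is not None) # False
```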
3. Quantifiers
Quantifiers specify how many times the preceding element should match.
| Pattern | Description |
| * | 0 or more (greedy) |
| + | 1 or more (greedy) |
| ? | 0 or 1 (optional) |
| {n} | Exactly n times |
| {n,} | n or more times |
| {n,m} | Between n and m times (inclusive) |
| *? | 0 or more (lazy — as few as possible) |
| +? | 1 or more (lazy) |
| ?? | 0 or 1 (lazy) |
Greedy vs Lazy Example
Input: "<b>hello</b> and <b>world</b>"
Greedy: <b>.*</b> → matches "<b>hello</b> and <b>world</b>" (one match)
Lazy: <b>.*?</b> → matches "<b>hello</b>" and "<b>world</b>" (two matches)
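The greedy and lazy results above can be reproduced in Python with `re.findall`. The third pattern, a negated character class, is an alternative not used elsewhere in this guide. It returns the same result as the lazy version here because the tag content contains no `<`.

```python
import re

html = "<b>hello</b> and <b>world</b>"

# Greedy: .* runs to the end, then backtracks to the LAST </b>.
greedy = re.findall(r"<b>.*</b>", html)

# Lazy: .*? stops at the FIRST </b> it can reach.
lazy = re.findall(r"<b>.*?</b>", html)

# Alternative: [^<]* cannot cross a "<", so it never overshoots.
negated = re.findall(r"<b>[^<]*</b>", html)

print(greedy)   # ['<b>hello</b> and <b>world</b>']
print(lazy)     # ['<b>hello</b>', '<b>world</b>']
print(negated)  # ['<b>hello</b>', '<b>world</b>']
```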
4. Groups and Backreferences
| Pattern | Description |
| (abc) | Capturing group — captures “abc” |
| (?:abc) | Non-capturing group — groups without capturing |
| (?<name>abc) | Named capturing group (Python's re module writes this as (?P<name>abc)) |
| \1, \2 ... | Backreference to captured group 1, 2, ... |
| a|b | Alternation — match “a” or “b” |
// Capture year, month, day from a date
(\d{4})-(\d{2})-(\d{2})
// Group 1 → year, Group 2 → month, Group 3 → day
// Named groups (more readable)
(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})
// Backreference — match doubled words
\b(\w+)\s+\1\b → matches "the the", "is is"
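A short Python sketch of the same three ideas. Python's `re` module spells named groups `(?P<name>...)`, so the date pattern is written that way here. The input strings are invented for illustration.

```python
import re

# Named groups in Python's re syntax.
date_re = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")

m = date_re.search("Released on 2024-03-15.")
by_name = m.groupdict()   # access captures by name
by_number = m.groups()    # the same captures, by position

# Backreference: \1 must repeat exactly what group 1 captured.
# With one capturing group, findall returns the group, not the full match.
doubled = re.findall(r"\b(\w+)\s+\1\b", "Paris in the the spring is is nice")

print(by_name)    # {'year': '2024', 'month': '03', 'day': '15'}
print(by_number)  # ('2024', '03', '15')
print(doubled)    # ['the', 'is']
```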
5. Lookahead and Lookbehind
Lookaround assertions check what comes before or after the current position
without consuming characters.
| Pattern | Type | Description |
| (?=...) | Positive lookahead | Matches if followed by ... |
| (?!...) | Negative lookahead | Matches if NOT followed by ... |
| (?<=...) | Positive lookbehind | Matches if preceded by ... |
| (?<!...) | Negative lookbehind | Matches if NOT preceded by ... |
// Positive lookahead — match "foo" only when followed by "bar"
foo(?=bar) → matches "foo" in "foobar" but not in "fooqaz"
// Negative lookahead: match a number not followed by "px"
// (the \d alternative stops the engine from backtracking to "10" in "100px")
\d+(?!\d|px) → matches "100" in "100em" but nothing in "100px"
// Positive lookbehind — match digits preceded by "$"
(?<=\$)\d+ → matches "99" in "$99" but not in "99"
// Useful: password must contain uppercase letter
^(?=.*[A-Z]).{8,}$ → 8+ chars with at least one uppercase
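The lookaround behavior can be checked in Python as follows. The sketch also shows a common trap: a bare `\d+(?!px)` still matches part of "100px", because the engine backtracks to a shorter run of digits. The sample inputs are made up.

```python
import re

# Positive lookahead: "bar" is checked but not included in the match span.
foo_spans = [m.span() for m in re.finditer(r"foo(?=bar)", "foobar fooqaz")]

# Trap: \d+ gives back one digit, and "10" is followed by "0p", not "px".
naive = re.findall(r"\d+(?!px)", "100px")

# Fix: also forbid a following digit, so the whole number is tested.
not_px = re.findall(r"\d+(?!\d|px)", "100em 100px 50%")

# Positive lookbehind: digits only when a "$" sits right before them.
prices = re.findall(r"(?<=\$)\d+", "$99 or 99")

# Lookahead as a condition: at least 8 characters, one of them uppercase.
def has_upper_and_length(password: str) -> bool:
    return re.fullmatch(r"(?=.*[A-Z]).{8,}", password) is not None

print(foo_spans)  # [(0, 3)]
print(naive)      # ['10']
print(not_px)     # ['100', '50']
print(prices)     # ['99']
print(has_upper_and_length("Sup3rsecret"), has_upper_and_length("lowercase"))  # True False
```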
6. Flags / Modifiers
| Flag | Name | Effect |
| i | Case-insensitive | [a-z] also matches uppercase |
| g | Global | Find all matches, not just the first (JavaScript; Python uses re.findall or re.finditer instead) |
| m | Multiline | ^ and $ match start/end of each line |
| s | Dotall | . also matches newline characters |
| x | Verbose (extended) | Allow whitespace and comments in pattern (PCRE/Python) |
| u | Unicode | Enable full Unicode matching (JavaScript) |
// JavaScript
/hello/i.test('Hello World') // true — case-insensitive
'a1b2c3'.match(/\d/g) // ['1', '2', '3'] — global
# Python
import re
re.findall(r'[a-z]+', 'Ab1cD', re.IGNORECASE) # ['Ab', 'cD']
# Verbose mode (Python): add comments to complex regex
pattern = re.compile(r"""
(?P<year>\d{4}) # 4-digit year
-
(?P<month>\d{2}) # 2-digit month
-
(?P<day>\d{2}) # 2-digit day
""", re.VERBOSE)
7. Special Characters & Escaping
The following characters have special meaning in regex and must be escaped with a backslash
when you want to match them literally:
. * + ? ^ $ { } [ ] | ( ) \
// To match a literal dot:
\. → matches "." in "3.14" but not any character
// To match a literal backslash:
\\
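In Python, `re.escape` can do the escaping when the text to match comes from a variable. The following sketch shows manual and automatic escaping. The sample strings, including the `user_input` value, are hypothetical.

```python
import re

# An unescaped dot matches any character; an escaped dot matches only ".".
unescaped_hits = re.findall(r"3.14", "3.14 3x14")
escaped_hits = re.findall(r"3\.14", "3.14 3x14")

# re.escape backslash-escapes the metacharacters in an arbitrary string,
# so "+" and "(" below are treated as literal text.
user_input = "a+b (total)"
literal_re = re.compile(re.escape(user_input))
found = literal_re.search("sum: a+b (total) = 7") is not None

# A literal backslash is written \\ in the pattern (inside a raw string).
backslashes = re.findall(r"\\", r"C:\temp\file")

print(unescaped_hits)    # ['3.14', '3x14']
print(escaped_hits)      # ['3.14']
print(found)             # True
print(len(backslashes))  # 2
```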
8. Practical Patterns
Copy-paste patterns for common validation tasks. Always test and adapt for your specific requirements.
Email Address (basic)
^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$
URL (http & https)
^https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&\/=]*)$
IPv4 Address
^((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)$
Date (YYYY-MM-DD, format only; does not reject impossible dates such as 2023-02-31)
^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$
Phone Number (US, flexible)
^(\+1[\-\s]?)?\(?\d{3}\)?[\-\s]?\d{3}[\-\s]?\d{4}$
Hex Color Code
^#([0-9a-fA-F]{3}|[0-9a-fA-F]{6})$
Strong Password
// Min 8 chars, at least one uppercase, one lowercase, one digit, one special char
^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$
UUID (v4, lowercase; add the i flag to accept uppercase hex digits)
^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$
Credit Card Number (basic format; needs the x flag because of the inline comments and line breaks)
^(?:4[0-9]{12}(?:[0-9]{3})? # Visa
|5[1-5][0-9]{14} # Mastercard
|3[47][0-9]{13} # Amex
|6(?:011|5[0-9]{2})[0-9]{12})$ # Discover
Slug (URL-friendly string)
^[a-z0-9]+(?:-[a-z0-9]+)*$
HTML Tag (do not use to parse full HTML)
<([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>(.*?)<\/\1>
Warning: Do not use regex to parse HTML or XML. Use a proper DOM parser.
Regex cannot correctly handle nested tags, self-closing tags, or malformed markup.
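One way to package a few of these patterns in Python is sketched below. The function names are hypothetical. The patterns are the ones listed above with the `^` and `$` removed, because `fullmatch` already requires the whole string to match. The date check adds a `datetime.date` call as a second step, since the regex alone accepts dates like 2023-02-31.

```python
import re
from datetime import date

# Patterns from this section, without ^ and $ (fullmatch anchors both ends).
HEX_COLOR = re.compile(r"#([0-9a-fA-F]{3}|[0-9a-fA-F]{6})")
SLUG = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")
ISO_DATE = re.compile(r"(\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])")

def is_hex_color(value: str) -> bool:
    return HEX_COLOR.fullmatch(value) is not None

def is_slug(value: str) -> bool:
    return SLUG.fullmatch(value) is not None

def is_real_date(value: str) -> bool:
    """Check the format with the regex, then the calendar with datetime."""
    m = ISO_DATE.fullmatch(value)
    if m is None:
        return False
    year, month, day = (int(part) for part in m.groups())
    try:
        date(year, month, day)  # raises ValueError for impossible dates
    except ValueError:
        return False
    return True

print(is_hex_color("#fff"), is_hex_color("#ffff"))                   # True False
print(is_slug("my-first-post"), is_slug("My--post"))                 # True False
print(is_real_date("2024-02-29"), is_real_date("2023-02-31"))        # True False
```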
9. Common Mistakes & Performance Tips
1. Catastrophic Backtracking (ReDoS)
Avoid nested quantifiers and alternations whose branches overlap, such as (a+)+ or (\w|\w)+.
They can take exponential time on input that almost matches, which enables denial-of-service attacks.
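The effect can be observed in Python with a small timing sketch. Here `(a+)+$` is compared with `a+$`, a rewrite that accepts the same strings when matching from the start. The helper name and input sizes are arbitrary, and the sizes are kept small so the script finishes quickly. Exact timings depend on the machine, so none are shown.

```python
import re
import time

risky = re.compile(r"(a+)+$")  # nested quantifier
safe = re.compile(r"a+$")      # same accepted strings, no nesting

def time_match(pattern: re.Pattern, text: str) -> float:
    """Return the seconds taken by one pattern.match call."""
    start = time.perf_counter()
    pattern.match(text)
    return time.perf_counter() - start

def agree(text: str) -> bool:
    """Both patterns should accept or reject the same input."""
    return bool(risky.match(text)) == bool(safe.match(text))

# Near-miss input: many "a"s followed by one character that breaks the match.
for n in (10, 14, 18):
    text = "a" * n + "b"
    print(n, f"risky={time_match(risky, text):.6f}s", f"safe={time_match(safe, text):.6f}s")
```

The time for the risky pattern should grow sharply with each step in `n`, while the safe pattern stays roughly flat.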
2. Forgetting to Anchor
Without ^ and $, your pattern matches substrings.
\d{5} matches the 5 digits inside "abc12345xyz".
Use ^\d{5}$ to match only strings that are exactly 5 digits.
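In Python, the choice of function matters as much as the anchors. A brief sketch, assuming a hypothetical `is_five_digits` helper:

```python
import re

five_digits = re.compile(r"\d{5}")

# search: finds the pattern anywhere in the string.
anywhere = five_digits.search("abc12345xyz")

# match: anchored at the start only, so trailing text is still allowed.
prefix = five_digits.match("12345xyz")

# fullmatch: the whole string must match, like adding ^ and $.
def is_five_digits(value: str) -> bool:
    return five_digits.fullmatch(value) is not None

print(anywhere.group())             # 12345
print(prefix.group())               # 12345
print(is_five_digits("12345"))      # True
print(is_five_digits("12345xyz"))   # False
```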
3. Dot Matches Newline Confusion
The . metacharacter does NOT match newline characters by default.
Use the s (dotall) flag or [\s\S] to match any character including newlines.
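In Python the s flag is spelled `re.DOTALL` (or `re.S`). A short sketch with a made-up two-line string:

```python
import re

text = "<p>first line\nsecond line</p>"

# Default: . stops at the newline, so the closing tag is never reached.
default = re.search(r"<p>.*</p>", text)

# re.DOTALL lets . cross the newline.
dotall = re.search(r"<p>.*</p>", text, re.DOTALL)

# [\s\S] matches any character without needing a flag.
portable = re.search(r"<p>[\s\S]*</p>", text)

print(default)                  # None
print(dotall.group() == text)   # True
print(portable.group() == text) # True
```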
4. Overcomplicated Patterns
A regex that is hard to read is hard to maintain. Prefer multiple simple validations over
one monster expression. Use verbose mode (x flag) with comments in Python
and PHP when your pattern exceeds 40 characters.
5. Not Handling Unicode
\w in JavaScript matches only ASCII word characters by default, and the u flag alone does not change that.
For international text, use the u flag together with Unicode property escapes:
/\p{L}+/u matches any run of Unicode letters.
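Python behaves differently from JavaScript here, which is worth checking when porting patterns. In Python 3, `\w` on `str` patterns is Unicode-aware by default and `re.ASCII` restricts it. The standard `re` module does not support `\p{L}`. The sample text uses escape sequences so the accented characters are unambiguous.

```python
import re

# "naïve café" written with escapes for the accented letters.
text = "na\u00efve caf\u00e9"

# Default in Python 3: \w includes non-ASCII letters.
unicode_words = re.findall(r"\w+", text)

# re.ASCII limits \w to [a-zA-Z0-9_], similar to JavaScript's default.
ascii_words = re.findall(r"\w+", text, re.ASCII)

print(unicode_words)  # ['naïve', 'café']
print(ascii_words)    # ['na', 've', 'caf']
```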
Frequently Asked Questions
Are regular expressions the same in all languages?
Core syntax is largely consistent across Perl-style engines (Perl, PCRE as used in PHP, and the native engines in Python, JavaScript, Java, and .NET).
Differences appear in lookbehind support, Unicode handling, possessive quantifiers, and escape
sequences. Always test in the target language.
What is the difference between greedy and lazy quantifiers?
Greedy quantifiers (*, +, ?, {n,m}) match as much as possible.
Lazy quantifiers (*?, +?, ??, {n,m}?) match as little as possible.
Given <b>hello</b> and <b>world</b>:
greedy <b>.*</b> returns one match, lazy <b>.*?</b> returns two.
What is catastrophic backtracking?
Nested quantifiers on overlapping character classes (e.g. (a+)+$) cause the engine
to try exponentially many combinations on non-matching input. A malicious input can cause a
ReDoS (Regex Denial of Service) attack. Always test with adversarial inputs.
What does ^ mean inside square brackets?
Inside [^...], the caret negates the class — it matches any character
NOT in the set. [^aeiou] matches any non-vowel. Outside square brackets,
^ anchors the match to the start of the string.
Should I use regex to validate email addresses?
Regex can catch obvious format errors but cannot truly validate email. Use it for basic sanity
checking, then verify deliverability with our Email Validator
or by sending a confirmation email.