Why does my regex work in one tool but not another?

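Regex engines differ in the features they support. JavaScript engines that predate ES2018 do not support lookbehind. In Python, re.match only matches at the start of the string, while re.search matches anywhere in it. Backslash escaping also differs between string literals and regex literals: in an ordinary string the backslash itself must be escaped ("\\d"), whereas a regex literal or a Python raw string takes \d as written. Use our Regex Tester tool to test your patterns in a consistent environment.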
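
To illustrate the Python point, here is a minimal sketch comparing re.match, re.search, and re.fullmatch on the same kind of input. The sample text and variable names are made up for the example.

```python
import re

text = "order 12345 shipped"  # made-up sample input

# re.match only tries at the very start of the string
at_start = re.match(r"\d+", text)       # None: the string starts with "order"

# re.search scans forward until it finds a match anywhere
anywhere = re.search(r"\d+", text)      # matches "12345"

# re.fullmatch requires the whole string to match
whole = re.fullmatch(r"\d+", "12345")   # matches "12345"

print(at_start, anywhere.group(), whole.group())
```

A pattern that "works" in a tester that searches anywhere can therefore return nothing when the same pattern is passed to re.match.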

Regex Cheat Sheet: Complete Regular Expressions Reference

1. Character Classes

Character classes match one character from a defined set.

Pattern	Description	Example match
[abc]	One of: a, b, or c	a, b, c
[^abc]	Any character except a, b, c	d, 1, @
[a-z]	Any lowercase letter	a, m, z
[A-Z]	Any uppercase letter	A, M, Z
[0-9]	Any digit	0, 5, 9
[a-zA-Z0-9]	Any alphanumeric character	a, Z, 7
\d	Any digit ([0-9]; Python 3 str patterns also match other Unicode digits unless re.ASCII is set)	0–9
\D	Any non-digit	a, !, space
\w	Word character [a-zA-Z0-9_]	a, 9, _
\W	Non-word character	!, space, @
\s	Whitespace (space, tab, newline)	space, \t, \n
\S	Non-whitespace	a, 1, @
.	Any character except newline	a, !, 5
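
The classes in the table can be tried out quickly in Python. This is only a sketch; the sample strings are made up, and the last two lines show one Python-specific detail (Unicode digits) that differs from engines where \d is strictly [0-9].

```python
import re

sample = "Room 42, floor_3!"  # made-up sample text

digits = re.findall(r"\d", sample)              # one match per digit
words = re.findall(r"\w+", sample)              # runs of letters, digits, underscore
consonants = re.findall(r"[^aeiou]", "regex")   # negated class: everything but vowels

# In Python 3, \d in a str pattern also accepts non-ASCII digits such as
# U+0663 (ARABIC-INDIC DIGIT THREE); re.ASCII restricts it to [0-9].
unicode_digit = re.fullmatch(r"\d", "\u0663") is not None
ascii_only = re.fullmatch(r"\d", "\u0663", re.ASCII) is not None

print(digits, words, consonants, unicode_digit, ascii_only)
```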

2. Anchors

Anchors do not match characters — they match positions.

Pattern	Description
^	Start of string (or line with `m` flag)
$	End of string (or line with `m` flag)
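\b	Word boundary: the position between a \w character and a \W character (or the start/end of the string)
\B	Non-word boundary
\A	Start of string, regardless of the m flag (Python, PCRE, Java, .NET; not JavaScript)
\Z	End of string, regardless of the m flag. In Python it matches only at the very end; in PCRE, Java, and .NET it also matches just before a final newline (not JavaScript)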

// Match "cat" only as a whole word
\bcat\b   → matches "my cat sat" but NOT "catfish" or "concat"

// Entire string must be digits
^\d+$     → matches "12345" but NOT "123abc"
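
The two anchor examples above can be checked in Python with the sketch below. It also shows one detail worth knowing: in Python, $ matches just before a single trailing newline, while \Z matches only at the very end. The variable names are illustrative.

```python
import re

# \b keeps "cat" from matching inside longer words
whole_word = re.compile(r"\bcat\b")
hits = [bool(whole_word.search(s)) for s in ("my cat sat", "catfish", "concat")]

# ^ and $ force the whole string to be digits
digits_only = re.compile(r"^\d+$")
plain = bool(digits_only.search("12345"))
mixed = bool(digits_only.search("123abc"))

# Python's $ also matches just before one trailing newline...
trailing_newline = bool(digits_only.search("12345\n"))
# ...while \Z only matches at the very end of the string
strict = bool(re.search(r"^\d+\Z", "12345\n"))

print(hits, plain, mixed, trailing_newline, strict)
```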

3. Quantifiers

Quantifiers specify how many times the preceding element should match.

Pattern	Description
*	0 or more (greedy)
+	1 or more (greedy)
?	0 or 1 (optional)
{n}	Exactly n times
{n,}	n or more times
{n,m}	Between n and m times (inclusive)
*?	0 or more (lazy — as few as possible)
+?	1 or more (lazy)
??	0 or 1 (lazy)

Greedy vs Lazy Example

Input: "<b>hello</b> and <b>world</b>"

Greedy:  <b>.*</b>   → matches "<b>hello</b> and <b>world</b>" (one match)
Lazy:    <b>.*?</b>  → matches "<b>hello</b>" and "<b>world</b>" (two matches)
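
The same comparison, run in Python as a minimal sketch using the input string from the example above:

```python
import re

html = "<b>hello</b> and <b>world</b>"

# Greedy: .* runs to the end, then backs off to the LAST </b>
greedy = re.findall(r"<b>.*</b>", html)

# Lazy: .*? stops at the FIRST </b> it can reach
lazy = re.findall(r"<b>.*?</b>", html)

print(greedy)  # ['<b>hello</b> and <b>world</b>']
print(lazy)    # ['<b>hello</b>', '<b>world</b>']
```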

4. Groups and Backreferences

Pattern	Description
(abc)	Capturing group — captures “abc”
(?:abc)	Non-capturing group — groups without capturing
(?<name>abc)	Named capturing group (Python spells it (?P<name>abc))
\1, \2 ...	Backreference to captured group 1, 2, ...
a|b	Alternation: match “a” or “b”

// Capture year, month, day from a date
(\d{4})-(\d{2})-(\d{2})
// Group 1 → year, Group 2 → month, Group 3 → day

// Named groups (more readable)
(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})

// Backreference — match doubled words
\b(\w+)\s+\1\b   → matches "the the", "is is"
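
Here is how the three patterns above might be used from Python. Treat it as a sketch: the date and sentence are made-up inputs, and Python's re module writes named groups as (?P<name>...) rather than (?<name>...).

```python
import re

# Numbered groups: read them back with .groups()
m = re.fullmatch(r"(\d{4})-(\d{2})-(\d{2})", "2024-03-15")
year, month, day = m.groups()

# Named groups: Python syntax is (?P<name>...)
named = re.fullmatch(
    r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})", "2024-03-15"
)
parts = named.groupdict()

# Backreference: \1 must repeat whatever group 1 captured.
# With a single group, findall returns that group's text.
doubled = re.findall(r"\b(\w+)\s+\1\b", "it is is the the end")

print(year, month, day, parts, doubled)
```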

5. Lookahead and Lookbehind

Lookaround assertions check what comes before or after the current position without consuming characters.

Pattern	Type	Description
(?=...)	Positive lookahead	Matches if followed by ...
(?!...)	Negative lookahead	Matches if NOT followed by ...
(?<=...)	Positive lookbehind	Matches if preceded by ...
(?<!...)	Negative lookbehind	Matches if NOT preceded by ...

// Positive lookahead — match "foo" only when followed by "bar"
foo(?=bar) → matches "foo" in "foobar" but not in "fooqaz"

// Negative lookahead — match digits not followed by "px"
\d+(?!\d|px)  → matches "100" in "100em" but nothing in "100px"
// (plain \d+(?!px) would backtrack and match "10" in "100px")

// Positive lookbehind — match digits preceded by "$"
(?<=\$)\d+ → matches "99" in "$99" but not in "99"

// Useful: password must contain uppercase letter
^(?=.*[A-Z]).{8,}$   → 8+ chars with at least one uppercase
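
The lookaround assertions can be tried out in Python with the sketch below; the sample strings are made up. It also compares \d+(?!px) with a stricter variant, \d+(?!\d|px), because the engine can backtrack to a shorter run of digits that is no longer followed by "px".

```python
import re

# Positive lookahead: "foo" only when "bar" follows
foo_hits = re.findall(r"foo(?=bar)", "foobar fooqaz")

# Negative lookahead: the naive form backtracks from "100" to "10"
naive = re.findall(r"\d+(?!px)", "100px")
# Forbidding a following digit as well closes that loophole
strict = re.findall(r"\d+(?!\d|px)", "100em 100px")

# Positive lookbehind: digits only when "$" comes right before them
prices = re.findall(r"(?<=\$)\d+", "$99 and 99")

# Lookahead as a password rule: 8+ chars with at least one uppercase letter
rule = re.compile(r"^(?=.*[A-Z]).{8,}$")
ok = bool(rule.search("Password1"))
rejected = bool(rule.search("password1"))

print(foo_hits, naive, strict, prices, ok, rejected)
```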

6. Flags / Modifiers

Flag	Name	Effect
i	Case-insensitive	`[a-z]` also matches uppercase
g	Global	Find all matches, not just the first (JavaScript; in Python use re.findall or re.finditer)
m	Multiline	`^` and `$` match start/end of each line
s	Dotall	`.` also matches newline characters
x	Verbose (extended)	Allow whitespace and comments in pattern (PCRE/Python)
u	Unicode	Enable full Unicode matching (JavaScript)

// JavaScript
/hello/i.test('Hello World')  // true — case-insensitive
'a1b2c3'.match(/\d/g)         // ['1', '2', '3'] — global

# Python
import re
re.findall(r'[a-z]+', 'Hello World', re.IGNORECASE)  # ['Hello', 'World']
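
As a further sketch of the flag table, the example below combines the case-insensitive and multiline flags in Python, first as arguments and then as inline flags inside the pattern. The log text is a made-up input.

```python
import re

log = "error: disk full\ninfo: ok\nERROR: timeout"  # made-up log text

# Without re.MULTILINE, ^ only matches at the start of the whole string
first_only = re.findall(r"^error", log, re.IGNORECASE)

# Flags are combined with |
every_line = re.findall(r"^error", log, re.IGNORECASE | re.MULTILINE)

# Inline flags at the start of the pattern have the same effect
inline = re.findall(r"(?im)^error", log)

print(first_only, every_line, inline)
```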

# Verbose mode (Python): add comments to a complex regex
pattern = re.compile(r"""
  (?P<year>\d{4})    # 4-digit year
  -
  (?P<month>\d{2})   # 2-digit month
  -
  (?P<day>\d{2})     # 2-digit day
""", re.VERBOSE)

7. Special Characters & Escaping

The following characters have special meaning in regex. To match one of them literally outside a character class, escape it with a backslash:

. * + ? ^ $ { } [ ] | ( ) \

// To match a literal dot:
\.    → matches only the "." in "3.14", not any other character

// To match a literal backslash (write "\\\\" in an ordinary string literal, or r"\\" in a Python raw string):
\\
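
In Python, escaping by hand is often unnecessary: re.escape builds a pattern that matches a given string literally. The sketch below contrasts an unescaped dot with an escaped one and then uses re.escape on a made-up string full of metacharacters.

```python
import re

# Unescaped dot is a wildcard, so "3.14" as a pattern also accepts "3x14"
wildcard = re.fullmatch("3.14", "3x14") is not None
literal = re.fullmatch(r"3\.14", "3x14") is not None

# re.escape escapes every metacharacter in the input for us
raw_text = "price (USD) $5.00?"  # made-up text with many special characters
pattern = re.escape(raw_text)
exact = re.fullmatch(pattern, raw_text) is not None

# r"\\" is the regex for ONE literal backslash
backslashes = re.findall(r"\\", "C:\\temp\\file")

print(wildcard, literal, exact, len(backslashes))
```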

8. Practical Patterns

Copy-paste patterns for common validation tasks. Always test and adapt for your specific requirements.

Email Address (basic)

^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$

URL (http & https)

^https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&\/=]*)$

IPv4 Address

^((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)$

Date (YYYY-MM-DD)

^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$
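
Note that this pattern checks the shape of a date, not the calendar: it accepts 2023-02-31. One possible approach, sketched below in Python, is to follow the regex with a real date parse. The helper name is_real_date is hypothetical, and re.ASCII is added so that \d means [0-9] only.

```python
import re
from datetime import date

DATE_RE = re.compile(r"^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$", re.ASCII)

def is_real_date(text):
    """Return True when text looks like YYYY-MM-DD and names a real calendar day."""
    if DATE_RE.fullmatch(text) is None:
        return False
    try:
        date.fromisoformat(text)  # raises ValueError for days like February 31
    except ValueError:
        return False
    return True

print(is_real_date("2024-02-29"))  # leap day
print(is_real_date("2023-02-31"))  # right shape, impossible day
```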

Phone Number (US, flexible)

^(\+1[\-\s]?)?\(?\d{3}\)?[\-\s]?\d{3}[\-\s]?\d{4}$

Hex Color Code

^#([0-9a-fA-F]{3}|[0-9a-fA-F]{6})$

Strong Password

// Min 8 chars, at least one uppercase, one lowercase, one digit, one special char
^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$

UUID (v4, lowercase hex; add the i flag to accept uppercase)

^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$

Credit Card Number (basic format)

// Written for verbose mode (x flag); without it, remove the whitespace and # comments
^(?:4[0-9]{12}(?:[0-9]{3})?         # Visa
   |5[1-5][0-9]{14}                  # Mastercard
   |3[47][0-9]{13}                   # Amex
   |6(?:011|5[0-9]{2})[0-9]{12})$    # Discover
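
Because the pattern above is laid out with whitespace and # comments, it needs verbose mode to compile as intended. A minimal Python sketch follows; looks_like_card is a hypothetical helper, and it only checks prefix and length (it does not run a Luhn checksum).

```python
import re

CARD_RE = re.compile(r"""
    ^(?:4[0-9]{12}(?:[0-9]{3})?          # Visa
      |5[1-5][0-9]{14}                   # Mastercard
      |3[47][0-9]{13}                    # Amex
      |6(?:011|5[0-9]{2})[0-9]{12})$     # Discover
""", re.VERBOSE)

def looks_like_card(number):
    """Strip spaces and hyphens, then test the format only."""
    digits = re.sub(r"[\s-]", "", number)
    return CARD_RE.fullmatch(digits) is not None

print(looks_like_card("4111 1111 1111 1111"))  # well-known Visa test number
print(looks_like_card("1234567890123456"))     # unknown prefix
```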

Slug (URL-friendly string)

^[a-z0-9]+(?:-[a-z0-9]+)*$

HTML Tag (do not use to parse full HTML)

<([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>(.*?)<\/\1>

Warning: Do not use regex to parse HTML or XML. Use a proper DOM parser. Regex cannot correctly handle nested tags, self-closing tags, or malformed markup.
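
To make the warning concrete, the sketch below runs the tag regex on nested markup and then does the same job with Python's standard-library html.parser. TagTextCollector is a hypothetical helper written for this example, and the markup is made up.

```python
import re
from html.parser import HTMLParser

html = "<div>outer <div>inner</div> tail</div><div>second</div>"  # made-up markup

# The tag regex stops at the first closing tag, so the nested div is cut short
regex_hits = re.findall(r"<([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>(.*?)</\1>", html)

class TagTextCollector(HTMLParser):
    """Collect the text inside each top-level occurrence of one tag name."""

    def __init__(self, tag):
        super().__init__()
        self.tag = tag
        self.depth = 0      # how many matching tags are currently open
        self.chunks = []    # one text entry per top-level occurrence

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            self.depth += 1
            if self.depth == 1:
                self.chunks.append("")

    def handle_endtag(self, tag):
        if tag == self.tag and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.chunks[-1] += data

collector = TagTextCollector("div")
collector.feed(html)
collector.close()

print(regex_hits)        # first hit ends at the inner </div>
print(collector.chunks)  # nesting handled by tracking depth
```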

9. Common Mistakes & Performance Tips

1. Catastrophic Backtracking (ReDoS)

Avoid nested quantifiers on overlapping character classes: (a+)+, (\w|\w)+. These cause exponential time complexity on non-matching input, enabling denial-of-service attacks.
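
A common fix is to rewrite the pattern so that only one quantifier can consume each character. The Python sketch below checks, on deliberately short inputs, that (a+)+ and a+ accept the same strings; the sample list is made up.

```python
import re

risky = re.compile(r"^(a+)+$")  # nested quantifiers over the same character
safe = re.compile(r"^a+$")      # single quantifier, same set of accepted strings

samples = ["a", "aaaa", "", "aab", "baa"]  # kept short on purpose
risky_results = [bool(risky.search(s)) for s in samples]
safe_results = [bool(safe.search(s)) for s in samples]

print(risky_results == safe_results)

# Avoid running `risky` on a long non-matching input such as "a" * 40 + "b":
# a backtracking engine can try an exponential number of ways to split the run.
```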

2. Forgetting to Anchor

Without ^ and $, your pattern matches substrings. \d{5} matches the 5 digits inside "abc12345xyz". Use ^\d{5}$ to match only strings that are exactly 5 digits.

3. Dot Matches Newline Confusion

The . metacharacter does NOT match newline characters by default. Use the s (dotall) flag or [\s\S] to match any character including newlines.
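
A short Python sketch of the three behaviours described above, using a made-up two-line string:

```python
import re

text = "first line\nsecond line"

# Default: . stops at the newline, so the pattern cannot span both lines
default = re.search(r"first.*second", text)

# re.DOTALL (the s flag) lets . match the newline too
dotall = re.search(r"first.*second", text, re.DOTALL)

# [\s\S] matches any character and needs no flag
portable = re.search(r"first[\s\S]*second", text)

print(default, dotall is not None, portable is not None)
```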

4. Overcomplicated Patterns

A regex that is hard to read is hard to maintain. Prefer multiple simple validations over one monster expression. Use verbose mode (x flag) with comments in Python and PHP when your pattern exceeds 40 characters.

5. Not Handling Unicode

\w in JavaScript matches only ASCII word characters ([A-Za-z0-9_]), and the u flag alone does not widen it. For international text, use the u flag together with Unicode property escapes: /\p{L}+/u matches any run of Unicode letters.
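
Python behaves differently from JavaScript here, which is a common source of cross-language surprises. In Python 3, \w on str patterns is Unicode-aware by default and re.ASCII switches it back; the built-in re module has no \p{L}, so the sketch below uses [^\W\d_] as a letters-only stand-in. The sample strings are made up.

```python
import re

text = "na\u00efve caf\u00e9"  # "naïve café", written with escapes to avoid encoding issues

# Default for str patterns: \w covers Unicode letters and digits
unicode_words = re.findall(r"\w+", text)

# re.ASCII restricts \w to [a-zA-Z0-9_], splitting words at accented letters
ascii_words = re.findall(r"\w+", text, re.ASCII)

# Letters only: word characters minus digits and underscore
letters = re.findall(r"[^\W\d_]+", "caf\u00e9_42")

print(unicode_words, ascii_words, letters)
```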

Frequently Asked Questions

Are regular expressions the same in all languages?

Core syntax is largely consistent across Perl-style engines (PCRE in PHP, and the built-in engines of Perl, Python, JavaScript, Java, and .NET). Differences appear in lookbehind support, Unicode handling, possessive quantifiers, and escape sequences. Always test in the target language.

What is the difference between greedy and lazy quantifiers?

Greedy (*, +, {n,m}) match as much as possible. Lazy (*?, +?, {n,m}?) match as little as possible. Given <b>hello</b> and <b>world</b>: greedy <b>.*</b> returns one match, lazy <b>.*?</b> returns two.

What is catastrophic backtracking?

Nested quantifiers on overlapping character classes (e.g. (a+)+$) cause the engine to try exponentially many combinations on non-matching input. A malicious input can cause a ReDoS (Regex Denial of Service) attack. Always test with adversarial inputs.

What does ^ mean inside square brackets?

Inside [^...], the caret negates the class: it matches any character NOT in the set. For example, [^aeiou] matches any non-vowel character. Outside square brackets, ^ anchors the match to the start of the string (or of each line with the m flag).

Should I use regex to validate email addresses?

Regex can catch obvious format errors but cannot truly validate email. Use it for basic sanity checking, then verify deliverability with our Email Validator or by sending a confirmation email.
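
As a sketch of what "basic sanity checking" means in practice, the Python example below applies the basic email pattern from section 8. The helper name looks_like_email and the sample addresses are made up; note that the pattern still accepts some addresses a stricter check would question.

```python
import re

EMAIL_RE = re.compile(r"^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$")

def looks_like_email(address):
    """Format check only; says nothing about whether the mailbox exists."""
    return EMAIL_RE.fullmatch(address) is not None

print(looks_like_email("user@example.com"))    # plausible format
print(looks_like_email("user@@example.com"))   # two @ signs
print(looks_like_email("user@example"))        # no top-level domain
print(looks_like_email("a..b@example.com"))    # accepted despite the double dot
```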