Regex Cheat Sheet: Common Patterns & Quick Reference

March 31, 2026 · 12 min read

Table of Contents

Regex Basics
Character Classes
Quantifiers and Repetition
Anchors and Boundaries
Groups and Capturing
Lookahead and Lookbehind
Regex Flags and Modifiers
Common Patterns Library
Regex in Different Languages
Best Practices and Performance
Frequently Asked Questions
Related Articles

Regular expressions are one of the most powerful tools in a developer's toolkit, yet they remain intimidating for many programmers. This comprehensive cheat sheet breaks down regex patterns into digestible sections with practical examples you can use immediately.

Whether you're validating email addresses, parsing log files, or cleaning up messy data, this guide will help you write better regex patterns faster. We'll cover everything from basic character matching to advanced lookaround assertions.

Regex Basics

Regular expressions (regex or regexp) are patterns used to match character combinations in strings. They're supported in virtually every programming language (JavaScript, Python, Java, PHP, Ruby, Go, and more), in text editors like VS Code and Sublime Text, and in command-line tools like grep and sed.

At their core, regex patterns consist of two types of characters: literal characters that match themselves exactly, and metacharacters that have special meanings and define matching rules.

The simplest regex is a literal string. The pattern hello matches the text "hello" exactly wherever it appears. But the real power comes from metacharacters that add flexibility—like matching any digit, repeating patterns, or anchoring to specific positions.

Pro tip: Use our Regex Tester to experiment with patterns in real time. You'll see matches highlighted instantly as you type, making it much easier to understand how patterns work.

Literal Characters vs Metacharacters

Most characters in a regex pattern are literal—they match themselves. The pattern cat matches the letters c, a, and t in that exact sequence. However, certain characters have special meanings:

. ^ $ * + ? { } [ ] \ | ( ) are metacharacters
To match these literally, escape them with a backslash: \. matches a period
Inside character classes [], most metacharacters lose their special meaning

For example, example\.com matches "example.com" literally, while example.com would match "exampleXcom" because the unescaped dot matches any character.
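The escaping rule can be checked with a minimal Python sketch using the standard `re` module. The sample strings are illustrative, and `re.escape` is shown as one way to escape arbitrary literal text rather than the only approach:

```python
import re

# Unescaped dot: matches any character except a newline.
loose = re.compile(r"example.com")
# Escaped dot: matches only a literal period.
strict = re.compile(r"example\.com")

loose_hit = loose.search("exampleXcom") is not None
strict_hit = strict.search("exampleXcom") is not None
print(loose_hit, strict_hit)  # True False

# re.escape backslash-escapes metacharacters in arbitrary literal text.
escaped = re.escape("example.com")
print(escaped)  # example\.com
```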

Character Classes

Character classes let you match one character from a set of possibilities. They're the foundation of flexible pattern matching and come in two forms: predefined shorthand classes and custom bracket expressions.

Pattern	Matches	Example
`.`	Any character except newline	`h.t` → hat, hot, hit, h@t
`\d`	Any digit [0-9]	`\d{3}` → 123, 456, 789
`\D`	Any non-digit	`\D+` → abc, xyz, @#$
`\w`	Word character [a-zA-Z0-9_]	`\w+` → hello_world, var123
`\W`	Non-word character	`\W` → @, #, space, punctuation
`\s`	Whitespace (space, tab, newline)	`\s+` → any whitespace sequence
`\S`	Non-whitespace	`\S+` → any visible characters
`[abc]`	Any of a, b, or c	`[aeiou]` → any vowel
`[^abc]`	Not a, b, or c	`[^0-9]` → any non-digit
`[a-z]`	Range: a through z	`[A-Za-z]` → any letter
`[a-z0-9]`	Multiple ranges	`[a-fA-F0-9]` → hex digits

Custom Character Classes

Bracket expressions [] let you define your own character sets. Inside brackets, most metacharacters lose their special meaning, so characters like . * + and ? need no escaping. The exceptions are ], \, a leading ^, and a - placed between two characters.

[aeiou] matches any single vowel
[0-9] matches any digit (equivalent to \d)
[a-zA-Z] matches any letter, uppercase or lowercase
[^0-9] matches anything except digits (the ^ negates the class)
[a-z-] matches lowercase letters or a hyphen (hyphen at the end is literal)

Quick tip: The order of characters in a character class doesn't matter. [abc] and [bca] are identical. A class always matches exactly one character, and that character can be any member of the set.

Practical Examples

Here are some real-world uses of character classes:

[A-Z][a-z]+ matches capitalized words like "Hello" or "World"
\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3} matches IP addresses (basic pattern)
[a-fA-F0-9]{6} matches hex color codes like "FF5733"
[^\s]+ matches any sequence of non-whitespace (a "word" in the broadest sense)
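As a rough illustration, the patterns above can be run against a made-up sample string with Python's `re.findall`. The sample text is an assumption chosen for the demo:

```python
import re

text = "Hello World: color FF5733 on host 192.168.1.1"

capitalized = re.findall(r"[A-Z][a-z]+", text)
hex_codes = re.findall(r"[a-fA-F0-9]{6}", text)
ips = re.findall(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}", text)
tokens = re.findall(r"[^\s]+", text)

print(capitalized)  # ['Hello', 'World']
print(hex_codes)    # ['FF5733']
print(ips)          # ['192.168.1.1']
print(tokens)

# The basic IP pattern checks shape only, so out-of-range octets still pass.
shape_only = re.fullmatch(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}", "999.999.999.999")
```

The last line shows why the IP pattern is labeled "basic": it accepts 999.999.999.999, so range checks would have to happen outside the regex.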

Quantifiers and Repetition

Quantifiers specify how many times a pattern should repeat. They're placed after the element you want to repeat—a character, character class, or group.

Quantifier	Meaning	Example
`*`	0 or more times	`ab*c` → ac, abc, abbc, abbbc
`+`	1 or more times	`ab+c` → abc, abbc (not ac)
`?`	0 or 1 time (optional)	`colou?r` → color, colour
`{n}`	Exactly n times	`\d{4}` → 2026, 1999
`{n,}`	n or more times	`\w{3,}` → words with 3+ chars
`{n,m}`	Between n and m times	`\d{2,4}` → 12, 123, 1234
`*?`	Lazy/minimal match (0 or more)	`<.*?>` → first tag only
`+?`	Lazy/minimal match (1 or more)	`".+?"` → first quoted string
`??`	Lazy/minimal match (0 or 1)	`\d??` → matches 0 digits if possible

Greedy vs Lazy Matching

This is one of the most important concepts in regex. By default, quantifiers are greedy—they match as much text as possible while still allowing the overall pattern to match.

Consider the HTML string <b>bold</b> and <i>italic</i>:

<.*> (greedy) matches the entire string from the first < to the last >
<.*?> (lazy) matches just <b>, then </b>, then <i>, then </i> separately

Adding ? after a quantifier makes it lazy (also called non-greedy or minimal). It matches as little text as possible while still allowing the pattern to succeed.

Pro tip: When extracting content between delimiters (quotes, tags, brackets), almost always use lazy quantifiers. The pattern ".*?" correctly extracts individual quoted strings, while ".*" would match from the first quote to the last quote in the entire text.
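The difference is easy to see in a short Python sketch that applies both forms to the HTML string from this section and to an illustrative quoted sentence:

```python
import re

html = "<b>bold</b> and <i>italic</i>"

greedy_tags = re.findall(r"<.*>", html)
lazy_tags = re.findall(r"<.*?>", html)
print(greedy_tags)  # ['<b>bold</b> and <i>italic</i>']
print(lazy_tags)    # ['<b>', '</b>', '<i>', '</i>']

sentence = 'say "one" then "two"'
greedy_quotes = re.findall(r'".*"', sentence)
lazy_quotes = re.findall(r'".*?"', sentence)
print(greedy_quotes)  # ['"one" then "two"']
print(lazy_quotes)    # ['"one"', '"two"']
```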

Common Quantifier Patterns

Here are patterns you'll use constantly:

\d+ matches one or more digits (numbers like 42, 1000, 7)
\w+ matches one or more word characters (identifiers, variable names)
\s* matches optional whitespace (zero or more spaces, tabs, or newlines)
.+? matches any characters lazily (content between markers)
[a-z]{2,} matches runs of at least 2 lowercase letters
\d{3}-\d{3}-\d{4} matches phone numbers like 555-123-4567
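A small Python sketch of a few of these patterns in use. The input strings, and the key/value pattern built from \w+ and \s*, are made up for illustration:

```python
import re

numbers = re.findall(r"\d+", "42 apples, 1000 pears, 7 kiwis")
phones = re.findall(r"\d{3}-\d{3}-\d{4}", "Call 555-123-4567 or 555-987-6543")

# \s* tolerates any amount of whitespace (including none) around '='.
pairs = re.findall(r"(\w+)\s*=\s*(\w+)", "a=1, b = 2,c   =3")

print(numbers)  # ['42', '1000', '7']
print(phones)   # ['555-123-4567', '555-987-6543']
print(pairs)    # [('a', '1'), ('b', '2'), ('c', '3')]
```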

Anchors and Boundaries

Anchors don't match characters—they match positions in the string. They're essential for ensuring patterns match at specific locations rather than anywhere in the text.

Anchor	Position	Example
`^`	Start of string (or line with m flag)	`^Hello` → matches "Hello world" but not "Say Hello"
`$`	End of string (or line with m flag)	`end$` → matches "The end" but not "end of story"
`\b`	Word boundary	`\bcat\b` → matches "cat" but not "category"
`\B`	Not a word boundary	`\Bcat\B` → matches "concatenate" but not "cat"
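`\A`	Start of string (never line)	Like `^` but ignores multiline mode (not supported in JavaScript)
`\Z`	End of string (never line)	Like `$` but ignores multiline mode (not supported in JavaScript; in PCRE, Java, and .NET it also matches before a final newline)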

Word Boundaries Explained

The word boundary \b is incredibly useful but often misunderstood. It matches the position between a word character (\w) and a non-word character (\W), and also at the start or end of the string when the adjacent character is a word character.

Consider the pattern \bcat\b applied to different strings:

"the cat sat" → matches (cat is surrounded by spaces)
"category" → no match (cat is followed by word character 'e')
"concatenate" → no match (cat is preceded and followed by word characters)
"cat" → matches (cat is at start and end of string)
"cat!" → matches (cat is followed by punctuation, a non-word character)

This makes \b perfect for finding whole words without accidentally matching parts of larger words.
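The five cases listed above can be verified with a few lines of Python. This is a sketch using `re.search`; the `results` dictionary is just a convenience for the demo:

```python
import re

pattern = re.compile(r"\bcat\b")
samples = ["the cat sat", "category", "concatenate", "cat", "cat!"]

# Map each sample to whether the whole word "cat" was found in it.
results = {s: pattern.search(s) is not None for s in samples}

for sample, matched in results.items():
    print(f"{sample!r}: {matched}")
```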

Start and End Anchors

The ^ and $ anchors are essential for validation. When you want to ensure an entire string matches a pattern (not just contains it), wrap your pattern with these anchors.

^\d+$ ensures the entire string is digits (validates numeric input)
^[A-Z] ensures the string starts with an uppercase letter
[.!?]$ ensures the string ends with punctuation
^https?:// ensures a URL starts with http:// or https://

Quick tip: When validating user input (email, phone, username), always use ^ and $ to anchor your pattern. Without them, the pattern \d{3} would accept "abc123def" when you probably want to reject anything that isn't exactly 3 digits.
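A brief Python sketch of the anchoring point. One Python-specific detail worth knowing: `$` also matches just before a trailing newline, so `re.fullmatch` is the stricter choice when the input must contain nothing else:

```python
import re

unanchored = re.search(r"\d{3}", "abc123def") is not None    # finds "123" inside
anchored = re.search(r"^\d{3}$", "abc123def") is not None    # whole string must match
exact = re.search(r"^\d{3}$", "123") is not None

# In Python, $ matches at the end OR just before a final newline.
trailing_newline = re.search(r"^\d{3}$", "123\n") is not None
# fullmatch requires the pattern to cover the entire string.
strict = re.fullmatch(r"\d{3}", "123\n") is not None

print(unanchored, anchored, exact, trailing_newline, strict)
# True False True True False
```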

Groups and Capturing

Parentheses () serve two purposes in regex: they group parts of a pattern together, and they capture the matched text for later use. This is where regex becomes truly powerful for extraction and transformation.

Syntax	Purpose	Example
`(abc)`	Capturing group	`(\d{3})-(\d{4})` captures area code and number
`(?:abc)`	Non-capturing group	`(?:https?://)?example\.com` groups without capturing
`(a|b)`	Alternation (OR)	`(cat|dog)` matches either "cat" or "dog"
`\1`	Backreference to group 1	`(\w+)\s+\1` matches repeated words like "the the"
`(?<name>abc)`	Named capturing group	`(?<year>\d{4})-(?<month>\d{2})` for dates

Capturing Groups

When you wrap part of a pattern in parentheses, the regex engine captures the matched text. You can then reference these captures in your code or even within the regex itself using backreferences.

For example, the pattern (\d{3})-(\d{3})-(\d{4}) applied to "555-123-4567" creates three captures:

Group 1: "555"
Group 2: "123"
Group 3: "4567"

In most programming languages, you can access these captures through match objects or replacement strings. This lets you reformat data easily—turning "555-123-4567" into "(555) 123-4567" with a replacement like ($1) $2-$3.
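In Python, for instance, the same reformatting might look like the sketch below. Note two syntax differences that are specific to Python's `re` module: replacement strings use \1 rather than $1, and named groups are written (?P<name>...) rather than (?<name>...):

```python
import re

phone = re.compile(r"(\d{3})-(\d{3})-(\d{4})")

match = phone.search("555-123-4567")
groups = match.groups()
print(groups)  # ('555', '123', '4567')

# Reuse the three captures in the replacement string.
formatted = phone.sub(r"(\1) \2-\3", "555-123-4567")
print(formatted)  # (555) 123-4567

# Named groups, Python spelling.
date = re.search(r"(?P<year>\d{4})-(?P<month>\d{2})", "2026-03-31")
year, month = date.group("year"), date.group("month")
print(year, month)  # 2026 03
```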

Non-Capturing Groups

Sometimes you need grouping for quantifiers or alternation but don't need to capture the text. Use (?:...) for cleaner capture numbering and, in some engines, slightly less overhead.

Compare these patterns:

(https?)://([\w.]+) creates two captures: protocol and domain
(?:https?)://([\w.]+) creates one capture: just the domain

The second pattern is more efficient and makes your captures easier to work with since you don't have to skip over groups you don't care about.
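The effect on capture numbering shows up directly in a Python sketch (the URL is an illustrative value):

```python
import re

url = "https://example.com"

capturing = re.search(r"(https?)://([\w.]+)", url).groups()
non_capturing = re.search(r"(?:https?)://([\w.]+)", url).groups()

print(capturing)      # ('https', 'example.com')
print(non_capturing)  # ('example.com',)
```

With the non-capturing version, the domain is group 1 instead of group 2.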

Alternation and OR Logic

The pipe | inside a group creates an OR condition. The pattern (cat|dog|bird) matches any of those three words.

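Important: the engine tries alternatives left to right and takes the first one that lets the overall match succeed. Searched against "category", the pattern (cat|category) matches only "cat" because the first alternative succeeds. It reaches "category" only when the rest of the pattern forces backtracking, as in ^(cat|category)$. When alternatives share a prefix, put the longer one first: (category|cat).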
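A short Python sketch of how alternative order affects the result. The third line adds \b anchors as an illustration: they make the short alternative fail inside a longer word, which sends the engine on to the next alternative:

```python
import re

word = "category"

short_first = re.search(r"(cat|category)", word).group()
long_first = re.search(r"(category|cat)", word).group()
bounded = re.search(r"\b(cat|category)\b", word).group()

print(short_first)  # cat
print(long_first)   # category
print(bounded)      # category
```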

Pro tip: Use our Text Replacer tool to test regex replacements with capturing groups. You can see exactly how your captures are being used in the replacement string.

Backreferences

Backreferences let you match the same text that was captured earlier in the pattern. This is perfect for finding repeated words, matching paired delimiters, or validating consistent formatting.

(\w+)\s+\1 matches repeated words like "the the" or "hello hello"
(['"])(.*?)\1 matches quoted strings with matching quotes
<(\w+)>.*?</\1> matches HTML tags with matching open/close tags
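These three patterns can be tried in Python as sketched below. The repeated-word pattern is wrapped in \b here as a precaution, since without boundaries it can also match across word fragments (for example the "is is" inside "this is"); the sample strings are made up:

```python
import re

# Repeated words, with word boundaries added to avoid partial-word hits.
repeated = re.search(r"\b(\w+)\s+\1\b", "Paris in the the spring")
print(repeated.group(), "|", repeated.group(1))  # the the | the

# \1 must be the same quote character that opened the string.
quoted = re.findall(r"(['\"])(.*?)\1", "He said \"hi\" and 'bye'")
print(quoted)  # [('"', 'hi'), ("'", 'bye')]

# The closing tag must repeat the captured tag name.
tag = re.search(r"<(\w+)>.*?</\1>", "<b>bold</b> and <i>italic</i>")
print(tag.group())  # <b>bold</b>
mismatched = re.search(r"<(\w+)>.*?</\1>", "<b>bold</i>")
```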

Lookahead and Lookbehind

Lookaround assertions are zero-width—they match a position without consuming characters. They let you check what comes before or after a position without including it in the match.

Syntax	Type	Meaning
`(?=...)`	Positive lookahead	Matches if followed by pattern
`(?!...)`	Negative lookahead	Matches if NOT followed by pattern
`(?<=...)`	Positive lookbehind	Matches if preceded by pattern
`(?<!...)`	Negative lookbehind	Matches if NOT preceded by pattern

Lookahead Examples

Positive lookahead (?=...) checks that a pattern follows the current position:

\d+(?= dollars) matches numbers followed by " dollars" but doesn't include " dollars" in the match
^(?=.*\d)(?=.*[A-Z]).+$ validates that a string (such as a password) contains at least one digit and one uppercase letter

Negative lookahead (?!...) checks that a pattern does NOT follow:

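\d+(?! dollars) is meant to match numbers NOT followed by " dollars", but backtracking lets it match the "10" in "100 dollars"; use \b\d+\b(?! dollars) to test whole numbers only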
^(?!.*password).*$ matches strings that don't contain "password"
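A Python sketch of the lookahead patterns on an illustrative sentence. It also shows a pitfall with the plain negative lookahead: the engine can backtrack inside \d+ and report part of a number, so a version with \b anchors is included for comparison:

```python
import re

text = "100 dollars and 50 euros"

followed = re.findall(r"\d+(?= dollars)", text)
naive = re.findall(r"\d+(?! dollars)", text)
whole = re.findall(r"\b\d+\b(?! dollars)", text)

print(followed)  # ['100']
print(naive)     # ['10', '50']  ("10" is a backtracking artifact)
print(whole)     # ['50']

# Keep only the strings that do not contain "password".
candidates = ["my secret", "my password is x"]
clean = [s for s in candidates if re.fullmatch(r"^(?!.*password).*$", s)]
print(clean)  # ['my secret']
```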

Lookbehind Examples

Positive lookbehind (?<=...) checks that a pattern precedes the current position:

(?<=\$)\d+ matches numbers preceded by a dollar sign, without including the $
(?<=@)\w+ matches usernames after @ symbols

Negative lookbehind (?<!...) checks that a pattern does NOT precede:

(?<!un)happy matches "happy" but not "unhappy"
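(?<!//.*)\bfunction\b matches the "function" keyword when no // precedes it on the same line. This needs variable-length lookbehind, which JavaScript and .NET support but Python's re module rejects.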
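The sketch below runs the fixed-width lookbehind examples in Python and then probes the last pattern. Python's `re` module only accepts fixed-width lookbehind, so compiling a lookbehind that contains .* raises `re.error` there; the sample strings are illustrative:

```python
import re

prices = re.findall(r"(?<=\$)\d+", "cost $45 or 45 units")
handles = re.findall(r"(?<=@)\w+", "thanks @alice and @bob_99")
happy_starts = [m.start() for m in re.finditer(r"(?<!un)happy", "happy unhappy")]

print(prices)        # ['45']
print(handles)       # ['alice', 'bob_99']
print(happy_starts)  # [0]

# Variable-width lookbehind is rejected by Python's re module.
try:
    re.compile(r"(?<!//.*)\bfunction\b")
    variable_width_ok = True
except re.error:
    variable_width_ok = False
print(variable_width_ok)  # False
```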

Quick tip: Lookaround assertions are powerful but can be confusing. Remember: they check conditions without moving the match position forward. Think of them as "peek ahead" or "peek behind" operations.

Password Validation with Lookahead

One of the most practical uses of lookahead is password validation. You can check multiple requirements without complex logic:

^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$

This pattern ensures:

At least one lowercase letter (?=.*[a-z])
At least one uppercase letter (?=.*[A-Z])
At least one digit (?=.*\d)
At least one special character (?=.*[@$!%*?&])
Minimum 8 characters {8,}
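Wrapped in a helper, the pattern might be used as in this Python sketch. The function name `is_strong_password` and the test passwords are hypothetical, chosen for illustration:

```python
import re

PASSWORD_RE = re.compile(
    r"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$"
)

def is_strong_password(candidate: str) -> bool:
    """Return True if the candidate satisfies every rule in the pattern."""
    return PASSWORD_RE.fullmatch(candidate) is not None

print(is_strong_password("Passw0rd!"))  # True
print(is_strong_password("password"))   # False: no uppercase, digit, or special
print(is_strong_password("Passw0rd"))   # False: no special character
print(is_strong_password("Pa1!"))       # False: shorter than 8
print(is_strong_password("Passw0rd#"))  # False: '#' is outside the allowed set
```

The last case is a design consequence worth noting: the final character class also acts as an allow-list, so any character not listed there is rejected.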

Regex Flags and Modifiers

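Flags modify how the regex engine interprets your pattern. They're typically added after the closing delimiter in languages like JavaScript (/pattern/flags) or as parameters in function calls. Availability varies by engine: g exists only in delimiter-style syntaxes such as JavaScript and Perl (other languages use a find-all function instead), and JavaScript has no x flag.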

Flag	Name	Effect
`i`	Case-insensitive	Makes pattern match regardless of case
`g`	Global	Finds all matches, not just the first
`m`	Multiline	Makes ^ and $ match line boundaries
`s`	Dotall	Makes . match newlines too
`u`	Unicode	Enables full Unicode support
`x`	Extended	Allows whitespace and comments in pattern

Case-Insensitive Flag (i)

The i flag makes your pattern match regardless of letter case. Without it, hello only matches "hello" exactly. With it, the pattern matches "hello", "Hello", "HELLO", "HeLLo", etc.

This is essential for user input validation where you don't want to force specific capitalization. For example, /^yes$/i accepts "yes", "Yes", "YES", or any other case variation.

Global Flag (g)

By default, regex engines stop after finding the first match. The g flag tells the engine to find all matches in the string.

This is crucial for operations like "find and replace all" or counting occurrences. In JavaScript, string.match(/\d+/g) returns an array of all number sequences, while without g it returns only the first match.
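Python has no g flag; the first-versus-all choice is made by picking a function instead. A minimal sketch with an illustrative string:

```python
import re

text = "Order 66 shipped 3 boxes on day 12"

first = re.search(r"\d+", text).group()   # first match only
all_numbers = re.findall(r"\d+", text)    # every match

# re.sub replaces every match unless count limits it.
replaced_once = re.sub(r"\d+", "#", text, count=1)
replaced_all = re.sub(r"\d+", "#", text)

print(first)          # 66
print(all_numbers)    # ['66', '3', '12']
print(replaced_once)  # Order # shipped 3 boxes on day 12
print(replaced_all)   # Order # shipped # boxes on day #
```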

Multiline Flag (m)

Normally, ^ and $ match the start and end of the entire string. With the m flag, they match the start and end of each line within the string.

This is useful for processing multi-line text like log files or CSV data. The pattern /^ERROR/m matches "ERROR" at the start of any line, not just the first line of the file.
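In Python the equivalent flag is `re.MULTILINE`. A sketch with a made-up four-line log:

```python
import re

log = "INFO start\nERROR disk full\nINFO retry\nERROR timeout"

# Without the flag, ^ only matches at the very start of the string.
without_flag = re.findall(r"^ERROR.*$", log)
# With the flag, ^ and $ match at every line boundary.
with_flag = re.findall(r"^ERROR.*$", log, flags=re.MULTILINE)

print(without_flag)  # []
print(with_flag)     # ['ERROR disk full', 'ERROR timeout']
```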

Dotall Flag (s)

By default, the dot . matches any character except newlines. The s flag (also called "single-line" mode, confusingly) makes the dot match newlines too.

This is helpful when matching content that spans multiple lines, like HTML tags or multi-line comments: <div>.*?</div> with the s flag matches div elements even if they contain line breaks.
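In Python the flag is spelled `re.DOTALL`. A sketch using an illustrative multi-line snippet:

```python
import re

html = "<div>\n  line one\n  line two\n</div>"

# Without DOTALL the dot stops at the first newline, so nothing matches.
plain = re.search(r"<div>.*?</div>", html)
# With DOTALL the dot crosses line breaks.
dotall = re.search(r"<div>.*?</div>", html, flags=re.DOTALL)

print(plain)                   # None
print(dotall.group() == html)  # True
```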

Common Patterns Library

Here's a collection of battle-tested regex patterns for common validation and extraction tasks. These patterns balance accuracy with practicality—perfect for most use cases.

Email Addresses

^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$

This pattern handles most common email addresses. It requires a local part (before @), a domain name, and a TLD of at least two letters. Fully RFC-compliant email validation is extremely complex, so treat this as a practical filter for typical addresses rather than a complete validator.
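One possible way to wrap the pattern in Python is sketched below. The helper name `is_valid_email` and the sample addresses are hypothetical:

```python
import re

EMAIL_RE = re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")

def is_valid_email(address: str) -> bool:
    """Shape check only; it does not prove the mailbox exists."""
    return EMAIL_RE.fullmatch(address) is not None

print(is_valid_email("user@example.com"))              # True
print(is_valid_email("user.name+tag@sub.example.co"))  # True
print(is_valid_email("user@localhost"))                # False: no dot and TLD
print(is_valid_email("@example.com"))                  # False: empty local part
```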

URLs

^https?://[^\s/$.?#].[^\s]*$

Matches HTTP and HTTPS URLs. For more permissive matching (optional protocol), use:

^(?:https?://)?(?:www\.)?[a-zA-Z0-9-]+\.[a-zA-Z]{2,}(?:/[^\s]*)?$
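A quick Python sketch comparing the two URL patterns on a few illustrative inputs. The constant names are assumptions for the demo. It also surfaces one limitation of the permissive pattern: it allows only a single label before the TLD (apart from an optional www.), so deeper subdomains are rejected:

```python
import re

STRICT_URL = re.compile(r"^https?://[^\s/$.?#].[^\s]*$")
LOOSE_URL = re.compile(r"^(?:https?://)?(?:www\.)?[a-zA-Z0-9-]+\.[a-zA-Z]{2,}(?:/[^\s]*)?$")

strict_ok = STRICT_URL.match("https://example.com/path?q=1") is not None
strict_ftp = STRICT_URL.match("ftp://example.com") is not None
strict_empty = STRICT_URL.match("https://") is not None

loose_bare = LOOSE_URL.match("example.com") is not None
loose_www = LOOSE_URL.match("www.example.com/docs") is not None
loose_sub = LOOSE_URL.match("sub.example.com") is not None

print(strict_ok, strict_ftp, strict_empty)  # True False False
print(loose_bare, loose_www, loose_sub)     # True True False
```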

Phone Numbers

^\+?1?\s*\(?(\d{3})\)?[-.\s]?(\d{3})[-.\s]?(\d{4})$