Regex Cheat Sheet: Common Patterns & Quick Reference

Regular expressions are one of the most powerful tools in a developer's toolkit, yet they remain intimidating for many programmers. This comprehensive cheat sheet breaks down regex patterns into digestible sections with practical examples you can use immediately.

Whether you're validating email addresses, parsing log files, or cleaning up messy data, this guide will help you write better regex patterns faster. We'll cover everything from basic character matching to advanced lookaround assertions.

Regex Basics

Regular expressions (regex or regexp) are patterns used to match character combinations in strings. They're supported in virtually every programming language (JavaScript, Python, Java, PHP, Ruby, Go, and more), in text editors like VS Code and Sublime Text, and in command-line tools like grep and sed. Syntax and feature support vary between engines, so check your engine's documentation for anything beyond the basics.

At their core, regex patterns consist of two types of characters: literal characters that match themselves exactly, and metacharacters that have special meanings and define matching rules.

The simplest regex is a literal string. The pattern hello matches the text "hello" exactly wherever it appears. But the real power comes from metacharacters that add flexibility—like matching any digit, repeating patterns, or anchoring to specific positions.

Pro tip: Use our Regex Tester to experiment with patterns in real time. You'll see matches highlighted instantly as you type, making it much easier to understand how patterns work.

Literal Characters vs Metacharacters

Most characters in a regex pattern are literal: they match themselves. The pattern cat matches the letters c, a, and t in that exact sequence. However, certain characters have special meanings: . ^ $ * + ? { } [ ] \ | ( ). To match one of them literally, escape it with a backslash.

For example, example\.com matches "example.com" literally, while example.com would also match "exampleXcom" because the unescaped dot matches any character.
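A minimal Python sketch of the escaped versus unescaped dot, using the standard re module. The sample strings are made up for illustration, and re.escape is shown as one convenient way to escape a literal string.

```python
import re

escaped = re.compile(r"example\.com")    # the dot matches only a literal "."
unescaped = re.compile(r"example.com")   # the dot matches any character except newline

print(bool(escaped.search("exampleXcom")))    # False
print(bool(unescaped.search("exampleXcom")))  # True

# re.escape() backslash-escapes the metacharacters in a literal string.
literal = re.escape("1+1=2?")
print(bool(re.search(literal, "is 1+1=2? yes")))  # True
```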

Character Classes

Character classes let you match one character from a set of possibilities. They're the foundation of flexible pattern matching and come in two forms: predefined shorthand classes and custom bracket expressions.

Pattern   Matches                           Example
.         Any character except newline      h.t → hat, hot, hit, h@t
\d        Any digit [0-9]                   \d{3} → 123, 456, 789
\D        Any non-digit                     \D+ → abc, xyz, @#$
\w        Word character [a-zA-Z0-9_]       \w+ → hello_world, var123
\W        Non-word character                \W → @, #, space, punctuation
\s        Whitespace (space, tab, newline)  \s+ → any whitespace sequence
\S        Non-whitespace                    \S+ → any visible characters
[abc]     Any of a, b, or c                 [aeiou] → any vowel
[^abc]    Not a, b, or c                    [^0-9] → any non-digit
[a-z]     Range: a through z                [A-Za-z] → any letter
[a-z0-9]  Multiple ranges                   [a-fA-F0-9] → hex digits
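As a rough illustration of the shorthand classes, the Python sketch below runs three of them over a made-up sentence. One caveat: in Python 3, \d, \w, and \s are Unicode-aware on str patterns unless re.ASCII is passed, so they can match more than the ASCII ranges listed in the table.

```python
import re

text = "Order 66 shipped to bay_7 at 09:30"

digits = re.findall(r"\d+", text)      # runs of digits
words = re.findall(r"\w+", text)       # runs of word characters
tokens = re.findall(r"\S+", text)      # runs of non-whitespace

print(digits)  # ['66', '7', '09', '30']
print(words)   # ['Order', '66', 'shipped', 'to', 'bay_7', 'at', '09', '30']
print(tokens)  # ['Order', '66', 'shipped', 'to', 'bay_7', 'at', '09:30']
```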

Custom Character Classes

Bracket expressions [] let you define your own character sets. Inside brackets, most metacharacters lose their special meaning, so . * + ? ( ) need no escaping. The exceptions are ] and \, plus ^ when it comes first and - when it sits between two characters.

Quick tip: The order of characters in a character class doesn't matter. [abc] and [bca] are identical. A class always matches exactly one character: any single character from the set.

Practical Examples

Real-world uses of character classes include [a-fA-F0-9] for a hex digit, [aeiou] for a vowel, and [^0-9] for any character that is not a digit.
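The sketch below shows a few such uses in Python. The is_hex_color helper and the sample inputs are hypothetical, chosen only to exercise the classes from the table above.

```python
import re

# Hypothetical validator: "#" followed by exactly six hex digits.
HEX_COLOR = re.compile(r"#[a-fA-F0-9]{6}")

def is_hex_color(value):
    return HEX_COLOR.fullmatch(value) is not None

# Every vowel, in order of appearance.
vowels = re.findall(r"[aeiou]", "regular expression")

# A negated class strips everything that is not a digit.
digits_only = re.sub(r"[^0-9]", "", "(555) 123-4567")

print(is_hex_color("#1a2B3c"), is_hex_color("#12345g"))  # True False
print(digits_only)                                       # 5551234567
```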

Quantifiers and Repetition

Quantifiers specify how many times a pattern should repeat. They're placed after the element you want to repeat—a character, character class, or group.

Quantifier  Meaning                         Example
*           0 or more times                 ab*c → ac, abc, abbc, abbbc
+           1 or more times                 ab+c → abc, abbc (not ac)
?           0 or 1 time (optional)          colou?r → color, colour
{n}         Exactly n times                 \d{4} → 2026, 1999
{n,}        n or more times                 \w{3,} → words with 3+ chars
{n,m}       Between n and m times           \d{2,4} → 12, 123, 1234
*?          Lazy/minimal match (0 or more)  <.*?> → first tag only
+?          Lazy/minimal match (1 or more)  ".+?" → first quoted string
??          Lazy/minimal match (0 or 1)     \d?? → matches 0 digits if possible

Greedy vs Lazy Matching

This is one of the most important concepts in regex. By default, quantifiers are greedy—they match as much text as possible while still allowing the overall pattern to match.

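Consider the HTML string <b>bold</b> and <i>italic</i>: the greedy pattern <.*> matches the whole string, from the first < to the last >, while the lazy pattern <.*?> matches just <b>.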

Adding ? after a quantifier makes it lazy (also called non-greedy or minimal). It matches as little text as possible while still allowing the pattern to succeed.
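The same comparison can be sketched in Python with the standard re module, reusing the HTML string from above.

```python
import re

html = "<b>bold</b> and <i>italic</i>"

greedy = re.search(r"<.*>", html).group()   # runs to the last ">"
lazy = re.search(r"<.*?>", html).group()    # stops at the first ">"
all_tags = re.findall(r"<.*?>", html)       # each tag separately

print(greedy)    # <b>bold</b> and <i>italic</i>
print(lazy)      # <b>
print(all_tags)  # ['<b>', '</b>', '<i>', '</i>']
```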

Pro tip: When extracting content between delimiters (quotes, tags, brackets), almost always use lazy quantifiers. The pattern ".*?" correctly extracts individual quoted strings, while ".*" would match from the first quote to the last quote in the entire text.

Common Quantifier Patterns

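Patterns you'll use constantly include \d+ (one or more digits), \s* (optional whitespace), \w{3,} (words of three or more characters), and colou?r (an optional letter).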
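A short Python sketch of several quantifiers from the table, run against made-up input strings.

```python
import re

optional_u = re.findall(r"colou?r", "color colour colr")
years = re.findall(r"\b\d{4}\b", "1999, 2026, 123, 12345")
zero_or_more = re.findall(r"ab*c", "ac abc abbbc adc")
one_or_more = re.findall(r"ab+c", "ac abc abbbc")
long_words = re.findall(r"\b\w{3,}\b", "a an the word")

print(optional_u)    # ['color', 'colour']
print(years)         # ['1999', '2026']
print(zero_or_more)  # ['ac', 'abc', 'abbbc']
print(one_or_more)   # ['abc', 'abbbc']
print(long_words)    # ['the', 'word']
```

The \b anchors around \d{4} matter here: without them, the pattern would also pick "1234" out of "12345".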

Anchors and Boundaries

Anchors don't match characters—they match positions in the string. They're essential for ensuring patterns match at specific locations rather than anywhere in the text.

Anchor  Position                               Example
^       Start of string (or line with m flag)  ^Hello → matches "Hello world" but not "Say Hello"
$       End of string (or line with m flag)    end$ → matches "The end" but not "end of story"
\b      Word boundary                          \bcat\b → matches "cat" but not "category"
\B      Not a word boundary                    \Bcat\B → matches "concatenate" but not "cat"
\A      Start of string (never line)           Like ^ but ignores multiline mode
\Z      End of string (never line)             Like $ but ignores multiline mode

Note: \A and \Z are not supported in JavaScript. In PCRE, Java, and .NET, \Z also matches just before a final newline, and \z matches only at the very end of the string; in Python, \Z matches only at the very end.

Word Boundaries Explained

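The word boundary \b is useful but often misunderstood. It matches the position between a word character (\w) and a non-word character (\W), or between a word character and the start or end of the string.

Consider the pattern \bcat\b applied to different strings: it matches "cat" in "the cat sat" and in "cat!", but finds nothing in "category", "concatenate", or "bobcat".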

This makes \b perfect for finding whole words without accidentally matching parts of larger words.
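A small Python sketch of whole-word matching; the sample strings are illustrative only.

```python
import re

whole_word = re.compile(r"\bcat\b")

samples = ["the cat sat", "cat!", "category", "concatenate", "bobcat"]
results = {s: bool(whole_word.search(s)) for s in samples}
# True for "the cat sat" and "cat!", False for the other three.

# \B is the inverse: "cat" must have word characters on both sides.
inside_word = bool(re.search(r"\Bcat\B", "concatenate"))  # True
alone = bool(re.search(r"\Bcat\B", "cat"))                # False
```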

Start and End Anchors

The ^ and $ anchors are essential for validation. When you want to ensure an entire string matches a pattern (not just contains it), wrap your pattern with these anchors.

Quick tip: When validating user input (email, phone, username), always use ^ and $ to anchor your pattern. Without them, the pattern \d{3} would accept "abc123def" when you probably want to reject anything that isn't exactly 3 digits.
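The effect of anchoring can be sketched in Python as follows. The is_three_digits helper is hypothetical; it uses re.fullmatch, which requires the whole string to match. One Python-specific caveat is that $ also matches just before a trailing newline, which fullmatch avoids.

```python
import re

def is_three_digits(value):
    # fullmatch() succeeds only if the entire string matches the pattern.
    return re.fullmatch(r"\d{3}", value) is not None

unanchored = bool(re.search(r"\d{3}", "abc123def"))   # True: found inside the text
anchored = bool(re.search(r"^\d{3}$", "abc123def"))   # False: whole string must match

print(is_three_digits("123"))    # True
print(is_three_digits("1234"))   # False
print(is_three_digits("123\n"))  # False, although ^\d{3}$ would accept it in Python
```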

Groups and Capturing

Parentheses () serve two purposes in regex: they group parts of a pattern together, and they capture the matched text for later use. This is where regex becomes truly powerful for extraction and transformation.

Syntax        Purpose                   Example
(abc)         Capturing group           (\d{3})-(\d{4}) captures both halves of 555-1234
(?:abc)       Non-capturing group       (?:https?://)?example\.com groups without capturing
(a|b)         Alternation (OR)          (cat|dog) matches either "cat" or "dog"
\1            Backreference to group 1  \b(\w+)\s+\1\b matches repeated words like "the the"
(?<name>abc)  Named capturing group     (?<year>\d{4})-(?<month>\d{2}) for dates

Note: Python's re module writes named groups as (?P<name>abc).
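As a quick illustration of named groups, here is the date pattern from the table in Python. Python's re module spells the syntax (?P<name>...) rather than (?<name>...); the sample text is made up.

```python
import re

# Python syntax for a named group is (?P<name>...).
DATE = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})")

m = DATE.search("Released 2026-03")
print(m.group("year"), m.group("month"))  # 2026 03
print(m.groupdict())                      # {'year': '2026', 'month': '03'}
```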

Capturing Groups

When you wrap part of a pattern in parentheses, the regex engine captures the matched text. You can then reference these captures in your code or even within the regex itself using backreferences.

For example, the pattern (\d{3})-(\d{3})-(\d{4}) applied to "555-123-4567" creates three captures: group 1 is "555", group 2 is "123", and group 3 is "4567".

In most programming languages, you can access these captures through match objects or replacement strings. This lets you reformat data easily—turning "555-123-4567" into "(555) 123-4567" with a replacement like ($1) $2-$3.
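A minimal Python sketch of that reformatting step. Note that Python writes group references in replacement strings as \1, \2, \3, where JavaScript and many editors use $1, $2, $3.

```python
import re

PHONE = re.compile(r"(\d{3})-(\d{3})-(\d{4})")

m = PHONE.search("555-123-4567")
print(m.groups())  # ('555', '123', '4567')

# \1, \2, \3 in the replacement refer to the three captured groups.
formatted = PHONE.sub(r"(\1) \2-\3", "555-123-4567")
print(formatted)   # (555) 123-4567
```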

Non-Capturing Groups

Sometimes you need grouping for quantifiers or alternation but don't need to capture the text. Use (?:...) to keep capture numbering clean and to avoid the small overhead of storing a capture you won't use.

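Compare (https?://)?([\w-]+)\.com with (?:https?://)?([\w-]+)\.com. In the first pattern the optional protocol occupies group 1, so the domain name lands in group 2; in the second, the protocol group is non-capturing and the domain name is group 1.

The second pattern makes your captures easier to work with since you don't have to skip over groups you don't care about.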
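To make the numbering difference concrete, here is a small Python sketch. Both patterns and the sample URL are illustrative assumptions, not taken from a particular codebase.

```python
import re

url = "https://example.com"

capturing = re.search(r"(https?://)?([\w-]+)\.com", url)
non_capturing = re.search(r"(?:https?://)?([\w-]+)\.com", url)

print(capturing.groups())      # ('https://', 'example')
print(non_capturing.groups())  # ('example',)
```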

Alternation and OR Logic

The pipe | inside a group creates an OR condition. The pattern (cat|dog|bird) matches any of those three words.

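Important: in backtracking engines (JavaScript, Python, Java, PCRE), alternatives are tried left to right, and the first one that lets the overall match succeed wins. Searching "category" with (cat|category) returns only "cat", because the first alternative already succeeds. Put longer alternatives first, as in (category|cat), or anchor the pattern so the shorter alternative cannot succeed on its own.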
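A short Python sketch of how alternative order affects the result; Python's re module is a backtracking engine, so it tries alternatives left to right.

```python
import re

first_wins = re.search(r"cat|category", "category").group()
longer_first = re.search(r"category|cat", "category").group()

# With anchors, "cat" alone cannot satisfy $, so the engine
# backtracks and tries the second alternative.
anchored = re.search(r"^(?:cat|category)$", "category").group()

print(first_wins)    # cat
print(longer_first)  # category
print(anchored)      # category
```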

Pro tip: Use our Text Replacer tool to test regex replacements with capturing groups. You can see exactly how your captures are being used in the replacement string.

Backreferences

Backreferences let you match the same text that was captured earlier in the pattern. This is perfect for finding repeated words, matching paired delimiters, or validating consistent formatting.
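Two of those uses, sketched in Python with made-up input: finding a doubled word, and matching a quoted string whose closing quote must be the same character as its opening quote.

```python
import re

# \1 must repeat exactly what group 1 captured.
doubled = re.compile(r"\b(\w+)\s+\1\b")
repeat = doubled.search("Paris in the the spring").group()

# Paired delimiters: \1 is whichever quote character opened the string.
quoted = re.compile(r"""(["'])(.*?)\1""")
contents = [m.group(2) for m in quoted.finditer("""say "hi" or 'it"s'""")]

print(repeat)    # the the
print(contents)  # ['hi', 'it"s']
```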

Lookahead and Lookbehind

Lookaround assertions are zero-width—they match a position without consuming characters. They let you check what comes before or after a position without including it in the match.

Syntax Type Meaning
(?=...) Positive lookahead Matches if followed by pattern
(?!...) Negative lookahead Matches if NOT followed by pattern
(?<=...) Positive lookbehind Matches if preceded by pattern
(?<!...) Negative lookbehind Matches if NOT preceded by pattern

Lookahead Examples

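Positive lookahead (?=...) checks that a pattern follows the current position: \d+(?=px) matches "10" in "10px" but nothing in "10em".

Negative lookahead (?!...) checks that a pattern does NOT follow: q(?!u) matches the "q" in "Iraq" but not the "q" in "quit".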
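A minimal Python sketch of both lookaheads on a made-up CSS-like string. The negative case also excludes a following digit; otherwise the engine could backtrack and match just the "1" of "10px".

```python
import re

css = "10px 20em 30px"

# Positive lookahead: digits followed by "px" (the unit is not consumed).
px_values = re.findall(r"\d+(?=px)", css)

# Negative lookahead: digits NOT followed by "px" or by another digit.
not_px = re.findall(r"\d+(?!\d|px)", css)

print(px_values)  # ['10', '30']
print(not_px)     # ['20']
```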

Lookbehind Examples

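Positive lookbehind (?<=...) checks that a pattern precedes the current position: (?<=\$)\d+ matches "42" in "$42" but nothing in a bare "42".

Negative lookbehind (?<!...) checks that a pattern does NOT precede: (?<!\$)\b\d+ matches "42" in "42 items" but not in "$42".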
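The following Python sketch applies both lookbehinds to a made-up price string. Lookbehind support varies by engine; Python's re module, for instance, requires the lookbehind pattern to have a fixed width.

```python
import re

text = "Price: $42, discount 15, total $27"

# Positive lookbehind: digits preceded by "$".
prices = re.findall(r"(?<=\$)\d+", text)

# Negative lookbehind: digits NOT preceded by "$" or by another digit.
plain_numbers = re.findall(r"(?<![$\d])\d+", text)

print(prices)         # ['42', '27']
print(plain_numbers)  # ['15']
```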

Quick tip: Lookaround assertions are powerful but can be confusing. Remember: they check conditions without moving the match position forward. Think of them as "peek ahead" or "peek behind" operations.

Password Validation with Lookahead

One of the most practical uses of lookahead is password validation. You can check multiple requirements without complex logic:

^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$

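This pattern ensures the password contains at least one lowercase letter, at least one uppercase letter, at least one digit, and at least one special character from @$!%*?&, and that it is at least 8 characters long and uses only those character types.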
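A sketch of the pattern in use from Python. The is_valid_password helper and the sample passwords are hypothetical; fullmatch is used so that a trailing newline is not silently accepted by $.

```python
import re

PASSWORD = re.compile(
    r"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$"
)

def is_valid_password(candidate):
    # Each lookahead checks one requirement from the start of the string;
    # the final character class enforces the allowed characters and length.
    return PASSWORD.fullmatch(candidate) is not None

print(is_valid_password("Passw0rd!"))   # True
print(is_valid_password("password1!"))  # False: no uppercase letter
print(is_valid_password("Pass1!"))      # False: shorter than 8 characters
print(is_valid_password("Passw0rd!#"))  # False: "#" is not an allowed character
```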

Regex Flags and Modifiers

Flags modify how the regex engine interprets your pattern. They're typically added after the closing delimiter in languages like JavaScript (/pattern/flags) or as parameters in function calls.

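Flag  Name              Effect
i     Case-insensitive  Makes pattern match regardless of case
g     Global            Finds all matches, not just the first
m     Multiline         Makes ^ and $ match line boundaries
s     Dotall            Makes . match newlines too
u     Unicode           Enables full Unicode support
x     Extended          Allows whitespace and comments in pattern

Note: not every engine offers every flag. The g flag belongs to engines such as JavaScript and Perl, and JavaScript has no x flag; other languages expose the same behavior through functions or options (for example, re.findall and re.VERBOSE in Python).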

Case-Insensitive Flag (i)

The i flag makes your pattern match regardless of letter case. Without it, hello only matches "hello" exactly. With it, the pattern matches "hello", "Hello", "HELLO", "HeLLo", etc.

This is essential for user input validation where you don't want to force specific capitalization. For example, /^yes$/i accepts "yes", "Yes", "YES", or any other case variation.
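In Python the same flag is passed as re.IGNORECASE (or re.I). A minimal sketch with made-up answers:

```python
import re

YES = re.compile(r"^yes$", re.IGNORECASE)

answers = ["yes", "Yes", "YES", "yEs", "yes please"]
accepted = [a for a in answers if YES.search(a)]

print(accepted)  # ['yes', 'Yes', 'YES', 'yEs']
```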

Global Flag (g)

By default, regex engines stop after finding the first match. The g flag tells the engine to find all matches in the string.

This is crucial for operations like "find and replace all" or counting occurrences. In JavaScript, string.match(/\d+/g) returns an array of all number sequences, while without g it returns only the first match.
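Python has no g flag; the choice between "first match" and "all matches" is made by picking a function instead. A rough equivalent, with an illustrative input string:

```python
import re

text = "3 apples, 12 pears, 7 plums"

first = re.search(r"\d+", text).group()             # first match only
every = re.findall(r"\d+", text)                    # all matches, like the g flag
replaced_once = re.sub(r"\d+", "N", text, count=1)  # replace the first match
replaced_all = re.sub(r"\d+", "N", text)            # replace every match

print(first)          # 3
print(every)          # ['3', '12', '7']
print(replaced_once)  # N apples, 12 pears, 7 plums
print(replaced_all)   # N apples, N pears, N plums
```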

Multiline Flag (m)

Normally, ^ and $ match the start and end of the entire string. With the m flag, they match the start and end of each line within the string.

This is useful for processing multi-line text like log files or CSV data. The pattern /^ERROR/m matches "ERROR" at the start of any line, not just the first line of the file.
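A small Python sketch of the difference, using re.MULTILINE and a made-up four-line log:

```python
import re

log = "INFO start\nERROR disk full\nINFO retry\nERROR timeout"

# Without the flag, ^ matches only at the very start of the string.
without_m = re.findall(r"^ERROR.*$", log)

# With re.MULTILINE, ^ and $ match at the start and end of every line.
with_m = re.findall(r"^ERROR.*$", log, re.MULTILINE)

print(without_m)  # []
print(with_m)     # ['ERROR disk full', 'ERROR timeout']
```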

Dotall Flag (s)

By default, the dot . matches any character except newlines. The s flag (also called "single-line" mode, confusingly) makes the dot match newlines too.

This is helpful when matching content that spans multiple lines, like HTML tags or multi-line comments: <div>.*?</div> with the s flag matches div elements even if they contain line breaks.
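In Python the flag is re.DOTALL (or re.S). The sketch below uses a made-up multi-line div; note that this approach does not handle nested div elements.

```python
import re

html = "<div>\n  line one\n  line two\n</div>"

without_s = re.search(r"<div>.*?</div>", html)             # . stops at newlines
with_s = re.search(r"<div>.*?</div>", html, re.DOTALL)     # . matches newlines too

print(without_s)               # None
print(with_s.group() == html)  # True
```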

Common Patterns Library

Here's a collection of widely used regex patterns for common validation and extraction tasks. They trade strict correctness for practicality, so test them against your own data before relying on them.

Email Addresses

^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$

This pattern handles most valid email addresses. It requires a local part (before @), a domain name, and a TLD. Note that fully RFC-compliant email validation is extremely complex. This pattern is a pragmatic approximation: it accepts most addresses seen in practice, but it also accepts some invalid ones, such as a..b@example.com.
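A minimal Python sketch of the email pattern in use. The looks_like_email helper and the sample addresses are hypothetical; the last call shows one known false positive.

```python
import re

EMAIL = re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")

def looks_like_email(value):
    return EMAIL.fullmatch(value) is not None

print(looks_like_email("user.name+tag@example.co.uk"))  # True
print(looks_like_email("user@localhost"))               # False: no dot plus TLD
print(looks_like_email("user@@example.com"))            # False
print(looks_like_email("a..b@example.com"))             # True, though not a valid address
```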

URLs

^https?://[^\s/$.?#].[^\s]*$

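Matches HTTP and HTTPS URLs. To make the protocol optional, use the following pattern; note that it accepts only one domain label before the TLD (plus an optional www.), so it rejects hosts like blog.example.com: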

^(?:https?://)?(?:www\.)?[a-zA-Z0-9-]+\.[a-zA-Z]{2,}(?:/[^\s]*)?$
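Both URL patterns can be exercised from Python as sketched below. The names STRICT and OPTIONAL_PROTOCOL and the sample URLs are illustrative choices.

```python
import re

STRICT = re.compile(r"^https?://[^\s/$.?#].[^\s]*$")
OPTIONAL_PROTOCOL = re.compile(
    r"^(?:https?://)?(?:www\.)?[a-zA-Z0-9-]+\.[a-zA-Z]{2,}(?:/[^\s]*)?$"
)

def matches(pattern, value):
    return pattern.fullmatch(value) is not None

print(matches(STRICT, "https://example.com/path?q=1"))       # True
print(matches(STRICT, "ftp://example.com"))                  # False
print(matches(OPTIONAL_PROTOCOL, "www.example.com/docs"))    # True
print(matches(OPTIONAL_PROTOCOL, "example.com"))             # True
print(matches(OPTIONAL_PROTOCOL, "blog.example.com"))        # False: extra subdomain
```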

Phone Numbers

^\+?1?\s*\(?(\d{3})\)?[-.\s]?(\d{3})[-.\s]?(\d{4})$
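As an illustration, the phone pattern's three capture groups can be used to normalize input. The normalize_phone helper and its output format are assumptions made for this sketch.

```python
import re

PHONE = re.compile(r"^\+?1?\s*\(?(\d{3})\)?[-.\s]?(\d{3})[-.\s]?(\d{4})$")

def normalize_phone(raw):
    """Return "(AAA) PPP-LLLL" for a recognized number, else None."""
    m = PHONE.fullmatch(raw.strip())
    if m is None:
        return None
    area, prefix, line = m.groups()
    return f"({area}) {prefix}-{line}"

print(normalize_phone("(555) 123-4567"))   # (555) 123-4567
print(normalize_phone("+1 555.123.4567"))  # (555) 123-4567
print(normalize_phone("5551234567"))       # (555) 123-4567
print(normalize_phone("555-1234"))         # None
```

Because each parenthesis is optional on its own, the pattern also accepts unbalanced input such as "(555 123-4567".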