How do you check if an email address is valid? How do you find and replace all phone numbers in a document? How can you extract hashtags from a tweet?
// Check if a string contains only digits
const isAllDigits = /^\d+$/.test('12345')
console.log(isAllDigits) // true
// Find all words starting with capital letters
const text = 'Hello World from JavaScript'
const capitalWords = text.match(/\b[A-Z][a-z]*\b/g)
console.log(capitalWords) // ["Hello", "World"]
The answer is regular expressions (often called "regex" or "regexp"). They're patterns that describe what you're looking for in text, and JavaScript has powerful built-in support for them.
<Info>
**What you'll learn in this guide:**

- Creating regex with literals (`/pattern/`) and the `RegExp` constructor
- Character classes, quantifiers, and anchors
- Key methods: `test()`, `match()`, `replace()`, `split()`
- Capturing groups for extracting parts of matches
- Flags that change how patterns match
- Common real-world patterns (email, phone, URL)
</Info>

<Warning>
**Prerequisite:** This guide assumes you're comfortable with [strings](/concepts/primitive-types) in JavaScript. You don't need any prior regex experience; we'll start from the basics.
</Warning>

A regular expression is a pattern used to match character combinations in strings. In JavaScript, regular expressions are objects that you can use with string methods to search, validate, extract, and replace text. They use a special syntax where characters like `\d`, `*`, and `^` have special meanings beyond their literal values. Regular expressions have been part of the ECMAScript standard since its third edition (ES3, 1999), and later editions have steadily expanded their capabilities, adding features like named capture groups (ES2018), lookbehind assertions (ES2018), and the `d` flag for match indices (ES2022).
// 1. Literal syntax (preferred for static patterns)
const pattern1 = /hello/
// 2. Constructor syntax (useful for dynamic patterns)
const pattern2 = new RegExp('hello')
// Both work the same way
console.log(pattern1.test('hello world')) // true
console.log(pattern2.test('hello world')) // true
Use the literal syntax when you know the pattern ahead of time. As MDN explains, a regex literal is compiled when the script loads, while a `RegExp` constructor pattern is compiled at runtime, which makes literals slightly more efficient for static patterns. Use the constructor when you need to build a pattern dynamically, for example from a search term:
function findWord(text, word) {
  // Assumes `word` contains no regex special characters; escape user input first otherwise
  const pattern = new RegExp(word, 'gi') // case-insensitive, global
  return text.match(pattern)
}
console.log(findWord('Hello hello HELLO', 'hello')) // ["Hello", "hello", "HELLO"]
Think of regex like giving a detective a description to find suspects in a crowd:
- **Literal characters** (`abc`): "Find someone named 'abc'"
- **Character classes** (`[aeiou]`): "Find someone with a vowel in their name"
- **Quantifiers** (`a+`): "Find someone with one or more 'a's in their name"
- **Anchors** (`^`, `$`): "They must be at the start/end of the line"

┌─────────────────────────────────────────────────────────────────────────┐
│ REGEX PATTERN MATCHING │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ Pattern: /\d{3}-\d{4}/ String: "Call 555-1234 today" │
│ │
│ Step 1: Find 3 digits (\d{3}) → "555" ✓ │
│ Step 2: Find a hyphen (-) → "-" ✓ │
│ Step 3: Find 4 digits (\d{4}) → "1234" ✓ │
│ │
│ Result: Match found! → "555-1234" │
│ │
└─────────────────────────────────────────────────────────────────────────┘
Character classes let you match types of characters rather than specific ones.
| Pattern | Matches | Example |
|---|---|---|
| `.` | Any character except newline | `/a.c/` matches "abc", "a1c", "a-c" |
| `\d` | Any digit `[0-9]` | `/\d{3}/` matches "123" |
| `\D` | Any non-digit | `/\D+/` matches "abc" |
| `\w` | Word character `[A-Za-z0-9_]` | `/\w+/` matches "hello_123" |
| `\W` | Non-word character | `/\W/` matches "!" or " " |
| `\s` | Whitespace (space, tab, newline) | `/\s+/` matches " " |
| `\S` | Non-whitespace | `/\S+/` matches "hello" |
| `[abc]` | Any of a, b, or c | `/[aeiou]/` matches any vowel |
| `[^abc]` | Not a, b, or c | `/[^0-9]/` matches non-digits |
| `[a-z]` | Character range | `/[A-Za-z]/` matches any letter |
// Match a phone number pattern: 3 digits, hyphen, 4 digits
const phone = /\d{3}-\d{4}/
console.log(phone.test('555-1234')) // true
console.log(phone.test('55-1234')) // false
// Match words (letters, digits, underscores)
const words = 'hello_world 123 test!'
console.log(words.match(/\w+/g)) // ["hello_world", "123", "test"]
Quantifiers specify how many times a pattern should repeat.
| Quantifier | Meaning | Example |
|---|---|---|
| `*` | 0 or more | `/ab*c/` matches "ac", "abc", "abbbbc" |
| `+` | 1 or more | `/ab+c/` matches "abc", "abbbbc" (not "ac") |
| `?` | 0 or 1 (optional) | `/colou?r/` matches "color", "colour" |
| `{n}` | Exactly n times | `/\d{4}/` matches "2024" |
| `{n,}` | n or more times | `/\d{2,}/` matches "12", "123", "1234" |
| `{n,m}` | Between n and m times | `/\d{2,4}/` matches "12", "123", "1234" |
// Match optional 's' for plural
const plural = /apple(s)?/
console.log(plural.test('apple')) // true
console.log(plural.test('apples')) // true
// Match 1 or more digits
const numbers = 'I have 42 apples and 7 oranges'
console.log(numbers.match(/\d+/g)) // ["42", "7"]
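The opening question about hashtags can be answered with the character classes and quantifiers covered so far. The following is a minimal sketch that assumes a hashtag is a `#` followed by one or more word characters (real platforms accept a wider character set); the `extractHashtags` helper and the `tweet` string are made up for illustration.

```javascript
// Hypothetical helper: a hashtag is assumed to be '#' plus one or more \w characters
function extractHashtags(text) {
  // match() with the g flag returns null when nothing matches, so fall back to []
  return text.match(/#\w+/g) || []
}

const tweet = 'Learning #JavaScript and #regex today! #100DaysOfCode'
console.log(extractHashtags(tweet)) // ["#JavaScript", "#regex", "#100DaysOfCode"]
console.log(extractHashtags('no tags here')) // []
```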
Anchors match positions in the string, not characters.
| Anchor | Position |
|---|---|
| `^` | Start of string (or line with `m` flag) |
| `$` | End of string (or line with `m` flag) |
| `\b` | Word boundary |
| `\B` | Not a word boundary |
// Must start with "Hello"
console.log(/^Hello/.test('Hello World')) // true
console.log(/^Hello/.test('Say Hello')) // false
// Must end with a digit
console.log(/\d$/.test('Room 42')) // true
console.log(/\d$/.test('42 rooms')) // false
// Word boundaries prevent partial matches
console.log(/\bcat\b/.test('cat')) // true
console.log(/\bcat\b/.test('category')) // false (cat is part of a larger word)
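Anchors matter most for validation, because `test()` succeeds as soon as the pattern matches anywhere in the string. As a small sketch (the names `loose` and `strict` are illustrative only), compare the same pattern with and without `^` and `$`:

```javascript
const loose = /\d{3}-\d{4}/    // can match anywhere in the string
const strict = /^\d{3}-\d{4}$/ // the whole string must fit the pattern

console.log(loose.test('call 555-12345 now'))  // true (finds "555-1234" inside)
console.log(strict.test('call 555-12345 now')) // false
console.log(strict.test('555-1234'))           // true
```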
JavaScript provides several methods for working with regular expressions:
| Method | Returns | Use Case |
|---|---|---|
| `regex.test(str)` | `true` or `false` | Simple validation |
| `str.match(regex)` | Array or `null` | Find matches |
| `str.matchAll(regex)` | Iterator | Find all matches with details |
| `str.search(regex)` | Index or `-1` | Find position of first match |
| `str.replace(regex, replacement)` | New string | Replace matches |
| `str.split(regex)` | Array | Split by pattern |
| `regex.exec(str)` | Match array or `null` | Detailed match info (stateful) |
const emailPattern = /\S+@\S+\.\S+/
console.log(emailPattern.test('user@example.com')) // true
console.log(emailPattern.test('invalid-email')) // false
const text = 'My numbers: 123, 456, 789'
// Without 'g' flag: returns first match with details
console.log(text.match(/\d+/))
// ["123", index: 12, input: "My numbers: 123, 456, 789"]
// With 'g' flag: returns all matches
console.log(text.match(/\d+/g))
// ["123", "456", "789"]
When you need all matches AND details (like captured groups), use `matchAll()`. It requires the `g` flag and returns an iterator:
const text = 'Call 555-1234 or 555-5678'
const pattern = /(\d{3})-(\d{4})/g
for (const match of text.matchAll(pattern)) {
  console.log(`Found: ${match[0]}, Prefix: ${match[1]}, Number: ${match[2]}`)
}
// "Found: 555-1234, Prefix: 555, Number: 1234"
// "Found: 555-5678, Prefix: 555, Number: 5678"
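Because `matchAll()` returns an iterator rather than an array, it is often convenient to spread it into an array before processing. The sketch below reuses the same phone pattern; the variable names and the shape of the result objects are assumptions chosen for illustration.

```javascript
const message = 'Call 555-1234 or 555-5678'
const phonePattern = /(\d{3})-(\d{4})/g

// Spread the iterator into an array, then keep the fields we care about
const found = [...message.matchAll(phonePattern)].map((m) => ({
  prefix: m[1],   // first captured group
  number: m[2],   // second captured group
  index: m.index, // position where this match starts
}))

console.log(found)
// [{ prefix: "555", number: "1234", index: 5 }, { prefix: "555", number: "5678", index: 17 }]
```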
const text = 'Hello World'
console.log(text.search(/World/)) // 6 (index where match starts)
console.log(text.search(/xyz/)) // -1 (not found)
// Replace first occurrence
console.log('hello world'.replace(/o/, '0'))
// "hell0 world"
// Replace all occurrences (with 'g' flag)
console.log('hello world'.replace(/o/g, '0'))
// "hell0 w0rld"
// Use captured groups in replacement
console.log('John Smith'.replace(/(\w+) (\w+)/, '$2, $1'))
// "Smith, John"
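`replace()` also accepts a function as its second argument, which helps when the replacement has to be computed from the match. A minimal sketch, with a made-up `prices` string:

```javascript
const prices = 'Coffee 3 dollars, cake 5 dollars'

// The callback receives the full match first, then each captured group
const doubled = prices.replace(/(\d+) dollars/g, (match, amount) => {
  return `${Number(amount) * 2} dollars`
})

console.log(doubled) // "Coffee 6 dollars, cake 10 dollars"
```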
// Split on one or more whitespace characters
const words = 'hello world foo'.split(/\s+/)
console.log(words) // ["hello", "world", "foo"]
// Split on commas with optional spaces
const items = 'a, b,c , d'.split(/\s*,\s*/)
console.log(items) // ["a", "b", "c", "d"]
`exec()` is similar to `match()` but is called on the regex. With the `g` flag, the regex remembers where the previous match ended, so calling it repeatedly finds the next match each time:
const pattern = /\d+/g
const text = 'a1b22c333'
console.log(pattern.exec(text)) // ["1", index: 1]
console.log(pattern.exec(text)) // ["22", index: 3]
console.log(pattern.exec(text)) // ["333", index: 6]
console.log(pattern.exec(text)) // null (no more matches)
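The same statefulness applies to `test()` when the regex has the `g` flag: the search position is stored in the regex's `lastIndex` property, which can produce surprising results when one regex object is reused. A short sketch of the behaviour (the `digits` name is illustrative):

```javascript
const digits = /\d+/g

console.log(digits.test('abc123')) // true
console.log(digits.lastIndex)      // 6 (search position saved on the regex)
console.log(digits.test('abc123')) // false (resumes from index 6, finds nothing)
console.log(digits.lastIndex)      // 0 (reset after the failed match)
console.log(digits.test('abc123')) // true again
```

For simple validation, leaving off the `g` flag avoids this.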
Flags modify how the pattern matches. Add them after the closing slash.
| Flag | Name | Effect |
|---|---|---|
| `g` | Global | Find all matches, not just the first |
| `i` | Case-insensitive | `a` matches `A` |
| `m` | Multiline | `^` and `$` match at each line's start/end |
| `s` | DotAll | `.` matches newlines too |
// Case-insensitive matching
console.log(/hello/i.test('HELLO')) // true
// Global: find all matches
console.log('abcabc'.match(/a/g)) // ["a", "a"]
console.log('abcabc'.match(/a/)) // ["a", index: 0, input: "abcabc", ...] (first match with details)
// Multiline: ^ and $ match each line
const multiline = 'line1\nline2\nline3'
console.log(multiline.match(/^line\d/gm)) // ["line1", "line2", "line3"]
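The `s` flag from the table is not exercised by the examples above. As a small sketch (the `block` string is made up), this is how it changes what `.` can match:

```javascript
const block = 'start\nmiddle\nend'

// Without s, . stops at a newline, so the pattern cannot span lines
console.log(/start.*end/.test(block))  // false

// With s, . also matches \n
console.log(/start.*end/s.test(block)) // true
```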
Parentheses `()` create capturing groups that let you extract parts of a match.
// Extract area code and number separately
const phonePattern = /\((\d{3})\) (\d{3}-\d{4})/
const match = '(555) 123-4567'.match(phonePattern)
console.log(match[0]) // "(555) 123-4567" (full match)
console.log(match[1]) // "555" (first group)
console.log(match[2]) // "123-4567" (second group)
Use `(?<name>pattern)` to give groups meaningful names. Named groups were introduced in ES2018 and are documented on MDN's groups and backreferences page:
const datePattern = /(?<month>\d{2})-(?<day>\d{2})-(?<year>\d{4})/
const match = '12-25-2024'.match(datePattern)
console.log(match.groups.month) // "12"
console.log(match.groups.day) // "25"
console.log(match.groups.year) // "2024"
Reference captured groups in the replacement string with `$1`, `$2`, etc. (or `$<name>` for named groups):
// Reformat date from MM-DD-YYYY to YYYY/MM/DD
const date = '12-25-2024'
const reformatted = date.replace(
  /(\d{2})-(\d{2})-(\d{4})/,
  '$3/$1/$2'
)
console.log(reformatted) // "2024/12/25"
By default, quantifiers are greedy: they match as much as possible. Add `?` after a quantifier to make it lazy (match as little as possible).
┌─────────────────────────────────────────────────────────────────────────┐
│ GREEDY VS LAZY │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ String: "<div>Hello</div><div>World</div>" │
│ │
│ GREEDY: /<div>.*<\/div>/ │
│ Matches: "<div>Hello</div><div>World</div>" │
│ (Everything from first <div> to LAST </div>) │
│ │
│ LAZY: /<div>.*?<\/div>/ │
│ Matches: "<div>Hello</div>" │
│ (Just the first div) │
│ │
└─────────────────────────────────────────────────────────────────────────┘
const html = '<div>Hello</div><div>World</div>'
// Greedy: matches everything between first <div> and LAST </div>
console.log(html.match(/<div>.*<\/div>/)[0])
// "<div>Hello</div><div>World</div>"
// Lazy: stops at first </div>
console.log(html.match(/<div>.*?<\/div>/)[0])
// "<div>Hello</div>"
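A negated character class is another way to keep a match from running too far. The sketch below assumes the `div` contents never contain a `<` (so no nested tags); regex is not a general HTML parser, and the `markup` string is made up.

```javascript
const markup = '<div>Hello</div><div>World</div>'

// [^<]* matches any run of characters that are not '<',
// so each match has to stop before the next tag begins
console.log(markup.match(/<div>[^<]*<\/div>/g))
// ["<div>Hello</div>", "<div>World</div>"]
```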
Here are some practical patterns you can use in your projects:
// Email (basic validation)
const email = /^[^\s@]+@[^\s@]+\.[^\s@]+$/
console.log(email.test('user@example.com')) // true
// URL
const url = /^https?:\/\/[^\s]+$/
console.log(url.test('https://example.com/path')) // true
// Phone (US format: 123-456-7890 or (123) 456-7890)
const phone = /^(\(\d{3}\)|\d{3})[-.\s]?\d{3}[-.\s]?\d{4}$/
console.log(phone.test('(555) 123-4567')) // true
console.log(phone.test('555-123-4567')) // true
// Username (letters, digits, and underscores; 3-16 chars)
const username = /^[a-zA-Z0-9_]{3,16}$/
console.log(username.test('john_doe123')) // true
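The patterns above are anchored with `^` and `$`, which suits validating a whole string. To find and replace phone numbers inside a longer document (one of the opening questions), drop the anchors and add the `g` flag. A minimal sketch, assuming only the `123-456-7890` format; the `memo` text is made up:

```javascript
// Assumed format for this sketch: 123-456-7890 only
const phoneInText = /\b\d{3}-\d{3}-\d{4}\b/g

const memo = 'Call 555-123-4567 or 555-987-6543 before noon.'
const redacted = memo.replace(phoneInText, 'XXX-XXX-XXXX')

console.log(redacted) // "Call XXX-XXX-XXXX or XXX-XXX-XXXX before noon."
```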
- **Regex = patterns for strings**: They describe what you're looking for, not literal text
- **Two ways to create**: `/pattern/` literals or `new RegExp('pattern')`
- **Character classes**: `\d` (digits), `\w` (word chars), `\s` (whitespace), `.` (any)
- **Quantifiers**: `*` (0+), `+` (1+), `?` (0-1), `{n,m}` (specific range)
- **Anchors**: `^` (start), `$` (end), `\b` (word boundary)
- **`test()` for validation**: Returns `true`/`false`
- **`match()` for extraction**: Returns matches or `null`
- **Flags change behavior**: `g` (global), `i` (case-insensitive), `m` (multiline)
- **Groups capture parts**: Use `()` to extract portions of matches
- **Greedy vs lazy**: Add `?` after quantifiers to match minimally
The literal syntax and the `RegExp` constructor both create a regex object, but they differ in when to use them:
- **Literal `/pattern/`** — Use for static patterns known at write time. The pattern is compiled when the script loads.
- **`new RegExp('pattern')`** — Use for dynamic patterns built at runtime (e.g., from user input). Remember to escape backslashes: `new RegExp('\\d+')`.
```javascript
// Static pattern - use literal
const digits = /\d+/
// Dynamic pattern - use constructor
const searchTerm = 'hello'
const dynamic = new RegExp(searchTerm, 'gi')
```
`\b` matches a **word boundary** — the position between a word character (`\w`) and a non-word character. It doesn't match any actual character; it matches a position.
```javascript
// \b prevents partial matches
console.log(/\bcat\b/.test('cat')) // true
console.log(/\bcat\b/.test('category')) // false
console.log(/\bcat\b/.test('the cat')) // true
```
Word boundaries are useful when you want to match whole words only.
Add a `?` after the quantifier to make it lazy (non-greedy):
- `*?` — Match 0 or more, as few as possible
- `+?` — Match 1 or more, as few as possible
- `??` — Match 0 or 1, preferring 0
- `{n,m}?` — Match between n and m, as few as possible
```javascript
const text = '<b>bold</b> and <b>more bold</b>'
// Greedy: matches everything between first <b> and last </b>
text.match(/<b>.*<\/b>/)[0] // "<b>bold</b> and <b>more bold</b>"
// Lazy: matches just the first <b>...</b>
text.match(/<b>.*?<\/b>/)[0] // "<b>bold</b>"
```
`match()` returns different results depending on the `g` flag:

- **Without `g`**: Returns first match with full details (captured groups, index, input)
- **With `g`**: Returns array of all matches (just the matched strings, no details)
```javascript
const text = 'cat and cat'
// Without g: detailed info about first match
text.match(/cat/)
// ["cat", index: 0, input: "cat and cat"]
// With g: all matches, no details
text.match(/cat/g)
// ["cat", "cat"]
```
Use `matchAll()` if you need both all matches AND details for each.
Use `$1`, `$2`, etc. for numbered groups, or `$<name>` for named groups:
```javascript
// Numbered groups
'John Smith'.replace(/(\w+) (\w+)/, '$2, $1')
// "Smith, John"
// Named groups
'2024-12-25'.replace(
  /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/,
  '$<month>/$<day>/$<year>'
)
// "12/25/2024"
// $& references the entire match
'hello'.replace(/\w+/, '[$&]')
// "[hello]"
```
Escape special characters with a backslash `\`. The characters that need escaping are `. * + ? ^ $ { } [ ] \ | ( )`, plus `/` in literal syntax.
```javascript
// Match a literal period
/\./.test('file.txt') // true
/\./.test('filetxt') // false
// Match a literal dollar sign
/\$\d+/.test('$100') // true
// When using RegExp constructor, double-escape
new RegExp('\\d+\\.\\d+') // matches "3.14"
```
For dynamic patterns built from user input, escape all special characters first:
```javascript
function escapeRegex(string) {
  return string.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
}
const userInput = 'hello.world'
const pattern = new RegExp(escapeRegex(userInput))
pattern.test('hello.world') // true
pattern.test('helloXworld') // false
```
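Escaping becomes essential once the search term can contain characters such as `+`. The sketch below repeats the `escapeRegex` helper so that it runs on its own; `countOccurrences` is a hypothetical helper added for illustration.

```javascript
// Same escaping helper as above, repeated so this snippet is self-contained
function escapeRegex(string) {
  return string.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
}

// Hypothetical helper: count occurrences of a user-supplied term, ignoring case
function countOccurrences(text, term) {
  const pattern = new RegExp(escapeRegex(term), 'gi')
  return (text.match(pattern) || []).length
}

console.log(countOccurrences('C++ is not c++ or C#', 'c++')) // 2
console.log(countOccurrences('1+1=2', '+'))                  // 1
```

Without the escaping step, `new RegExp('c++')` would throw a `SyntaxError`, because the second `+` has nothing to repeat.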