RegExp (JavaScript variable data type)

The RegExp object in JavaScript is used to represent regular expressions, which are used for various string manipulation operations such as searching and replacing. The RegExp object is created by specifying a regular expression pattern.

A regular expression pattern consists of the following elements:

  • String: representing an exact match for that string.
  • Special characters: representing a set of characters, such as [a-z] for lowercase letters.
  • Pattern modifiers: such as g (global), i (case-insensitive), m (multi-line), etc.

The RegExp object is created by specifying a string containing the regular expression pattern and optionally specifying pattern modifiers as an option.

Here's an example of creating a regular expression pattern using the RegExp object:

// Create a string containing the regular expression pattern
var pattern = "test";

// Create a RegExp object with pattern modifiers
var regexp = new RegExp(pattern, "gi");

In this example, we create a string containing the regular expression pattern "test", and then create a RegExp object by specifying the pattern and the pattern modifiers gi, which enables global searching and case-insensitive searching.
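A regular expression can also be written as a literal between slashes, which is equivalent to calling the constructor with a string. The following sketch compares the two forms; the variable names are illustrative only. Note that a backslash inside a constructor string has to be doubled.

```javascript
// Two ways to build the same global, case-insensitive pattern.
const fromLiteral = /test/gi;
const fromConstructor = new RegExp("test", "gi");

console.log(fromLiteral.source === fromConstructor.source); // true
console.log(fromLiteral.flags === fromConstructor.flags);   // true

// Inside a string, a backslash must itself be escaped.
const digitsLiteral = /\d+/;
const digitsFromString = new RegExp("\\d+");

console.log(digitsLiteral.source === digitsFromString.source); // true
console.log(digitsFromString.test("abc123"));                  // true
```

The constructor form is mainly useful when the pattern is only known at run time, for example when it comes from user input.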

The RegExp object has methods for searching strings with the regular expression pattern. The test() method returns true or false depending on whether the string contains a match, while the exec() method returns details about the match that was found, or null if there is none.

Here's an example of using the test() and exec() methods of the RegExp object to search for a regular expression pattern within a string:

// Create a string containing the regular expression pattern
var pattern = "test";

// Create a RegExp object with pattern modifiers
var regexp = new RegExp(pattern, "gi");

// Specify the string to search
var string = "This is a test. This is only a test.";

// Search using the test() method
var result1 = regexp.test(string);

// Search using the exec() method
var result2 = regexp.exec(string);

// Output the results
console.log(result1);
console.log(result2);

In this example, we create a RegExp object and specify a string to search. We then call the test() method and store the result in result1, which is true. Next we call the exec() method and store the result in result2. Because the pattern has the g flag, the RegExp object remembers where the previous match ended in its lastIndex property, so exec() continues from there and returns the second "test" (at index 31) rather than the first. Finally, we output the results.
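When a RegExp object with the g flag is reused, its lastIndex property determines where the next search starts. The following sketch traces that property for the string used above; the variable names are illustrative only.

```javascript
const regexp = new RegExp("test", "gi");
const text = "This is a test. This is only a test.";

// test() finds the first "test" (index 10) and moves lastIndex past it.
const found = regexp.test(text);
const indexAfterTest = regexp.lastIndex;
console.log(found, indexAfterTest); // true 14

// exec() continues from lastIndex, so it returns the second "test".
const secondMatch = regexp.exec(text);
console.log(secondMatch.index); // 31

// Resetting lastIndex makes the next search start from the beginning again.
regexp.lastIndex = 0;
const firstMatch = regexp.exec(text);
console.log(firstMatch.index); // 10
```

If this statefulness is not wanted, one option is to omit the g flag, since a non-global, non-sticky regular expression always searches from the start of the string.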

The RegExp object can be used to search for and replace substrings that match the regular expression pattern using the replace() method of the String object.

Here's an example of using the RegExp object to replace substrings that match a regular expression pattern within a string:

// Create a string containing the regular expression pattern
var pattern = "test";

// Create a RegExp object with pattern modifiers
var regexp = new RegExp(pattern, "gi");

// Specify the string to search
var string = "This is a test. This is only a test.";

// Replace using the replace() method
var result = string.replace(regexp, "example");

// Output the result
console.log(result);

In this example, we create a RegExp object and specify a string to search for the regular expression pattern. We then use the replace() method to replace all substrings that match the pattern with the string "example". Finally, we output the result.

Special character

Character classes

Distinguishes between different types of characters, such as letters and numbers.

Any one character

. // Match any character except line breaks

Single characters

To match any one of several characters, list the candidate characters inside square brackets.

[abc] // matches a, b, or c

Ranges

To match a range of characters, use a dash - to specify the first and last characters in the range.

[a-z] // matches any lowercase letter
[A-Z] // matches any uppercase letter
[0-9] // matches any digit

Negation

To create a negated character class, use a caret ^ at the beginning of the class.

[^abc] // matches any character except a, b, or c

Combining ranges

To combine several ranges or characters, include them within the same square brackets.

[a-zA-Z] // matches any lowercase or uppercase letter
[a-zA-Z0-9] // matches any alphanumeric character
[^\s] // matches any non-whitespace character
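The character classes above can be tried out with a few short checks. This is a minimal sketch; the anchors ^ and $ and the + quantifier (described later in this article) are used so that the whole string is checked, and the variable names are illustrative only.

```javascript
const oneOf = /[abc]/.test("cab");              // contains a, b, or c
const lowerOnly = /^[a-z]+$/.test("hello");     // only lowercase letters
const mixedCase = /^[a-z]+$/.test("Hello");     // "H" is outside a-z
const negated = /[^abc]/.test("abc");           // no character other than a, b, c
const alnum = /^[a-zA-Z0-9]+$/.test("abc123");  // letters and digits only
const masked = "a b\tc".replace(/[^\s]/g, "x"); // replace every non-whitespace character

console.log(oneOf);     // true
console.log(lowerOnly); // true
console.log(mixedCase); // false
console.log(negated);   // false
console.log(alnum);     // true
console.log(masked);    // "x x\tx"
```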

Character set

  • \d: Matches a digit character. Same as [0-9].
  • \D: Matches a non-digit character. Same as [^0-9].
  • \s: Matches a whitespace character. Includes tab, newline, carriage return, and space characters.
  • \S: Matches a non-whitespace character.
  • \w: Matches a word character. Includes alphanumeric and underscore characters.
  • \W: Matches a non-word character.
  • \f: Matches a form feed character.
  • \n: Matches a newline character.
  • \r: Matches a carriage return character.
  • \t: Matches a tab character.
  • \v: Matches a vertical tab character.
  • \0: Matches a null character.
  • [\b]: Matches a backspace character.
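As a small illustration of the escapes listed above, the following sketch applies \d, \w, \s, and \W to one sample string. The sample text and variable names are made up for this example.

```javascript
const sample = "Order 42\tshipped_on 2024";

const digits = sample.match(/\d+/g);        // runs of digit characters
const words = sample.match(/\w+/g);         // runs of word characters
const parts = sample.split(/\s+/);          // split on whitespace (space or tab here)
const nonWord = sample.replace(/\W/g, "_"); // replace every non-word character

console.log(digits);  // ["42", "2024"]
console.log(words);   // ["Order", "42", "shipped_on", "2024"]
console.log(parts);   // ["Order", "42", "shipped_on", "2024"]
console.log(nonWord); // "Order_42_shipped_on_2024"
```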

Quantifiers

Quantifiers indicate how many times the preceding character or expression must occur for a match.

  • *: Matches the preceding character zero or more times. For example, the regular expression /ab*c/ matches ac, abc, abbc, and so on.
  • +: Matches the preceding character one or more times. For example, the regular expression /ab+c/ matches abc, abbc, abbbc, and so on, but not ac.
  • ?: Matches the preceding character zero or one time. For example, the regular expression /colou?r/ matches both color and colour.
  • {n}: Matches the preceding character exactly n times. For example, the regular expression /a{3}/ matches aaa.
  • {n,}: Matches the preceding character n or more times. For example, the regular expression /a{3,}/ matches aaa, aaaa, aaaaa, and so on.
  • {n,m}: Matches the preceding character at least n times and at most m times. For example, the regular expression /a{3,5}/ matches aaa, aaaa, aaaaa, but not aa or aaaaaa.
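The quantifiers above can be checked as follows. This is a sketch under one assumption: the patterns are wrapped in ^ and $ so that the whole string has to match, because an unanchored /a{3,5}/ would still find five a's inside "aaaaaa". The results object is only a convenient container for this example.

```javascript
const results = {
  starZero: /^ab*c$/.test("ac"),          // b may occur zero times
  plusZero: /^ab+c$/.test("ac"),          // b must occur at least once
  optional: /^colou?r$/.test("colour"),   // u is optional
  tooFew: /^a{3,5}$/.test("aa"),          // fewer than 3
  inRange: /^a{3,5}$/.test("aaaa"),       // between 3 and 5
  tooMany: /^a{3,5}$/.test("aaaaaa"),     // more than 5
  // A quantifier applies to the preceding item, which can be a group.
  groupTwice: /^(ab){2}$/.test("abab"),
  charTwice: /^ab{2}$/.test("abab"),      // here only b is repeated
};

console.log(results.starZero);   // true
console.log(results.plusZero);   // false
console.log(results.optional);   // true
console.log(results.tooFew);     // false
console.log(results.inRange);    // true
console.log(results.tooMany);    // false
console.log(results.groupTwice); // true
console.log(results.charTwice);  // false
```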

Assertions

Specify position and boundaries.

  • ^: Matches the beginning of a string. For example, the regular expression /^hello/ matches only the "hello" at the beginning of a string like "hello world."
  • $: Matches the end of a string. For example, the regular expression /world$/ matches only the "world" at the end of a string like "hello world."
  • \b: Matches a word boundary.
  • \B: Matches a non-word boundary.
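The difference between \b and \B is easiest to see on a word that also appears inside a longer word. The following sketch uses a made-up sentence for that purpose, together with the ^ and $ examples from the list above.

```javascript
const sentence = "cat concatenate cat";

// \b requires a word boundary on both sides, so only the standalone words match.
const wholeWords = sentence.match(/\bcat\b/g);
// \B requires the absence of a boundary, so only the "cat" inside "concatenate" matches.
const insideWord = sentence.match(/\Bcat\B/g);

const startsWithHello = /^hello/.test("hello world");
const helloNotAtStart = /^hello/.test("say hello");
const endsWithWorld = /world$/.test("hello world");

console.log(wholeWords);      // ["cat", "cat"]
console.log(insideWord);      // ["cat"]
console.log(startsWithHello); // true
console.log(helloNotAtStart); // false
console.log(endsWithWorld);   // true
```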

Pattern modifiers

JavaScript regular expressions have several option flags that can be used to modify their behavior. Here are some of the common flags:

  • g (Global): Searches for all occurrences of the regular expression. Without this flag, only the first occurrence is searched.
  • i (Ignore case): Searches case-insensitively.
  • m (Multiline): Makes ^ and $ match at the beginning and end of each line, not only at the beginning and end of the whole string.
  • s (Dot all): Makes the . metacharacter match any character, including newline characters.
  • u (Unicode): Treats the pattern and the input as sequences of Unicode code points rather than UTF-16 code units. This flag is useful for handling characters represented by surrogate pairs, and it enables escapes such as \u{...} and \p{...}.
  • y (Sticky): Matches only at the position stored in the lastIndex property of the regular expression. Without this flag, the search may find a match at any later position.
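The effect of the m, s, y, and u flags can be observed with a two-line string and a character outside the Basic Multilingual Plane. This is a minimal sketch; the sample strings and variable names are chosen only for illustration.

```javascript
const lines = "first\nsecond";

// m: ^ and $ also match at line breaks.
const withoutM = /^second$/.test(lines);
const withM = /^second$/m.test(lines);

// s: . also matches line terminators.
const withoutS = /first.second/.test(lines);
const withS = /first.second/s.test(lines);

// y: the match must start exactly at lastIndex.
const sticky = /second/y;
const stickyAtZero = sticky.test(lines); // "second" does not start at index 0
sticky.lastIndex = 6;                    // "second" starts at index 6
const stickyAtSix = sticky.test(lines);

// u: . consumes a whole code point instead of a single UTF-16 code unit.
const emoji = "\u{1F600}";
const unitsWithoutU = emoji.match(/./)[0].length;
const unitsWithU = emoji.match(/./u)[0].length;

console.log(withoutM, withM);            // false true
console.log(withoutS, withS);            // false true
console.log(stickyAtZero, stickyAtSix);  // false true
console.log(unitsWithoutU, unitsWithU);  // 1 2
```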

Capturing Groups

A capturing group is a part of the pattern enclosed in parentheses. If the match succeeds, the substring matched by each group is stored in the match result, after the full match.

const regex = /(\d+)-(\d+)-(\d+)/;
const str = '2022-04-28';
const match = str.match(regex);

console.log(match); // [ '2022-04-28', '2022', '04', '28', index: 0, input: '2022-04-28', groups: undefined ]

In this example, the regular expression (\d+)-(\d+)-(\d+) is used to match the string 2022-04-28. This regular expression has three capturing groups. If the match succeeds, the match result array will include the captured parts of the string. For example, the year 2022 will be stored in match[1].
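The groups property shown as undefined in the output above is filled in when the groups are given names with the (?<name>...) syntax, which is available in ES2018 and later environments. The following sketch assumes such an environment; the group names year, month, and day are chosen for this example.

```javascript
const dateRegex = /(?<year>\d+)-(?<month>\d+)-(?<day>\d+)/;
const dateMatch = "2022-04-28".match(dateRegex);

// Named groups are available on the groups property...
const { year, month, day } = dateMatch.groups;
console.log(year, month, day); // 2022 04 28

// ...and are still accessible by position.
console.log(dateMatch[1]); // "2022"
```

Names can make the code that consumes the match easier to read than numeric indexes, especially when the pattern has many groups.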

Methods

search()

The search() method of the String object takes a regular expression and returns the index of the first match in the string. If no match is found, it returns -1.

Here's an example of using the search() method:

const str = 'The quick brown fox jumps over the lazy dog';
const result = str.search(/fox/);

console.log(result); // 16

In this example, we search for the first substring that matches the regular expression /fox/ in the string 'The quick brown fox jumps over the lazy dog'. The regular expression matches the substring 'fox' starting at position 16 in the string, so the search() method returns 16.

The search() method can be useful for finding the position of the first substring that matches a regular expression in a string. The following example uses the regular expression /\d+/ to find the first number in a string:

const str = 'I have 2 cats and 3 dogs';
const result = str.search(/\d+/);

console.log(result); // 7

In this example, we search for the first substring that matches the regular expression /\d+/ in the string 'I have 2 cats and 3 dogs'. The regular expression matches the substring '2' starting at position 7 in the string, so the search() method returns 7.

The search() method can be used instead of the indexOf() method to find the position of a substring in a string. However, the search() method allows for more flexible searching using regular expressions.
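The relationship between indexOf() and search() can be sketched as follows: for a fixed substring both return the same index, while search() also accepts patterns that indexOf() cannot express. The variable names are illustrative only.

```javascript
const text = "I have 2 cats and 3 dogs";

const byIndexOf = text.indexOf("cats");      // fixed substring
const bySearch = text.search(/cats/);        // same position, via a regular expression
const numberOfDogs = text.search(/\d+ dogs/); // a number followed by " dogs"
const noMatch = text.search(/birds/);        // not present

console.log(byIndexOf);    // 9
console.log(bySearch);     // 9
console.log(numberOfDogs); // 18
console.log(noMatch);      // -1
```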

If you want all results, not just the first, or if you want information other than the index, such as the matched text, exec() is more appropriate.

If you only want to check whether a match exists, test() is more appropriate.

exec()

The exec() method of the RegExp object searches a given string for a substring that matches the regular expression and returns an array. The array contains the matched substring followed by any captured groups, and it has index and input properties that hold the position of the match and the input string. If no match is found, null is returned. When the regular expression has the g flag, each call continues from where the previous match ended.

Here's an example of using the exec() method:

const regex = /hello/g;
const str = 'hello world';
let match = regex.exec(str);

console.log(match); // ['hello']
console.log(match.index); // 0
console.log(match.input); // 'hello world'

match = regex.exec(str);
console.log(match); // null

In this example, we define a regular expression /hello/g and call the exec() method on the string 'hello world'. In the first call, a substring that matches the string 'hello' is found, and the matched substring 'hello' is stored in the match array. The match.index property stores the index of the matched substring, and the match.input property stores the input string 'hello world'. In the second call, no matching substring is found, so null is returned.

The exec() method can also be used when the regular expression contains groups. The substring that matches the group is added as an element to the array. Here's an example of using the exec() method with a regular expression that contains a group:

const regex = /(\w+)\s(\w+)/;
const str = 'John Smith';
const match = regex.exec(str);

console.log(match); // ['John Smith', 'John', 'Smith']
console.log(match[1]); // 'John'
console.log(match[2]); // 'Smith'

In this example, we define a regular expression (\w+)\s(\w+) and call the exec() method on the string 'John Smith'. The matched substring 'John Smith' and the substrings that match the groups 'John' and 'Smith' are stored in the match array. match[1] stores the substring that matches the first group, and match[2] stores the substring that matches the second group.

The exec() method can also be called repeatedly to search for all substrings in a string that match a regular expression. The following example uses the regular expression /\d+/g to search for all numbers in a string:

const regex = /\d+/g;
const str = 'I have 2 cats and 3 dogs';

let result;
while ((result = regex.exec(str)) !== null) {
  console.log(result[0]); // 2, 3
}

In this example, we use a while loop to repeatedly call the exec() method to search for all numbers in a string. Because the regular expression has the g flag, each call to regex.exec(str) returns the next match, and the loop ends when exec() returns null.

test()

The test() method returns a Boolean value indicating whether or not a specified string matches a regular expression. It can be useful for checking if a part of a string matches a regular expression.

Here's an example of using the test() method:

const regex = /world/;
const str = 'hello world';

const result = regex.test(str);
console.log(result); // true

In this example, we compare the string 'hello world' with the regular expression /world/. Since there is a substring 'world' in the string, the test() method returns true.

The test() method can also be called on a regular expression that has the g flag, but it still returns only true or false, not the list of matches. The following example uses the regular expression /\d+/g to check whether a string contains a number:

const regex = /\d+/g;
const str = 'I have 2 cats and 3 dogs';
const result = regex.test(str);

console.log(result); // true

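In this example, we compare the string 'I have 2 cats and 3 dogs' with the regular expression /\d+/g. This regular expression matches one or more digits in the string, so the test() method returns true. Note that a regular expression with the g flag remembers where the last match ended in its lastIndex property, so calling test() again on the same object continues from that position and eventually returns false.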

match()

The match() method of the String object takes a regular expression and returns the matches as an array. With the g flag, the array contains every match in the string; without it, the array describes only the first match. If no match is found, null is returned.

const str = 'The quick brown fox jumps over the lazy dog';
const regex = /the/gi;
const result = str.match(regex);

console.log(result); // ['The', 'the']

In this example, we search for strings that match the regular expression /the/gi from the string 'The quick brown fox jumps over the lazy dog'. Since the regular expression matches 'the' regardless of case, the match() method returns an array ['The', 'the']. The array contains two elements because the string contains two instances of 'the'.

If the regular expression has groups and the g flag is not set, the match() method also includes in the returned array the substrings that match those groups. With the g flag, as in the following example, only the full matches are returned:

const str = 'The quick brown fox jumps over the lazy dog';
const regex = /the (\w+)/gi;
const result = str.match(regex);

console.log(result); // ['The quick', 'the lazy']

In this example, we search for strings that match the regular expression /the (\w+)/gi from the string 'The quick brown fox jumps over the lazy dog'. The regular expression matches the string 'the' regardless of case, followed by a space and one or more word characters (letters, digits, or underscores). Since there are two matches in the string, the match() method returns an array ['The quick', 'the lazy']. The words captured by the group are not listed separately because the g flag is set.
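To obtain the captured groups as well, one option is to drop the g flag (which yields only the first match) or to use the matchAll() method of the String object, which is available in ES2020 and later environments. The following is a sketch assuming such an environment; the variable names are illustrative only.

```javascript
const str = 'The quick brown fox jumps over the lazy dog';

// Without the g flag: the first match followed by its captured group.
const first = str.match(/the (\w+)/i);
console.log(first[0]); // 'The quick'
console.log(first[1]); // 'quick'

// With matchAll(): one match array per occurrence, each including its groups.
const capturedWords = [...str.matchAll(/the (\w+)/gi)].map((m) => m[1]);
console.log(capturedWords); // ['quick', 'lazy']
```

Note that matchAll() requires the g flag on the regular expression and throws a TypeError without it.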

replace()

The replace() method of the String object replaces matching portions of a string. It takes a regular expression or a string as the pattern and replaces the first match, or all matches when the regular expression has the g flag, with another string.

In the following example, the first "apple" in the string is replaced with "orange":

const str = 'I have an apple and she has an apple too.';
const newStr = str.replace(/apple/, 'orange');

console.log(newStr); // "I have an orange and she has an apple too."

In the example above, the regular expression has no g flag, so only the first match of apple is replaced.

When using regular expressions, all matches can be replaced by specifying the g flag.

const str = 'I have an apple and she has an apple too.';
const newStr = str.replace(/apple/g, 'orange');

console.log(newStr); // "I have an orange and she has an orange too."

In the example above, the regular expression /apple/g is used, and all matches in the string are replaced.

Also, the replace() method can be used with a replacement function. The replacement function receives the matched portion and information about the matched string's position and returns the replacement text.

const str = 'I have an apple and she has an apple too.';
const newStr = str.replace(/apple/g, (match, offset) => {
  if (offset === 10) {
    return 'orange';
  }
  return 'banana';
});

console.log(newStr); // "I have an orange and she has an banana too."

In the example above, a replacement function that receives the matched portion and its position is used. Only the match that starts at index 10 is replaced with "orange", and the other matches are replaced with "banana".
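The replacement can also refer to capturing groups. In a replacement string, $1, $2, and so on stand for the text captured by the corresponding group. The following sketch reorders a date this way and uses a replacement function to capitalize words; the sample values and variable names are illustrative only.

```javascript
// $1, $2, $3 refer to the first, second, and third capturing group.
const isoDate = "2022-04-28";
const reordered = isoDate.replace(/(\d+)-(\d+)-(\d+)/, "$3/$2/$1");
console.log(reordered); // "28/04/2022"

// A replacement function can transform each match individually.
const capitalized = "hello world".replace(/\b\w/g, (ch) => ch.toUpperCase());
console.log(capitalized); // "Hello World"
```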

split()

The split() method of the String object splits a string at each match of a specified regular expression and returns an array of the resulting substrings.

const regex = /[,\-.]/;
const str = 'He,llo-wor.ld';
const result = str.split(regex);

console.log(result); // ['He', 'llo', 'wor', 'ld']

In this example, the split() method is used to split a string at punctuation characters. The regular expression [,\-.] matches any one of ",", "-", or ".". The hyphen is escaped so that it is treated as a literal character rather than as a range. The split() method splits the string at each match of this regular expression and returns an array ['He', 'llo', 'wor', 'ld'].
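Two further behaviors of split() with a regular expression may be useful: if the pattern contains a capturing group, the matched separators are included in the result, and an optional second argument limits the number of elements. The following is a minimal sketch with made-up input strings.

```javascript
// A capturing group keeps the separators in the resulting array.
const withSeparators = "He,llo-wor.ld".split(/([,\-.])/);
console.log(withSeparators); // ["He", ",", "llo", "-", "wor", ".", "ld"]

// The pattern can absorb optional whitespace around a separator.
const trimmed = "one, two ,three".split(/\s*,\s*/);
console.log(trimmed); // ["one", "two", "three"]

// The second argument limits how many elements are returned.
const limited = "a-b-c".split(/-/, 2);
console.log(limited); // ["a", "b"]
```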
