MySQL – Regular Expressions
MySQL supports various types of pattern matching operations to retrieve filtered result-sets from huge database tables. In previous chapters, we have already learned about the LIKE operator for pattern matching. In this chapter, we will see another pattern matching operation based on regular expressions.
MySQL Regular Expressions
A regular expression is loosely defined as a sequence of characters that represent a pattern in an input text. It is used to locate or replace text strings using some patterns; this pattern can either be a single character, multiple characters or words, etc.
MySQL implements regular expression support using International Components for Unicode (ICU), which provides full Unicode support and is multi-byte safe.
In MySQL, it is a powerful way to perform a complex search operations in a database to retrieve desired content. And unlike the LIKE operator, the regular expressions are not restricted on search patterns (like % and _) as they use several other meta characters to expand the flexibility and control during pattern matching. This is performed using the REGEXP operator.
Syntax
Following is the basic syntax of the REGEXP operator in MySQL −
expression REGEXP pattern
Patterns used with REGEXP
Following is the table of pattern, which can be used along with the REGEXP operator.
Pattern | What the pattern matches |
---|---|
^ | Beginning of string |
$ | End of string |
. | Any single character |
[…] | Any character listed between the square brackets |
[^…] | Any character not listed between the square brackets |
p1|p2|p3 | Alternation; matches any of the patterns p1, p2, or p3 |
* | Zero or more instances of preceding element |
+ | One or more instances of preceding element |
{n} | n instances of preceding element |
{m,n} | m through n instances of preceding element |
[A-Z] | Any uppercase letter |
[a-z] | Any lowercase letter |
[0-9] | Any digit (from 0 to 9) |
[[:<:]] | Beginning of words |
[[:>:]] | Ending of words |
[:class:] | A character class, i.e. use [:alpha:] to match letters from the alphabet |
Examples
The following example demonstrates the usage of some patterns mentioned in the table above, along with the REGEXP operator. For that, we are first creating a database table to perform the search on.
Assume we are creating a table called CUSTOMERS using the following query −
CREATE TABLE CUSTOMERS ( ID INT AUTO_INCREMENT, NAME VARCHAR(20) NOT NULL, AGE INT NOT NULL, ADDRESS CHAR (25), SALARY DECIMAL (18, 2), PRIMARY KEY (ID) );
Now, insert some values into it using the INSERT statements given below −
INSERT INTO CUSTOMERS (ID,NAME,AGE,ADDRESS,SALARY) VALUES (1, ''Ramesh'', 32, ''Ahmedabad'', 2000.00 ), (2, ''Khilan'', 25, ''Delhi'', 1500.00 ), (3, ''Kaushik'', 23, ''Kota'', 2000.00 ), (4, ''Chaitali'', 25, ''Mumbai'', 6500.00 ), (5, ''Hardik'', 27, ''Bhopal'', 8500.00 ), (6, ''Komal'', 22, ''Hyderabad'', 4500.00 ), (7, ''Muffy'', 24, ''Indore'', 10000.00 );
Execute the following query to display all the records present in above created table −
SELECT * FROM CUSTOMERS;
Following are the records present in CUSTOMERS table −
ID | NAME | AGE | ADDRESS | SALARY |
---|---|---|---|---|
1 | Ramesh | 32 | Ahmedabad | 2000.00 |
2 | Khilan | 25 | Delhi | 1500.00 |
3 | Kaushik | 23 | Kota | 2000.00 |
4 | Chaitali | 25 | Mumbai | 6500.00 |
5 | Hardik | 27 | Bhopal | 8500.00 |
6 | Komal | 22 | Hyderabad | 4500.00 |
7 | Muffy | 24 | Indore | 10000.00 |
REGEXP with Patterns −
Now, we are finding all the records in the CUSTOMERS table whose name starts with ”k” −
SELECT * FROM CUSTOMERS WHERE NAME REGEXP ''^k
Executing the query above will produce the following output −
ID | NAME | AGE | ADDRESS | SALARY |
---|---|---|---|---|
2 | Khilan | 25 | Delhi | 1500.00 |
3 | Kaushik | 23 | Kota | 2000.00 |
6 | Komal | 22 | Hyderabad | 4500.00 |
The following query retrieves all records in CUSTOMERS table whose name ends with ”sh” −
SELECT * FROM CUSTOMERS WHERE NAME REGEXP ''sh$
Executing the query above will produce the following output −
ID | NAME | AGE | ADDRESS | SALARY |
---|---|---|---|---|
1 | Ramesh | 32 | Ahmedabad | 2000.00 |
Here, we are retrieving all the records whose name contain ”sh” −
SELECT * FROM CUSTOMERS WHERE NAME REGEXP ''sh
As we can see the output, there are only two names that contain ”sh”.
ID | NAME | AGE | ADDRESS | SALARY |
---|---|---|---|---|
1 | Ramesh | 32 | Ahmedabad | 2000.00 |
3 | Kaushik | 23 | Kota | 2000.00 |
In the following query, we are finding all the names starting with a vowel and ending with ”ol” −
SELECT * FROM CUSTOMERS WHERE NAME REGEXP ''^[aeiou].*ol$
It returned an empty set because the CUSTOMERS table do not have any names who starts with vowel and ends with ”ol”
Empty set (0.00 sec)
The following query finds all the names in the CUSTOMERS table whose name starts with a consonant −
SELECT * FROM CUSTOMERS WHERE NAME REGEXP ''^[^aeiou]
Executing the query above will produce the following output −
ID | NAME | AGE | ADDRESS | SALARY |
---|---|---|---|---|
1 | Ramesh | 32 | Ahmedabad | 2000.00 |
2 | Khilan | 25 | Delhi | 1500.00 |
3 | Kaushik | 23 | Kota | 2000.00 |
4 | Chaitali | 25 | Mumbai | 6500.00 |
5 | Hardik | 27 | Bhopal | 8500.00 |
6 | Komal | 22 | Hyderabad | 4500.00 |
7 | Muffy | 24 | Indore | 10000.00 |
Regular Expression Functions and Operators
Following is the table of functions and operators of regular expressions.
S. No | Function or Operator |
---|---|
1 |
NOT REGEXP Negation of REGEXP |
2 |
REGEXP Checks whether the string matches regular expression or not |
3 |
REGEXP_INSTR() Returns the starting index of substring matching regular expression |
4 |
REGEXP_LIKE() Returns whether the string matches the regular expression |
5 |
REGEXP_REPLACE() Replaces substrings matching the regular expression |
6 |
REGEXP_SUBSTR() Returns substrings matching the regular expression |
7 |
RLIKE Checks whether the string matches regular expression or not |