SQL – UNION Operator
The SQL UNION Operator
The SQL UNION operator is used to combine data from multiple tables by eliminating duplicate rows (if any).
To use the UNION operator on multiple tables, all these tables must be union compatible. And they are said to be union compatible if and only if they meet the following criteria −
- The same number of columns selected with the same datatype.
- These columns must also be in the same order.
- They need not have same number of rows.
Once these criterion are met, the UNION operator returns the rows from multiple tables as a resultant table which is void of all duplicate values from these tables.
The column names in the final result set will be based on the column names selected in the first SELECT statement. If you want to use a different name for a column in the final result set, you can use an alias in the SELECT statement.
Syntax
The basic syntax of a UNION operator is as follows −
SELECT column1 [, column2 ] FROM table1 [, table2 ] [WHERE condition] UNION SELECT column1 [, column2 ] FROM table1 [, table2 ] [WHERE condition];
Here, the given condition could be any given expression based on your requirement.
UNION on a Single Field
If we want to use UNION to combine the result sets of two or more SELECT statements on a single field, we can simply include that field in the SELECT statement of each query. The UNION operator will automatically remove any duplicate values in the final result set.
When using UNION on a single field, the column names in the result set will be determined by the column name in the first SELECT statement. Therefore, you may need to use an alias in the SELECT statement to ensure that the column name is meaningful for the final result set.
Example
Assume we have created a table with name CUSTOMERS in MySQL database using CREATE TABLE statement as shown below −
CREATE TABLE CUSTOMERS ( ID INT NOT NULL, NAME VARCHAR (20) NOT NULL, AGE INT NOT NULL, ADDRESS CHAR (25), SALARY DECIMAL (18, 2), PRIMARY KEY (ID) );
Following query inserts values into this table using the INSERT statement −
INSERT INTO CUSTOMERS VALUES (1, ''Ramesh'', 32, ''Ahmedabad'', 2000.00), (2, ''Khilan'', 25, ''Delhi'', 1500.00), (3, ''Kaushik'', 23, ''Kota'', 2000.00), (4, ''Chaitali'', 25, ''Mumbai'', 6500.00), (5, ''Hardik'', 27, ''Bhopal'', 8500.00), (6, ''Komal'', 22, ''Hyderabad'', 4500.00), (7, ''Muffy'', 24, ''Indore'', 10000.00);
The CUSTOMERS table is as follows −
ID | NAME | AGE | ADDRESS | SALARY |
---|---|---|---|---|
1 | Ramesh | 32 | Ahmedabad | 2000.00 |
2 | Khilan | 25 | Delhi | 1500.00 |
3 | Kaushik | 23 | Kota | 2000.00 |
4 | Chaitali | 25 | Mumbai | 6500.00 |
5 | Hardik | 27 | Bhopal | 8500.00 |
6 | Komal | 22 | Hyderabad | 4500.00 |
7 | Muffy | 24 | Indore | 10000.00 |
Now, creating the second table ORDERS using CREATE TABLE statement as shown below −
CREATE TABLE ORDERS ( OID INT NOT NULL, DATE DATETIME NOT NULL, CUSTOMER_ID INT NOT NULL, AMOUNT INT NOT NULL, PRIMARY KEY (OID) );
Following query inserts values into this table using the INSERT statement −
INSERT INTO ORDERS VALUES (102, ''2009-10-08 00:00:00'', 3, 3000), (100, ''2009-10-08 00:00:00'', 3, 1500), (101, ''2009-11-20 00:00:00'', 2, 1560), (103, ''2008-05-20 00:00:00'', 4, 2060);
The ORDERS table is as follows −
OID | DATE | CUSTOMER_ID | AMOUNT |
---|---|---|---|
102 | 2009-10-08 00:00:00 | 3 | 3000.00 |
100 | 2009-10-08 00:00:00 | 3 | 1500.00 |
101 | 2009-11-20 00:00:00 | 2 | 1560.00 |
103 | 2008-05-20 00:00:00 | 4 | 2060.00 |
Using the following query, let us combine the SALARY and AMOUNT columns from CUSTOMERS and ORDERS table (since these columns have similar datatypes) −
SELECT SALARY FROM CUSTOMERS UNION SELECT AMOUNT FROM ORDERS;
Output
Output of the above query is as follows −
SALARY |
---|
2000.00 |
1500.00 |
6500.00 |
8500.00 |
4500.00 |
10000.00 |
3000.00 |
1560.00 |
2060.00 |
UNION on Multiple Fields
When we use UNION on multiple fields, the number and order of the fields in each SELECT statement must match. Also, the data types of the fields in each SELECT statement must be compatible for the UNION to work correctly. If the data types are not compatible, you may need to use conversion functions such as CAST or CONVERT to ensure that the data types match.
Example
As the CUSTOMERS and ORDERS tables are not union-compatible individually, let us first join these two tables into a bigger table using Left Join and Right Join. The joined tables retrieved will have same number of columns with same datatypes, becoming union compatible. Now, these tables are combined using UNION query shown below −
SELECT ID, NAME, AMOUNT, DATE FROM CUSTOMERS LEFT JOIN ORDERS ON CUSTOMERS.ID = ORDERS.CUSTOMER_ID UNION SELECT ID, NAME, AMOUNT, DATE FROM CUSTOMERS RIGHT JOIN ORDERS ON CUSTOMERS.ID = ORDERS.CUSTOMER_ID;
Output
This would produce the following result −
ID | NAME | AMOUNT | DATE |
---|---|---|---|
1 | Ramesh | NULL | NULL |
2 | Khilan | 1560 | 2009-11-20 00:00:00 |
3 | Kaushik | 3000 | 2009-10-08 00:00:00 |
3 | Kaushik | 1500 | 2009-10-08 00:00:00 |
4 | Chaitali | 2060 | 2008-05-20 00:00:00 |
5 | Hardik | NULL | NULL |
6 | Komal | NULL | NULL |
7 | Muffy | NULL | NULL |
UNION with WHERE Clause
We can use the WHERE clause with UNION operator to filter the results of each SELECT statement before combining them.
Syntax
Following is the syntax for using the WHERE clause with UNION operator −
SELECT column1, column2, column3 FROM table1 WHERE column1 = ''value1'' UNION SELECT column1, column2, column3 FROM table2 WHERE column1 = ''value2
Example
In the following query, we are retrieving the id”s of the customers where id is greater than 5 and 2 from the ”CUSTOMERS” and ”ORDERS” tables respectively −
SELECT ID, SALARY FROM CUSTOMERS WHERE ID > 5 UNION SELECT CUSTOMER_ID, AMOUNT FROM ORDERS WHERE CUSTOMER_ID > 2;
Output
Following is the result produced −
ID | SALARY |
---|---|
6 | 4500.00 |
7 | 10000.00 |
3 | 3000.00 |
3 | 1500.00 |
4 | 2060.00 |
UNION with ORDER BY Clause
When we use UNION with ORDER BY clause, it combines the sorted result sets of all SELECT statements and produces a single sorted result set.
Example
In here, we are retrieving the id”s of the customers where id is greater than 5 and 2 from the ”CUSTOMERS” and ”ORDERS” tables respectively, sorted low to high from their salary −
SELECT ID, SALARY FROM CUSTOMERS WHERE ID > 5 UNION SELECT CUSTOMER_ID, AMOUNT FROM ORDERS WHERE CUSTOMER_ID > 2 ORDER BY SALARY;
Output
Following is the output of the above query −
ID | SALARY |
---|---|
3 | 1500.00 |
4 | 2060.00 |
3 | 3000.00 |
6 | 4500.00 |
7 | 10000.00 |
The ORDER BY clause in a UNION statement applies to the entire result set, not just the last SELECT statement.
UNION with Aliases
We can use aliases in the SELECT statement of UNION operator to give a table or column a temporary name, which can be useful when working with multiple tables or columns with similar names.
When using UNION with aliases, it”s important to note that the column aliases are determined by the first SELECT statement. Therefore, if you want to use different aliases for the same column in different SELECT statements, you need to use column aliases in all SELECT statements to ensure consistent column names in the final result set.
Syntax
Following is the syntax for using Union with Aliases −
SELECT column1 AS alias1, column2 AS alias2 FROM table1 UNION SELECT column3 AS alias1, column4 AS alias2 FROM table2;
Example
The following query retrieves all the id”s from both tables, along with an indication of whether each id is of the customer or the order made by them −
SELECT ID, ''customer'' AS type FROM CUSTOMERS UNION SELECT OID, ''order'' AS type FROM ORDERS;
Output
Following is the output produced −
ID | type |
---|---|
1 | customer |
2 | customer |
3 | customer |
4 | customer |
5 | customer |
6 | customer |
7 | customer |
100 | order |
101 | order |
102 | order |
103 | order |
There are two other operators which are like the UNION operator.
-
SQL − This is used to combine two SELECT statements, but returns rows only from the first SELECT statement that are identical to a row in the second SELECT statement.
-
SQL − This combines two SELECT statements and returns rows from the first SELECT statement that are not returned by the second SELECT statement.