UNION statement in SQL
UNION
is a statement or operator that allows you to join 2 or more SELECT
by concatenating the results of execution into a single result set. Unlike the JOIN
clause, which adds the specific columns of each table to the result set, UNION
does not create individual records from the columns of the tables participating in the join, but rather concatenates the set of their results.
This operation is subject to strict rules that cannot be violated:
-
The
SELECT
that participate in theUNION
must have the same number of columns. -
The columns have to be paired in order, being of the same data type or otherwise, compatible data, for example:
FLOAT
andINT
…
The above rules show that a lot of times, UNION
is used to concatenate mirror tables or tables that are very similar in their definition, that although they store different information they share a similar data model.
We have to take into account that the name of the columns of the result set will be defined by the columns of the first SELECT
statement, so sometimes it may be advisable to put an alias
to some of the columns if the data we store is substantially different from the information collected from one table to another.
Also, note that UNION
excludes duplicate values. If we want our result set to include duplicates, we need to replace the UNION
with UNION ALL
.
As an example we are going to create first a new table named Employees
table that fits the model of the Clients
table, to later perform a UNION
of both.
In order for the user to replicate these same examples in his database, the CREATE TABLE
scripts and the INSERT INTO
of the new table data would be:
CREATE TABLE Employees(
employeeId INT IDENTITY(1,1),
[name] VARCHAR(255) NOT NULL ,
[lastName] VARCHAR(255) NOT NULL ,
[address] VARCHAR(255),
employeeNumber INT
CONSTRAINT pk_Employee_employeeId PRIMARY KEY (employeeId)
);
INSERT INTO Employees
([name],
[lastName],
[address],
employeeNumber)
VALUES
('Levi',
'Jones',
'6101 KLUMP AVE',
1001);
INSERT INTO Employees
([name],
[lastName],
[address],
employeeNumber)
VALUES
('Daniel',
'White',
'7939 BRIGHT AVE',
1002);
Once created and the data entered in the new Employees
table, we proceed to exemplify the use of the UNION
operator:
SELECT [name],
[lastName],
[address],
bankAccount as ClientAccountEmployeeNumber
FROM Customers
WHERE bankAccount IS NOT NULL
UNION
SELECT [name],
[lastName],
[address], employeeNumber
FROM Employees
ORDER BY ClientAccountEmployeeNumber ASC;
name | lastName | address | ClientAccountEmployeeNumber |
---|---|---|---|
Levi | Jones | 6101 KLUMP AVE | 1001 |
Daniel | White | 7939 BRIGHT AVE | 1002 |
Janice | Haynes | 4822 Radio Park Drive | 111222333 |
Mavin | Pettitt | 2336 Cottonwood Lane | 123456789 |
Peter | Davis | 3608 Sycamore Lake Road | 998344567 |
Helen | Ward | 711 Hershell Hollow Road | 447824556 |
Kimberly | Lee | 4298 Drummond Street | 778345112 |
Jessie | Good | 4452 Ventura Drive | 790763467 |
Paul | Williams | 2696 Ocello Street | 998344567 |
We can see that in a single result set we are returning the records from the Customers
and Employees
tables.
As we indicated before, since it is not a JOIN
, we cannot have independent columns to store the values of the bankAccount
field of the Customers
table, and the employeeNumber
field of the Employee
table, so all the values have to be stored in the same column.
Failing that, we have created an alias: ClientAccountEmployeeNumber
, to specify that said column can store both values, and given its nature, differentiate which values belong to a customer
and which to an employee
.
There are other set operators such as EXCEPT
and INTERSECT
that we will explain below using the Customers
table as an example:
INTERSECT
INTERSECT
returns only the records that are common to the union of all the queries:
SELECT customerId,
[name],
[lastName],
[address],
bankAccount
FROM Customers
WHERE customerId IN (1,2,3,4)
INTERSECT
SELECT customerId,
[name],
[lastName],
[address],
bankAccount
FROM Customers
WHERE customerId IN (4,5,6,7);
customerId | name | lastName | address | bankAccount |
---|---|---|---|---|
4 | Helen | Ward | 711 Hershell Hollow Road | 447824556 |
In this case, we can see that only the record with customerId
= 4, is returned by both queries.
EXCEPT
EXCEPT
returns only the unique records of the first query, those that do not match the rest:
SELECT customerId ,
[name],
[lastName],
[address],
bankAccount
FROM Customers
WHERE customerId IN (1,2,3,4)
EXCEPT
SELECT customerId,
[name],
[lastName],
[address],
bankAccount
FROM Customers
WHERE customerId IN (4,5,6,7);
customerId | name | lastName | address | bankAccount |
---|---|---|---|---|
1 | Janice | Haynes | 4822 Radio Park Drive | 111222333 |
2 | Mavin | Pettitt | 2336 Cottonwood Lane | 123456789 |
3 | Peter | Davis | 3608 Sycamore Lake Road | 998344567 |
In this case, only the idCustomers
1,2,3, are returned exclusively by the first query of the union.