GROUP BY statement and aggregate functions in SQL

The aggregate functions are functions that allow you to perform operations on a result set, returning a single aggregate value. They are often used in conjunction with the GROUP BY clause.

COUNT

The COUNT function, like the rest of aggregate functions, is always placed in the SELECT before the FROM. It is used to count the number of records. If it does not carry any condition, it will return the total number of records.

SELECT COUNT(*) totalCustomers
FROM Customers;
totalCustomers
7

The result of the statement tells us that the Customers table currently consists of a total of 7 records.

The use of aggregate functions does not exempt the use of the other clauses of a SELECT. In addition, they are not limited to a single function per statement, and several can be applied in the same query.

Next, we will see another simple example with the Orders table:

SELECT COUNT (product) AS totalAlexa
FROM Orders
WHERE product LIKE '%ECHO%';
totalAlexa
2

The output tells us that the Orders table contains two Alexa ECHO devices.

In the aggregate functions,it is recommended to always use an alias for the definition of the columns that store said values of the result set.

MAX

The MAX function returns the maximum value of the specified field:

SELECT MAX(price) maxPrice
FROM Orders;
maxPrice
2449,5

This query tells us that our most expensive product in the Orders table has a value of: 2449.50 euros.

MIN

The MIN function returns the minimum value of the specified field:

SELECT MIN(price) minPrice
FROM Orders;
minPrice
39,99

This query tells us that our cheapest product in the Orders table has a value of: 39.99 euros.

SUM

The SUM function returns the total value of the sum of a column. This function is only applicable to numeric fields, and NULL values are ignored.

SELECT SUM(totalPrice) Profit
FROM Orders;
Profit
17251,6

Through this query we quickly access the final result of the sum of all the total prices of the Orders table.

AVG

The AVG function returns the average value of the specified field. This function is only applicable to numeric fields and NULL values will be ignored.

SELECT AVG(price) averagePrice
FROM Orders;
averagePrice
610,73875

This query tells us the value of the average price from the Orders table.

For more information on aggregate functions, you can visit the Microsoft documentation

GROUP BY

The GROUP BY clause is used to group the results according to the selected column, returning a row of results for each group of said column. GROUP BY is often used with aggregate functions.

As for the syntax, the GROUP BY clause is placed at the end of the statement, after WHERE but before ORDER BY, in the case that both clauses are applied.

First, we run the following SELECT example without adding a GROUP BY clause:

SELECT C.customerId,
       C.name,
       C.lastName
FROM Customers C
INNER JOIN Orders O
ON C.customerId = O.customerId
ORDER BY C.customerId ASC;
customerId name lastName
1 Janice Haynes
2 Mavin Pettitt
2 Mavin Pettitt
3 Peter Davis
4 Helen Ward
4 Helen Ward
5 Kimberly Lee

When establishing an INNER JOIN relationship between the Customers and Orders table, this SELECT allows us to see which Customers have placed an order, therefore, the customerId is present in the Orders table as many times as orders were placed.

Now, we will run the same query with GROUP BY to see the difference in the execution result:

SELECT C.customerId,
       C.name,
       C.lastName
FROM Customers C
INNER JOIN Orders O
ON C.customerId = O.customerId
GROUP BY C.customerId,
         C.name,
         C.lastName
ORDER BY C.customerId ASC;
customerId name lastName
1 Janice Haynes
2 Mavin Pettitt
3 Peter Davis
4 Helen Ward
5 Kimberly Lee

As we can see, now the records of the same value have been grouped, eliminating duplicity in the query.

As mentioned above, GROUP BY is often used in conjunction with aggregate functions to extend the information returned in the query:

SELECT C.customerId,
       C.name,
       C.lastName,
       COUNT (*) numberOrders,
       SUM (totalPrice) customerTotalSpending
FROM Customers C
INNER JOIN Orders O
ON C.customerId = o.customerId
GROUP BY C.customerId,
         C.name,
         C.lastName;
customerId name lastName numberOrders customerTotalSpending
1 Janice Haynes 1 2449,5
2 Mavin Pettit 2 3199,78
3 Peter Davis 1 4304,25
4 Helen Ward 2 4999
5 Kimberly Lee 1 899,97

Thanks to the aggregated functions, we can check the number of orders made by each client and the Total amount spent according to the number of orders made.