Group Functions in SQL
Objectives
This tutorial covers the following objectives:
- Define and give an example of the seven group functions: SUM, AVG, COUNT, MIN, MAX, STDDEV, VARIANCE
- Construct and execute a SQL query using group functions
- Construct and execute group functions that operate only with numeric data types
Purpose of Group Functions in SQL
- What if you were writing an article for the school newspaper and, to make a point, you wanted to know the average age of the students at your school?
- What would you have to do to get this information?
- You could ask each student their age in years, months, and days, add up all of these numbers, and then divide by the number of students in your school.
- That would be one way — a very slow and difficult way — to find this information.
- What if you needed to know this immediately so that you could meet a 3:00 p.m. deadline?
- You might have a problem!
- What if each student’s date of birth was in a school database in the STUDENT table?
- It would be so easy then!
- In this lesson, you are going to learn about the power of group functions in SQL.
GROUP Functions
- In SQL, the following group functions can operate on a whole table or on a specific grouping of rows.
- Each function returns one result.
- Group Functions:
- AVG
- COUNT
- MIN
- MAX
- SUM
- VARIANCE
- STDDEV
The COUNT group function will be examined in more detail in the next lesson.
GROUP Functions List
- MIN: Used with columns that store any data type to return the minimum value.
- MAX: Used with columns that store any data type to return the maximum value.

- SUM: Used with columns that store numeric data to find the total or sum of values.
- AVG: Used with columns that store numeric data to compute the average.
- COUNT: Returns the number of rows.
- VARIANCE: Used with columns that store numeric data to calculate the spread of data around the mean.
- For example, if the average grade for the class on the last test was 82% and the student’s scores ranged from 40% to 100%, the variance of scores would be greater than if the student’s scores ranged from 78% to 88%.
- STDDEV: Similar to variance, standard deviation measures the spread of data.
- For two sets of data with approximately the same mean, the greater the spread, the greater the standard deviation
Standard deviation and variance measure the spread of data around the mean or average. For the purpose of this course, it is important to be able to recognize and use them as group functions. Understanding how they work is beyond the scope of this course.
GROUP Functions SELECT Clause
- Group functions are written in the SELECT clause:
SELECT column, group_function(column),
..
FROM table
WHERE condition
GROUP BY column;

- What are Group Functions?
- Group Functions operate on sets of rows to give one result per group.
The WHERE clause can be included to perform a group function on a sub-set of the table, for example WHERE department_id = 90.
The GROUP BY clause will be discussed in a later lesson.
GROUP Function Cautions
- Important things you should know about group functions:
- Group functions cannot be used in the WHERE clause:

GROUP Function examples
- MIN: Used with columns that store any data type to return the minimum value.

- Example 1 returns the lowest number in the life_expect_at_birth column.
- Example 2 uses a character data column, and returns the county whose name is first in the alphabetic list of country names.
- Example 3 uses a date data type column and returns the earliest hire date.
- MAX: Used with columns that store any data type to return the maximum value.

- Example 1 returns the highest number in the life_expect_at_birth column.
- Example 2 uses a character data column, and returns the county whose name is last in the alphabetic list of country names.
- Example 3 uses a date data type column and returns the most recent hire date.
SUM: Used with columns that store numeric data to find the total or sum of values.

You can restrict the group function to a subset of the table using a WHERE clause.
- Example 1 returns the total (sum) of all the areas of countries in region 29 (Caribbean).
- Example 2 returns the total salary for employees in department 90.
- AVG: Used with columns that store numeric data to compute the average.

- Example 1 returns the average of all the areas of countries in region 29 (Caribbean).
- Example 2 returns the average salary for employees in department 90, rounded to two decimal places.
- VARIANCE: Used with columns that store numeric data to calculate the spread of data around the mean.
- STDDEV: Similar to variance, standard deviation measures the spread of data.

GROUP Function and NULL
- Group functions ignore NULL values.
- In the example below, the null values were not used to find the average commission_pct.
SELECT AVG(commission_pct)
FROM employees;


The employees table has 20 rows. Only 4 employees have a commission_pct, the other 16 rows contain NULL. The average is calculated by finding the SUM of the not-null rows, and dividing by the COUNT of the not null rows.
This topic will be covered in more depth in the next lesson.
More Than One Group Function
- You can have more than one group function in the SELECT clause, on the same or different columns.
SELECT MAX(salary), MIN(salary), MIN(employee_id)
FROM employees
WHERE department_id = 60;

Rules for Group Functions
- Group functions ignore null values.
- Group functions cannot be used in the WHERE clause.
- MIN, MAX and COUNT can be used with any data type; SUM, AVG, STDDEV, and VARIANCE can be used only with numeric data types.
Terminology
Key terms used in this tutorial included:
- AVG
- COUNT
- Group functions
- MAX
- MIN
- STDDEV
- SUM
- VARIANCE
Summary
In this tutorial, you should have learned how to:
- Define and give an example of the seven group functions: SUM, AVG, COUNT, MIN, MAX, STDDEV, VARIANCE
- Construct and execute a SQL query using group functions
- Construct and execute group functions that operate only with numeric data types
0 Comments