PostgreSQL Aggregate Functions
PostgreSQL Aggregate Functions
PostgreSQL has a range of aggregate functions that can be highly useful for summarizing data.
Here's a list of some commonly used ones along with a few that may be less familiar:
1. Basic Aggregate Functions
- SUM() - Calculates the sum of values.
Example: SELECT SUM(salary) FROM employees;
- AVG() - Calculates the average value.
Example: SELECT AVG(salary) FROM employees;
- MIN() - Finds the minimum value.
Example: SELECT MIN(age) FROM employees;
- MAX() - Finds the maximum value.
Example: SELECT MAX(age) FROM employees;
- COUNT() - Counts the number of rows.
Example: SELECT COUNT(*) FROM employees;
- COUNT(DISTINCT column) - Counts distinct values in a column.
Example: SELECT COUNT(DISTINCT department) FROM employees;
2. Statistical Aggregate Functions
- VARIANCE() - Calculates the variance of a set of values.
Example: SELECT VARIANCE(salary) FROM employees;
- STDDEV() - Calculates the standard deviation of values.
Example: SELECT STDDEV(salary) FROM employees;
- MODE() - Returns the most frequent value (requires tablefunc extension).
Example: SELECT MODE() WITHIN GROUP (ORDER BY salary) FROM employees;
3. Advanced Aggregate Functions
- ARRAY_AGG() - Aggregates values as an array.
Example: SELECT ARRAY_AGG(department) FROM employees;
- STRING_AGG() - Concatenates strings with a delimiter.
Example: SELECT STRING_AGG(name, ', ') FROM employees;
- BOOL_AND() - Checks if all values are true.
Example: SELECT BOOL_AND(is_active) FROM employees;
- BOOL_OR() - Checks if any value is true.
Example: SELECT BOOL_OR(is_active) FROM employees;
- JSON_AGG() - Aggregates values into a JSON array.
Example: SELECT JSON_AGG(row_to_json(employees)) FROM employees;
- XMLAGG() - Concatenates XML data.
Example: SELECT XMLAGG(XMLELEMENT(name, name)) FROM employees;
4. Quantile Aggregate Functions (Percentile)
- PERCENTILE_CONT() - Computes a percentile value (continuous distribution).
Example: SELECT PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY salary) FROM
employees;
- PERCENTILE_DISC() - Computes a percentile value (discrete distribution).
Example: SELECT PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY salary) FROM
employees;
5. Hypothetical Set Functions
- RANK() - Returns the rank of each row within a partition.
- DENSE_RANK() - Like RANK() but without gaps in ranking.
- CUME_DIST() - Cumulative distribution of a value.
- PERCENT_RANK() - Relative rank of a row.
These functions add more flexibility and allow for nuanced data summaries in PostgreSQL queries.
The percentile and hypothetical set functions are particularly useful for statistical analysis.