Analytical Functions: OVER and PARTITION BY

Oracle SQL functions can be categorized into three main types: scalar, aggregate, and analytical functions. Scalar functions take one row at a time and a single result for each row processed. On the other hand, aggregate functions are applied to multiple rows and return a single result.

Analytical functions reside somewhere between these two categories. They operate on a set of rows related to current row. These functions allow you to perform calculations across a specific range of rows, known as a "window," without grouping the result set. Here, a window can be defined by the number of rows or by any other logic, such as a time range. As the processing moves from one row to another, the window may slide, depending on how the analytical function specifies the window. 

In terms of logical order of data processing steps in a SELECT statement, analytical functions fall between SELECT and ORDER BY clause. That is, Oracle will firstly fetch data defined in FROM clause, narrow down by WHERE clause, remain only listed columns in SELECT clause, then handle an analytical functions. For this reason, you cannot include any analytic functions anywhere other than SELECT list or the ORDER BY clause: they are not allowed in the WHERE, HAVING, and GROUP BY clause.

For example, let's consider a table shown in the figure below:

Table above has four different columns to store and manage Pokémon experience values that I've gained from a game. So, how we can summarize this table? Well, we can use the SUM as an aggregate function to obtain the total amount of experiences. For example:

This query retrieves the pokemon_id and name columns from the joined tables and adds an additional column, "Total Exp. Gained," which represents the total sum of experience_gained over the records in the window specified by the OVER clause. Here, I didn't add any window logic in the OVER clause. Thus, it will sum over all experience_gained column values throughout the table.

What if we want the total experience gained by each Pokémon? We can add a partitioning logic with the PARTITION BY keyword. For example:

This time, the sum is calculated within partitioning windows defined by the column. The PARTITION BY clause creates partitions based on the values in the specified column. Consequently, the sum is computed separately for each partition, allowing for separate calculations for each Pokémon.

We can go further and calculate percentages within each partition. For example:

This SQL query summarize the total experience gained within partitions based on Pokémon names, and then calculates the percentage of experience gained relative to the total experience for each Pokémon name. 

Lastly, let's see how it works if we specify more than one columns in the PARTITION BY clause. When there are multiple columns in the PARTITION BY clause, we are partitioning the result set based on each distinct combinations of column values. This means that the analytic functions, such as SUM or AVG, will be applied separately for each unique combination of the specified columns. For example:

Here, PARTITION BY, exp.gained_from creates partitioning window based on unique combinations of the two column values: ('Eevee', 'Trainer battle'), ('Pikachu', 'Trainer battle'), ('Pikachu', 'Wild'). The SUM function is then applied within each of these partitions, calculating the sum of experience_gained separately.

Post a Comment