## What does RANK do?

Like the `ROW_NUMBER`

function introduced last week, `RANK`

is also a window function, i.e. a function that processes the data sorted in so-called “windows” or “partitions”. You define the sorting and the partitions by using the keywords `ORDER BY`

and `PARTITION BY`

after the function call. The `RANK`

function then numbers all the rows in the result set, starting at 1 in ascending order. Unlike the `ROW_NUMBER`

function, the `RANK`

function numbers rows that are in the same position in the sort with the same number. With `ROW_NUMBER`

, a random sort is chosen here and numbered uniquely sequentially.

## How can you work with RANK?

To use the `RANK`

function, we now create a query in the Wide World Importers database that returns sortable items, but where the sort need not be unique. For this we use a CTE to count the number of orders from the OrderLines in which each product appeared:

WITH cte_sales AS ( SELECT StockItemID ,COUNT(*) AS nOrders ,SUM(Quantity*UnitPrice) AS TotalSales FROM Sales.OrderLines GROUP BY StockItemID ) SELECT StockItemID ,nOrders ,TotalSales FROM cte_sales ORDER BY nOrders DESC ,TotalSales DESC

But now we don’t just want to count and display, we want to create a ranking of the most popular products. If we would use the `ROW_NUMBER`

function for this, the two most popular products with the IDs 104 and 120, which have been ordered 1123 times each, would have the numbers 1 and 2, which would have the 1 and which the 2, we could determine, for example, by sorting by TotalSales:

WITH cte_sales AS ( SELECT StockItemID ,COUNT(*) AS nOrders ,SUM(Quantity*UnitPrice) AS TotalSales FROM Sales.OrderLines GROUP BY StockItemID ) SELECT StockItemID ,nOrders ,TotalSales ,ROW_NUMBER() OVER (ORDER BY nOrders DESC, TotalSales DESC) as Popularity FROM cte_sales ORDER BY nOrders DESC ,TotalSales DESC

From a commercial point of view, this additional sorting according to sales totals may be useful, but it does not reflect the actual popularity of the product. If we go purely by sales figures, both products would have to be number one, and this is exactly what the `RANK`

function makes possible. Here we can then also do without the tiebreaker, which distinguished between the first and second line, and accordingly only need an `ORDER BY`

argument:

WITH cte_sales AS ( SELECT StockItemID ,COUNT(*) AS nOrders ,SUM(Quantity*UnitPrice) AS TotalSales FROM Sales.OrderLines GROUP BY StockItemID ) SELECT StockItemID ,nOrders ,TotalSales ,ROW_NUMBER() OVER (ORDER BY nOrders DESC, TotalSales DESC) as Popularity ,RANK() OVER (ORDER BY nOrders DESC) as Real_Popularity FROM cte_sales ORDER BY nOrders DESC ,TotalSales DESC

As a result, you can see that now both products, 104 and 120 have received the (real) popularity 1, while the next product has been assigned the third rank, similar to what is usual in sports competitions. So if you can’t tell from the data which line is the “better” one and want to avoid random numbering, use the `RANK`

function instead of the `ROW_NUMBER`

function. However, it is of course not suitable for sorting out duplicates, since the duplicates would all be assigned the same rank…