T-SQL Ninja #31

LEAD and LAG

What are LEAD and LAG?

Last week you saw how you use FIRST_VALUE and LAST_VALUE to give the first and last rows in a query window. But what do you do if you want to compare a row in the Sales.Order table with the customer’s previous order? Then LEAD and LAG are the means of choice. LEAD returns the next row in the results window, whereas LAG returns the previous row.

How can you work with LEAD and LAG?

We start like last week with a query on the Sale.Order table and look at some columns:

SELECT 
     OrderDate
    ,OrderID
    ,ExpectedDeliveryDate
    ,CustomerID
FROM [Sales].[Orders]

Now we add another column in which we use the LAG function to return the date of the previous row in the window partitioned by customer. The syntax is very similar to that of FIRST_VALUE and LAST_VALUE, but the LAG function (just like the LEAD function) additionally gets an integer as an argument that specifies how many rows you want to go back (in the example, we go back one row, passing 1 as an argument):

SELECT 
     OrderDate
    ,LAG(OrderDate, 1) OVER (PARTITION BY CustomerID ORDER BY OrderDate) As DatePreviousOrder
    ,OrderID
    ,ExpectedDeliveryDate
    ,CustomerID
FROM [Sales].[Orders]

The first line for each customer has no entry in the newly calculated DatePreviousOrder column (i.e. NULL as value), which is due to the fact that the first order of each customer naturally has no predecessor.

Like FIRST_VALUE and LAST_VALUE, you can of course use LEAD and LAG in functions:

SELECT 
     OrderDate
    ,LAG(OrderDate, 1) OVER (PARTITION BY CustomerID ORDER BY OrderDate) As DatePreviousOrder
    ,DATEDIFF(DAY, LAG(OrderDate, 1) OVER (PARTITION BY CustomerID ORDER BY OrderDate), OrderDate) As DaysSincePreviousOrder
    ,OrderID
    ,ExpectedDeliveryDate
    ,CustomerID
FROM [Sales].[Orders]

With LEAD the query works the same way except that here the last row has no value in the recalculated column:

SELECT 
     OrderDate
    ,LAG(OrderDate, 1) OVER (PARTITION BY CustomerID ORDER BY OrderDate) As DatePreviousOrder
    ,DATEDIFF(DAY, LAG(OrderDate, 1) OVER (PARTITION BY CustomerID ORDER BY OrderDate), OrderDate) As DaysSincePreviousOrder
    ,LEAD(OrderDate, 1) OVER (PARTITION BY CustomerID ORDER BY OrderDate) As DateNextOrder
    ,DATEDIFF(DAY, OrderDate, LEAD(OrderDate, 1) OVER (PARTITION BY CustomerID ORDER BY OrderDate)) As DaysUnitlNextOrder
    ,OrderID
    ,ExpectedDeliveryDate
    ,CustomerID
FROM [Sales].[Orders]

So now you have the tool in your hand to carry out analyses in tables with time-dependent data, you can find out when a recording begins and also how it has changed compared to the previous value. There is nothing standing in the way of the evaluation of orders, measured values or other data series even outside of this dojo. Get them, ninjas!

References

Ninja-Notebooks @ GitHub

Leave a Reply

Your email address will not be published. Required fields are marked *