Apply Operator in SQL Server Simplified - CodeProject

:

Introduction

Lately, I used APPLY in SQL server that will really help you ease out many of the complex scenarios where you can write complex SQL queries in a simple way. This tip will put some light on APPLY operator and when it's more preferable over regular join.

Quote:

According to definition from https://technet.microsoft.com/en-us/library/ms175156%28v=sql.105%29.aspx:

The APPLY operator allows you to invoke a table-valued function for each row returned by an outer table expression of a query. The table-valued function acts as the right input and the outer table expression acts as the left input. The right input is evaluated for each row from the left input and the rows produced are combined for the final output. The list of columns produced by the APPLY operator is the set of columns in the left input followed by the list of columns returned by the right input.

So, basically it's a row-by-row join of the left data-set with the right data-set. I will explain its basic syntax and then describe the scenario where APPLY really helps ease out complexity.

Basically APPLY has two forms, CROSS APPLY and OUTER APPLY. It's similar to JOIN more or less but can do some more that makes it preferable in some scenarios.
APPLY allows a correlated sub-query or table-valued function to be part of the FROM clause.

Background

I read about it from https://technet.microsoft.com/en-us/library/ms175156%28v=sql.105%29.aspx. I just wanted to explain it in a simple way with an example.

Using the Code

-- Here is the code snippet
-- Lets start with a simple join and then do the same by Cross-Apply in subsequent query
SELECT *
FROM Employee E
INNER JOIN Department D ON E.DepartmentID = D.DepartmentID

--Basic syntax of Cross Apply
--The above query can be transformed to Cross apply syntax
SELECT *
FROM Employee E
CROSS APPLY (
    SELECT *
    FROM Department
    WHERE Department.Departmentid = E.DepartmentID
    ) D

--To get the Departments which could have no employee (LEFT JOIN)
SELECT *
FROM Department E
LEFT JOIN Employee D ON E.DepartmentID = D.DepartmentID

-- The above query can be re-written to Outer apply syntax (same effect as a LEFT JOIN above)
-- The final result-set contains all the selected columns from the left table-expression
-- followed by all the columns of right table expression
SELECT *
FROM Department D
OUTER APPLY (
    SELECT *
    FROM Employee
    WHERE Employee.Departmentid = D.DepartmentID
    ) E

--Now , you have got the idea of APPLY operator, which is doing exactly the same thing as
--JOINS till now , the next thing that comes into our mind is Why Apply when we
--already do that with JOIN.
--So , the next code snippet will show how can we avoid co-related queries by APPLY
--co-related query
SELECT EmployeeID
    ,FirstName
    ,(
        SELECT Department.DepartmentID
        FROM Department
        WHERE DepartmentID = E.DepartmentID
        ) AS DepartmentID
FROM Employee E

--Apply with Table valued function (that accept Department ID and return its all Employees)
CREATE FUNCTION dbo.fn_GetAllEmployeeOfADepartment (@DeptID AS INT)
RETURNS TABLE
AS
RETURN (
        SELECT *
        FROM Employee E
        WHERE E.DepartmentID = @DeptID
        )

SELECT *
FROM Department
CROSS APPLY fn_GetAllEmployeeOfADepartment(Department.DepartmentID)

-- The above query can not be transformed into Simple Join Query , because in above query
-- we are passing DepartmentID as a parameter to the inner TVF ,which cannot be possible by JOINS
-- because in that we have two independent Result-sets and the
-- execution context of outer query is different from that of inner TVF (table valued func)
-- in other words ,you can not bind a value from the outer query to the function as a parameter..

-- THERE ARE SEVERAL OTHER SCENARIOS WHERE APPLY CAN BE USED TO EASE OUT THE COMPLEXITY OF
-- WRITING COMPLEX QUERIES

-- SUPPOSE WE HAVE 2 TABLES "RUNNERSRECORD" AND "RUNNERS"


--If we want the last 5 entries of running log(RunnerRecord) against every runner
--we can do this with ROW_NUMBER, PARTITION BY and a nested query:

SELECT *
FROM (
SELECT R.RunnerID
  ,FirstName
  ,LastName
  ,Email
  ,Distance
  ,EntryDate
  ,ROW_NUMBER() OVER (
   PARTITION BY RR.RUNNERID ORDER BY RR.EntryDate DESC
   ) ROWID
FROM Runners R
INNER JOIN RunnersRecord rr ON R.RunnerID = RR.RunnerID
) AS T
WHERE T.ROWID <= 5

--The same result can be obtained in a much cleaner way with Apply without specifying the
--partitioning over runnerid because of row-by-row nature of APPLY operator<span id="cke_bm_92E" style="display: none;"> </span>

SELECT *
FROM Runners R                 -- for every row of outer runners table        
CROSS APPLY (
  SELECT TOP 5 *               -- select top 5 items of record logs
  FROM RunnersRecord RR
  WHERE R.RunnerID = RR.RunnerID
  ORDER BY RR.EntryDate DESC) RR

-- By using TOP inside a CROSS APPLY statement we can select the TOP 5 items for every row of
-- the outer Runners table.  
-- Also, we don't have to specify the partitioning in case of Cross Apply - you can see we don't 
-- have RowNum/RowID column in the above query result-set because Apply works row-by-row manner


-- As we further move to code snippet, there are other advantages of APPLY operator over normal 
-- JOINS like creating multi field expressions ( create field "DayMostTravelled" and
-- "LargestDistance"

SELECT *, (
  SELECT TOP 1 RR.EntryDate
  FROM RunnersRecord RR
  WHERE V.RunnerID = RR.RunnerID ORDER BY Distance DESC) AS DayMostTravelled ,
   (
  SELECT TOP 1 RR.Distance
  FROM RunnersRecord RR
  WHERE V.RunnerID = RR.RunnerID ORDER BY Distance DESC
   ) AS LargestDistance

FROM Runners V 


-- The same result-set can be obtained via Cross apply
SELECT *
FROM Runners R
OUTER APPLY (
  SELECT TOP 1 RR.EntryDate AS DayMostTravelled,Distance
  FROM RunnersRecord RR
  WHERE R.RunnerID = RR.RunnerID
  ORDER BY Distance DESC
) CA  

-- We can also use expression names created by first Outer apply variable to 
-- next outer apply variable to form a chaining ..


SELECT R.*, C.AvgDistance, C.TotalDistance, D.DistanceLeft
FROM Runners R
OUTER APPLY (
   SELECT Avg(Distance) AS AvgDistance, Sum(Distance) AS TotalDistance
   FROM RunnersRecord RR
   WHERE R.RunnerID = RR.RunnerID
) C
OUTER APPLY
(  
   --we can add a second outer apply to make a further calculation on the result of first
   --hence perform same row - logic over multiple section
   Select 500-C.TotalDistance as 'DistanceLeft'
) D

Conclusion

Apply operator is just not an operator but an extension to SQL server that really helps out write complex logic in a simple query as discussed above. Comparing performance, it's slow due to row-by-row nature. For multi field expression, it's faster (but not that much). It's also useful while joining TVF and used whenever you think of row-by-row logic. Hope it helps to understand Apply operator in a very simplified way!

Thank you! Cheers. :)