Apply Operator in SQL Server Simplified - CodeProject
Lately, I used
APPLY in SQL server that will really help you ease out many of the complex scenarios where you can write complex SQL queries in a simple way. This tip will put some light on
APPLY operator and when it's more preferable over regular
According to definition from https://technet.microsoft.com/en-us/library/ms175156%28v=sql.105%29.aspx:
The APPLY operator allows you to invoke a table-valued function for each row returned by an outer table expression of a query. The table-valued function acts as the right input and the outer table expression acts as the left input. The right input is evaluated for each row from the left input and the rows produced are combined for the final output. The list of columns produced by the APPLY operator is the set of columns in the left input followed by the list of columns returned by the right input.
So, basically it's a row-by-row
join of the left data-set with the right data-set. I will explain its basic syntax and then describe the scenario where
APPLY really helps ease out complexity.
APPLY has two forms,
CROSS APPLY and
OUTER APPLY. It's similar to
JOIN more or less but can do some more that makes it preferable in some scenarios.
APPLY allows a correlated sub-query or table-valued function to be part of the
I read about it from https://technet.microsoft.com/en-us/library/ms175156%28v=sql.105%29.aspx. I just wanted to explain it in a simple way with an example.
Using the Code
-- Here is the code snippet -- Lets start with a simple join and then do the same by Cross-Apply in subsequent query SELECT * FROM Employee E INNER JOIN Department D ON E.DepartmentID = D.DepartmentID --Basic syntax of Cross Apply --The above query can be transformed to Cross apply syntax SELECT * FROM Employee E CROSS APPLY ( SELECT * FROM Department WHERE Department.Departmentid = E.DepartmentID ) D --To get the Departments which could have no employee (LEFT JOIN) SELECT * FROM Department E LEFT JOIN Employee D ON E.DepartmentID = D.DepartmentID -- The above query can be re-written to Outer apply syntax (same effect as a LEFT JOIN above) -- The final result-set contains all the selected columns from the left table-expression -- followed by all the columns of right table expression SELECT * FROM Department D OUTER APPLY ( SELECT * FROM Employee WHERE Employee.Departmentid = D.DepartmentID ) E --Now , you have got the idea of APPLY operator, which is doing exactly the same thing as --JOINS till now , the next thing that comes into our mind is Why Apply when we --already do that with JOIN. --So , the next code snippet will show how can we avoid co-related queries by APPLY --co-related query SELECT EmployeeID ,FirstName ,( SELECT Department.DepartmentID FROM Department WHERE DepartmentID = E.DepartmentID ) AS DepartmentID FROM Employee E --Apply with Table valued function (that accept Department ID and return its all Employees) CREATE FUNCTION dbo.fn_GetAllEmployeeOfADepartment (@DeptID AS INT) RETURNS TABLE AS RETURN ( SELECT * FROM Employee E WHERE E.DepartmentID = @DeptID ) SELECT * FROM Department CROSS APPLY fn_GetAllEmployeeOfADepartment(Department.DepartmentID) -- The above query can not be transformed into Simple Join Query , because in above query -- we are passing DepartmentID as a parameter to the inner TVF ,which cannot be possible by JOINS -- because in that we have two independent Result-sets and the -- execution context of outer query is different from that of inner TVF (table valued func) -- in other words ,you can not bind a value from the outer query to the function as a parameter.. -- THERE ARE SEVERAL OTHER SCENARIOS WHERE APPLY CAN BE USED TO EASE OUT THE COMPLEXITY OF -- WRITING COMPLEX QUERIES -- SUPPOSE WE HAVE 2 TABLES "RUNNERSRECORD" AND "RUNNERS" --If we want the last 5 entries of running log(RunnerRecord) against every runner --we can do this with ROW_NUMBER, PARTITION BY and a nested query: SELECT * FROM ( SELECT R.RunnerID ,FirstName ,LastName ,Email ,Distance ,EntryDate ,ROW_NUMBER() OVER ( PARTITION BY RR.RUNNERID ORDER BY RR.EntryDate DESC ) ROWID FROM Runners R INNER JOIN RunnersRecord rr ON R.RunnerID = RR.RunnerID ) AS T WHERE T.ROWID <= 5
--The same result can be obtained in a much cleaner way with Apply without specifying the --partitioning over runnerid because of row-by-row nature of APPLY operator<span id="cke_bm_92E" style="display: none;"> </span> SELECT * FROM Runners R -- for every row of outer runners table CROSS APPLY ( SELECT TOP 5 * -- select top 5 items of record logs FROM RunnersRecord RR WHERE R.RunnerID = RR.RunnerID ORDER BY RR.EntryDate DESC) RR
-- By using TOP inside a CROSS APPLY statement we can select the TOP 5 items for every row of -- the outer Runners table. -- Also, we don't have to specify the partitioning in case of Cross Apply - you can see we don't -- have RowNum/RowID column in the above query result-set because Apply works row-by-row manner -- As we further move to code snippet, there are other advantages of APPLY operator over normal -- JOINS like creating multi field expressions ( create field "DayMostTravelled" and -- "LargestDistance" SELECT *, ( SELECT TOP 1 RR.EntryDate FROM RunnersRecord RR WHERE V.RunnerID = RR.RunnerID ORDER BY Distance DESC) AS DayMostTravelled , ( SELECT TOP 1 RR.Distance FROM RunnersRecord RR WHERE V.RunnerID = RR.RunnerID ORDER BY Distance DESC ) AS LargestDistance FROM Runners V -- The same result-set can be obtained via Cross apply SELECT * FROM Runners R OUTER APPLY ( SELECT TOP 1 RR.EntryDate AS DayMostTravelled,Distance FROM RunnersRecord RR WHERE R.RunnerID = RR.RunnerID ORDER BY Distance DESC ) CA
-- We can also use expression names created by first Outer apply variable to -- next outer apply variable to form a chaining .. SELECT R.*, C.AvgDistance, C.TotalDistance, D.DistanceLeft FROM Runners R OUTER APPLY ( SELECT Avg(Distance) AS AvgDistance, Sum(Distance) AS TotalDistance FROM RunnersRecord RR WHERE R.RunnerID = RR.RunnerID ) C OUTER APPLY ( --we can add a second outer apply to make a further calculation on the result of first --hence perform same row - logic over multiple section Select 500-C.TotalDistance as 'DistanceLeft' ) D
Apply operator is just not an operator but an extension to SQL server that really helps out write complex logic in a simple query as discussed above. Comparing performance, it's slow due to row-by-row nature. For multi field expression, it's faster (but not that much). It's also useful while joining
TVF and used whenever you think of row-by-row logic. Hope it helps to understand
Apply operator in a very simplified way!
Thank you! Cheers. :)