Effortless documentation for effective data teams

DataDocs is a data catalog that saves you time by writing documentation for you

Back to Blog

What are Common Table Expressions (CTEs) in SQL?

Dive into Common Table Expressions and how they can be used to simplify your SQL queries.
Dive into Common Table Expressions and how they can be used to simplify your SQL queries.

In SQL database management, Common Table Expressions or CTEs play a pivotal role. This blog post aims to break down the concept of CTEs and explore their applicability, benefits, and some of the associated challenges.

CTE Basics

Let's start with what CTEs are. Common Table Expressions, aptly named, function as temporary result sets that are defined within the execution of a single SQL statement. Think of a CTE as a means to create a named temporary result set that can only be referenced later within that same statement, kind of like a shortcut. This "shortcut" enables us to regulate the readability and manageability of more complicated queries by dividing them into simpler, segmented parts.

-- Defining a CTE
WITH Sales_CTE (SalesPersonID, NumberOfOrders)
AS
(
  -- Defining a simple CTE query
  SELECT SalesPersonID, COUNT(OrderID)
  FROM SalesOrderHeader
  GROUP BY SalesPersonID
)
-- Using the CTE in a query
SELECT E.EmployeeID, E.FirstName, E.LastName, S.NumberOfOrders
FROM Employee AS E
JOIN Sales_CTE AS S
ON E.EmployeeID = S.SalesPersonID

CTEs can be used in multiple ways and their use depends on the complexity of the SQL statement. One way CTEs are used is as a substitute for a view when the general use of a view isn't needed—you can think of it as a 'one-time' view. They prove their true potential when used along with SELECT, INSERT, UPDATE or DELETE statements.

Recursive CTEs

Next, let's understand the different types of CTEs. SQL majorly houses two types of CTEs – non-recursive and recursive. Non-recursive CTEs are, as the name suggests, those that do not refer to themselves. Recursive CTEs, on the other hand, are iterative, referring to themselves and are thus able to perform a series of tasks repetitively—a trait absent in the non-recursive variant.

-- Recursive CTE to display hierarchical data
WITH EmployeeHierarchy (EmployeeID, FirstName, LastName, ManagerID, HierarchyLevel)
AS
(
-- Anchor member (initial level of recursion – base result set)
  SELECT EmployeeID, FirstName, LastName, ManagerID, 1 as HierarchyLevel
  FROM Employee
  WHERE ManagerID IS NULL -- Top level management who don't have a manager.
UNION ALL
-- Recursive member (next level of recursion – adds to base result set)
  SELECT E.EmployeeID, E.FirstName, E.LastName, E.ManagerID, HierarchyLevel + 1
  FROM Employee AS E
  JOIN EmployeeHierarchy AS EH
  ON E.ManagerID = EH.EmployeeID
)
SELECT * FROM EmployeeHierarchy

Advantages of CTEs

Diving into the advantages of CTEs, one major reason why someone might choose CTE over a subquery or a derived table is its readability and maintainability. Working with embedded SELECT statements can become a challenge when dealing with a complex query. With CTEs, one can simplify these statements, make them readable, and reduce redundancy as it can be referred to multiple times in the same query.

CTEs also provide the advantage of improving performance in some situations where the subquery is not a viable or easy option. It is constructive, for instance, when working with advanced window functions, partitioning data into smaller units, or spreading computations across individual calculation steps.

WITH OrderSummary_CTE AS (
  SELECT CustomerID,
         OrderID,
         SUM(UnitPrice*Quantity) AS Total,
         ROW_NUMBER() OVER(PARTITION BY CustomerID ORDER BY OrderID DESC) AS LatestOrderRank
  FROM Orders O
  JOIN OrderDetail OD
  ON O.OrderID = OD.OrderID
  GROUP BY CustomerID, OrderID
)
 
SELECT *
FROM OrderSummary_CTE
WHERE LatestOrderRank = 1

Challenges of CTEs

Nonetheless, despite their benefits, there are some challenges associated with CTEs. Understanding recursive CTEs may require a substantial learning curve – the recursive functionality can be challenging to comprehend initially. Also, while CTEs are a great tool, in some DBMS they may not provide performance advantages in every situation, and could sometimes even hamper performance owing to the resource-intensive nature of large-scale recursion.

To top it all, it is essential to understand that while CTEs and temporary tables may sound like the same thing, they are not. Temporary tables are stored on the disk, existing until they are dropped, while a CTE is, by essence, more temporary, given it lasts only within a single SQL statement execution. This should be taken into account while choosing between the two, as one provides persistence while the other offers simplicity.

Conclusion

Mastering CTEs in SQL comes with experience and practice. The concept, while simple on the surface, offers much profound functionality when delved into. The quality of SQL code can be greatly improved with a solid understanding of CTEs and their smart usage.

While it may not sound as exciting as other advanced functions, bear in mind that in SQL, the knowledge of foundational elements like CTEs constitutes the bedrock for more complex tasks. Whether you're innovating in a dynamic startup or driving data analytics in a digital marketing agency, understanding these underlying principles will ensure your journey with SQL becomes much smoother and efficient.

Let's get started!

DataDocs offers always up-to-date, automated documentation and cataloging for your database.