Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

A View Is Not a Table

DZone's Guide to

A View Is Not a Table

Don’t shy away from using views, just understand what their real behavior is. A view is a query, not a table. Read on for the full explanation, complete with examples.

· Database Zone
Free Resource

Whether you work in SQL Server Management Studio or Visual Studio, Redgate tools integrate with your existing infrastructure, enabling you to align DevOps for your applications with DevOps for your SQL Server databases. Discover true Database DevOps, brought to you in partnership with Redgate.

In SQL Server, in the T-SQL you use to query it, a view looks just like a table (I’m using the AdventureWorks2014 database for all these examples):

SELECT  *
FROM    Production.vProductAndDescription AS vpad;

 

SELECT  vpad.Name,
        vpad.Description,
        vpmi.Instructions
FROM    Production.vProductAndDescription AS vpad
JOIN    Production.Product AS p
        ON p.ProductID = vpad.ProductID
JOIN    Production.vProductModelInstructions AS vpmi
        ON vpmi.ProductModelID = p.ProductModelID
WHERE   vpad.ProductID = 891
        AND vpad.CultureID = 'fr';

The above query actually combines two views and a table. This is what is commonly referred to as a "code smell." A code smell is a coding practice that works but that can lead to problems. In this case, we’re talking about performance problems. The performance problems when using views to join to tables and other views as if they were real tables comes about because a standard view is not a table. It's a query. For example, the second view introduced, vPorductModelInstructions looks like this:

ALTER VIEW [Production].[vProductModelInstructions] 
AS 
SELECT 
    [ProductModelID] 
    ,[Name] 
    ,[Instructions].value(N'declare default element namespace "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/ProductModelManuInstructions"; 
        (/root/text())[1]', 'nvarchar(max)') AS [Instructions] 
    ,[MfgInstructions].ref.value('@LocationID[1]', 'int') AS [LocationID] 
    ,[MfgInstructions].ref.value('@SetupHours[1]', 'decimal(9, 4)') AS [SetupHours] 
    ,[MfgInstructions].ref.value('@MachineHours[1]', 'decimal(9, 4)') AS [MachineHours] 
    ,[MfgInstructions].ref.value('@LaborHours[1]', 'decimal(9, 4)') AS [LaborHours] 
    ,[MfgInstructions].ref.value('@LotSize[1]', 'int') AS [LotSize] 
    ,[Steps].ref.value('string(.)[1]', 'nvarchar(1024)') AS [Step] 
    ,[rowguid] 
    ,[ModifiedDate]
FROM [Production].[ProductModel] 
CROSS APPLY [Instructions].nodes(N'declare default element namespace "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/ProductModelManuInstructions"; 
    /root/Location') MfgInstructions(ref)
CROSS APPLY [MfgInstructions].ref.nodes('declare default element namespace "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/ProductModelManuInstructions"; 
    step') Steps(ref);

GO

That’s a query against the XML stored in the ProductModel table. The view was created to mask the complexity of the necessary XPath code while providing a mechanism for retrieving the data from the XML. This is a common use of views. However, when we then treat the view as a table and join it to other tables and views, we present a problem for the optimizer. Because a view is not a table but is instead a query, the optimizer has to resolve this query in combination with any other views or tables to arrive at an execution plan for the whole combined mess. While the optimizer is very good at what it does, because of the complexity caused by the additional unnecessary processing to figure out which parts of the view is not needed to satisfy the query, it can make poor choices. That can result in poor performance.

If I were to rewrite the query, it would look something like this:

SELECT  p.Name,
        pd.Description,
        pm.Instructions.value(N'declare default element namespace "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/ProductModelManuInstructions"; 
        (/root/text())[1]', 'nvarchar(max)') AS Instructions
FROM    Production.Product AS p
JOIN    Production.ProductModelProductDescriptionCulture AS pmpdc
        ON pmpdc.ProductModelID = p.ProductModelID
JOIN    Production.ProductDescription AS pd
        ON pd.ProductDescriptionID = pmpdc.ProductDescriptionID
JOIN    Production.ProductModel AS pm
        ON pm.ProductModelID = p.ProductModelID
CROSS APPLY Instructions.nodes(N'declare default element namespace "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/ProductModelManuInstructions"; 
    /root/Location') MfgInstructions (ref)
CROSS APPLY MfgInstructions.ref.nodes('declare default element namespace "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/ProductModelManuInstructions"; 
    step') Steps (ref)
WHERE   p.ProductID = 891
        AND pmpdc.CultureID = 'fr';

That’s a lot more complex than the query we had above that only referenced three objects and had only two JOIN operations. However, if you capture the I/O and the execution time for these queries, you’ll see a different story.

I used two methods for measuring performance. I used SET STATISTICS IO and SET STATISTICS TIME to ON for the queries for one set of tests. For another set, I used Extended Events. Consistently the execution time for the query with the view was around 110ms. The query that didn’t reference any views was around 37ms. The reads were 155 for the query with views, but only 109 for the query without. If you look at the individual table I/O, you can start to see where the differences come from. These are the results from the query with the views:

Image title

These are the results for the query without the view:

Table 'ProductDescription'. Scan count 0, logical reads 56
Table 'xml_index_nodes_418100530_256001'. Scan count 13, logical reads 37
Table 'ProductModelProductDescriptionCulture'. Scan count 6, logical reads 12
Table 'ProductModel'. Scan count 0, logical reads 2
Table 'Product'. Scan count 0, logical reads 2

You can see the differences in both ProductModelProductDescriptionCulture and Product. This is because of the differences in the execution plans caused by differences in the choices made by the optimizer.

A standard view is not a table. There is such a thing as a materialized view or indexed view, which is a table. That’s not what we’re talking about here. While you can use a view as if it was a table, don’t mistake it for a table. A view is just a mask in front of a query. It can’t be used like an object so that you avoid rewriting the same JOIN. That will lead to issues for the optimizer as this simple set of examples showed. Don’t shy away from using views, just understand what their real behavior is. A view is a query, not a table.

It’s easier than you think to extend DevOps practices to SQL Server with Redgate tools. Discover how to introduce true Database DevOps, brought to you in partnership with Redgate

Topics:
sql ,view ,table ,database

Published at DZone with permission of Grant Fritchey, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}