Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

Finding Code Smells Using SQL Prompt: Old-Style Join Syntax (ST001)

DZone's Guide to

Finding Code Smells Using SQL Prompt: Old-Style Join Syntax (ST001)

There are no advantages to leaving old-style join syntax in legacy code. If you spot this code smell, you should rewrite the statements to use ANSI-standard join syntax.

· Database Zone ·
Free Resource

New whitepaper: Database DevOps – 6 Tips for Achieving Continuous Delivery. Discover 6 tips for continuous delivery with Database DevOps in this new whitepaper from Redgate. In 9 pages, it covers version control for databases and configurations, branching and testing, automation, using NuGet packages, and advice for how to start a pioneering Database DevOps project. Also includes further research on the industry-wide state of Database DevOps, how application and database development compare, plus practical steps for bringing DevOps to your database. Read it now free.

SQL Prompt implements a static code analysis rule, ST001, which will check code automatically during development and testing work for occurrences of non-ANSI standard JOIN syntax.

The "old style" Microsoft/Sybase JOIN style for SQL, which uses the =* and *= syntax, has been deprecated and is no longer used. Queries that use this syntax will fail when the database engine level is 10 (SQL Server 2008) or later (compatibility level 100). The ANSI-89 table citation list (FROM tableA, tableB) is still ISO standard for INNER JOINs only. Neither of these styles is worth using. It is always better to specify the type of join you require, INNER, LEFTOUTER, RIGHTOUTER, FULL OUTER, and CROSS, which has been standard since ANSI SQL-92 was published. While you can choose any supported JOIN style without affecting the query plan used by SQL Server, using the ANSI-standard syntax will make your code easier to understand, more consistent, and portable to other relational database systems.

Old-Style Outer Joins Are Obsolete

When SQL Server forked from Sybase, it inherited its old non-standard Transact-SQL syntax for joins, which included the and = syntax, for left and right outer joins, respectively.

The left outer join operator, =, selected from the first table (the "outer member" of the outer join) all rows that met the statement's restrictions. The second table ("inner member") generated values only if there is a match on the join condition for that row; otherwise, it provided null values. Conversely, for the right outer join operator, =, the second table became the "outer member," from which all rows were selected that met the criteria.

There were restrictions on this syntax, even when they were supported. You couldn't include a Transact-SQL outer join in a HAVING clause, and you couldn't do an additional INNER JOIN in the same expression as an old-style outer join. In addition, the outer join syntax (= or =) does not always give the correct results, sometimes using a cross join when an outer join is specified.

In any event, this syntax was deprecated in SQL Server 2005 onwards, and stopped working is SQL Server 2008. The intent of Listing 1, which queries the pubs database, is to return all titles that don't have a corresponding author.

--all titles without an author
SELECT ti.title + Coalesce( ' (' + ti.type + ') -' + ti.pub_id,' (Unknown category)') AS publication
  FROM dbo.titles AS ti, dbo.titleauthor AS Ta
     where ti.title_id *= Ta.title_id
     AND Ta.title_id IS NULL;

Listing 1

However, it will fail in SQL Server 2008 or later unless you set the compatibility level to 90. This setting was only possible up to SQL Server 2012:

Msg 102, Level 15, State 1, Line 6
Incorrect syntax near '*='.

If you still have queries that use this syntax, you'll have to rewrite them to use the ANSI standard syntax, shown in Listing 2, before upgrading beyond SQL Server 2012, because the compatibility level 90 is no longer supported.

--all titles without an author
SELECT ti.title + Coalesce( ' (' + ti.type + ') -' + ti.pub_id,' (Unknown category)') AS publication
  FROM dbo.titles AS ti, dbo.titleauthor AS Ta
     where ti.title_id *= Ta.title_id
     AND Ta.title_id IS NULL;

Listing 2

This gives the following result:

Old-Style Inner Joins Are Supported but Offer No Advantages

This table citation syntax for inner joins is part of the ANSI standard and so still supported. Listing 3 uses it and will return all authors that live in the same city as their publisher.

  --(Old Syntax)authors that live in the same city as their publishers
SELECT 
  authors.au_fname + ' ' + authors.au_lname AS Author, 
  publishers.pub_name AS Publisher, publishers.city
  FROM dbo.authors, dbo.titleauthor, dbo.titles, dbo.publishers
  WHERE authors.au_id = titleauthor.au_id 
    AND titleauthor.title_id = titles.title_id 
    AND titles.pub_id = publishers.pub_id 
    AND publishers.city = authors.city

Listing 3

Listing 4 shows the same code using the ANSI-92 standard, where we make the join types explicit.

--(Newer Syntax) authors that live in the same city as their publishers
SELECT 
  authors.au_fname+ ' '+authors.au_lname AS Author, 
  publishers.pub_name AS Publisher, publishers.city
  FROM dbo.authors
    INNER JOIN dbo.titleauthor
      ON authors.au_id = titleauthor.au_id
    INNER JOIN dbo.titles
      ON titleauthor.title_id = titles.title_id
    INNER JOIN dbo.publishers
      ON titles.pub_id = publishers.pub_id
  WHERE publishers.city = authors.city

Listing 4

Both give the same result, with identical execution plans.

However, it is broadly accepted that the old-style "citation list" inner join syntax is much harder to read and understand, and therefore more error-prone than the newer syntax.

In any case, there is no cause for regret at leaving behind this old-style syntax, even though it is still supported. How, for example, would you determine the percentage of authors that live in different cities from their publishers, using the old style join syntax? It would be a complex-looking query using brackets and subqueries. With the newer syntax, it is simple to write, simple to understand its logic.

--proportion of authors who live in the same city as their publishers
SELECT 
  Sum(CASE WHEN publishers.city IS NULL THEN 1 ELSE 0 END) AS [different city],
  Count(*) AS total, 
  (Sum(CASE WHEN publishers.city IS NULL THEN 1 ELSE 0 END) * 100)
      / Count(*) AS percentage
  FROM dbo.authors
    INNER JOIN dbo.titleauthor
      ON authors.au_id = titleauthor.au_id
    INNER JOIN dbo.titles
      ON titleauthor.title_id = titles.title_id
    LEFT OUTER JOIN dbo.publishers
      ON titles.pub_id = publishers.pub_id AND publishers.city = authors.city

Listing 5

Conclusion

There are no advantages to leaving old-style join syntax in legacy code. If you spot this code smell, it will improve and simplify the code to rewrite the statements to use ANSI-standard join syntax.

New whitepaper: Database DevOps – 6 Tips for Achieving Continuous Delivery. Discover 6 tips for continuous delivery with Database DevOps in this new whitepaper from Redgate. In 9 pages, it covers version control for databases and configurations, branching and testing, automation, using NuGet packages, and advice for how to start a pioneering Database DevOps project. Also includes further research on the industry-wide state of Database DevOps, how application and database development compare, plus practical steps for bringing DevOps to your database. Read it now free.

Topics:
sql ,code smells ,sql prompt ,code analysis ,database ,tutorial

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}