Subquery
Subquery Fundamentals
A subquery is a SELECT query that returns a single value and is nested inside
a SELECT, INSERT, UPDATE, or DELETE statement, or inside another subquery. A
subquery can be used anywhere an expression is allowed. In this example a
subquery is used as a column expression named MaxUnitPrice in a SELECT
statement.
SELECT Ord.OrderID, Ord.OrderDate,
(SELECT MAX(OrdDet.UnitPrice)
FROM Northwind.dbo.[Order Details] AS OrdDet
WHERE Ord.OrderID = OrdDet.OrderID) AS MaxUnitPrice
FROM Northwind.dbo.Orders AS Ord
A subquery is also called an inner query or inner select, while the statement
containing a subquery is also called an outer query or outer select.
Many Transact-SQL statements that include subqueries can be alternatively
formulated as joins. Other questions can be posed only with subqueries. In
Transact-SQL, there is usually no performance difference between a statement
that includes a subquery and a semantically equivalent version that does not.
However, in some cases where existence must be checked, a join yields better
performance. Otherwise, the nested query must be processed for each result of
the outer query to ensure elimination of duplicates. In such cases, a join
approach would yield better results. This is an example showing both a subquery
SELECT and a join SELECT that return the same result set:
/* SELECT statement built using a subquery. */
SELECT ProductName
FROM Northwind.dbo.Products
WHERE UnitPrice =
(SELECT UnitPrice
FROM Northwind.dbo.Products
WHERE ProductName = 'Sir Rodney''s Scones')
/* SELECT statement built using a join that returns
the same result set. */
SELECT Prd1.ProductName
FROM Northwind.dbo.Products AS Prd1
JOIN Northwind.dbo.Products AS Prd2
ON (Prd1.UnitPrice = Prd2.UnitPrice)
WHERE Prd2.ProductName = 'Sir Rodney''s Scones'
A subquery nested in the outer SELECT statement has the following components:
- A regular SELECT query including the regular select list
components.
- A regular FROM clause including one or more table or view names.
- An optional WHERE clause.
- An optional GROUP BY clause.
- An optional HAVING clause.
The SELECT query of a subquery is always enclosed in parentheses. It cannot
include a COMPUTE or FOR BROWSE clause, and may only include an ORDER BY clause
when a TOP clause is also specified.
A subquery can be nested inside the WHERE or HAVING clause of an outer
SELECT, INSERT, UPDATE, or DELETE statement, or inside another subquery. Up to
32 levels of nesting is possible, although the limit varies based on available
memory and the complexity of other expressions in the query. Individual queries
may not support nesting up to 32 levels. A subquery can appear anywhere an
expression can be used, if it returns a single value.
If a table appears only in a subquery and not in the outer query, then
columns from that table cannot be included in the output (the select list of the
outer query).
Statements that include a subquery usually take one of these formats:
- WHERE expression [NOT] IN (subquery)
- WHERE expression comparison_operator [ANY | ALL]
(subquery)
- WHERE [NOT] EXISTS (subquery)
In some Transact-SQL statements, the subquery can be evaluated as if it were
an independent query. Conceptually, the subquery results are substituted into
the outer query (although this is not necessarily how Microsoft® SQL Server™
actually processes Transact-SQL statements with subqueries).
There are three basic types of subqueries. Those that:
- Operate on lists introduced with IN, or those that a comparison operator
modified by ANY or ALL.
- Are introduced with an unmodified comparison operator and must return a
single value.
- Are existence tests introduced with EXISTS.
Subquery Rules
A subquery is subject to a number of restrictions:
- The select list of a subquery introduced with a comparison operator can
include only one expression or column name (except that EXISTS and IN operate
on SELECT * or a list, respectively).
- If the WHERE clause of an outer query includes a column name, it must be
join-compatible with the column in the subquery select list.
- The ntext, text and image data types are not allowed
in the select list of subqueries.
- Because they must return a single value, subqueries introduced by an
unmodified comparison operator (one not followed by the keyword ANY or ALL)
cannot include GROUP BY and HAVING clauses.
- The DISTINCT keyword cannot be used with subqueries that include GROUP
BY.
- The COMPUTE and INTO clauses cannot be specified.
- ORDER BY can only be specified if TOP is also specified.
- A view created with a subquery cannot be updated.
- The select list of a subquery introduced with EXISTS by convention
consists of an asterisk (*) instead of a single column name. The rules for a
subquery introduced with EXISTS are identical to those for a standard select
list because a subquery introduced with EXISTS constitutes an existence test
and returns TRUE or FALSE, rather than data.
Qualifying Column Names in
Subqueries
In the following example, the pub_id column in the WHERE clause of the
outer query is implicitly qualified by the table name in the outer query's FROM
clause, publishers. The reference to pub_id in the select list of
the subquery is qualified by the subquery's FROM clause, that is, by the
titles table.
USE pubs
SELECT pub_name
FROM publishers
WHERE pub_id NOT IN
(SELECT pub_id
FROM titles
WHERE type = 'business'
The general rule is that column names in a statement are implicitly qualified
by the table referenced in the FROM clause at the same level.
Here's what the query looks like with these implicit assumptions
specified:
USE pubs
SELECT pub_name
FROM publishers
WHERE publishers.pub_id NOT IN
(SELECT titles.pub_id
FROM titles
WHERE type = 'business')
It is never wrong to state the table name explicitly, and it is always
possible to override implicit assumptions about table names with explicit
qualifications.
Subquery Types
Subqueries can be specified in many places:
- With aliases
- With IN or NOT IN
- In UPDATE, DELETE, and INSERT statements
- With comparison operators
- With ANY, SOME, or ALL
- With EXISTS or NOT EXISTS
- In place of an expression
Subqueries with Aliases
Many statements in which the subquery and the outer query refer to the same
table can be stated as self-joins (joining a table to itself). For example, you
can find authors who live in the same city as Livia Karsen by using a
subquery:
USE pubs
SELECT au_lname, au_fname, city
FROM authors
WHERE city IN
(SELECT city
FROM authors
WHERE au_fname = 'Livia'
AND au_lname = 'Karsen')
Here is the result set:
au_lname au_fname city
---------------------------------------- -------------------- ----------
Green Marjorie Oakland
Straight Dean Oakland
Stringer Dirk Oakland
MacFeather Stearns Oakland
Karsen Livia Oakland
(5 row(s) affected)
Or you can use a self-join:
USE pubs
SELECT au1.au_lname, au1.au_fname, au1.city
FROM authors AS au1 INNER JOIN authors AS au2 ON au1.city = au2.city
AND au2.au_lname = 'Karsen'
AND au2.au_fname = 'Livia'
Table aliases are required because the table being joined to itself appears
in two different roles. Aliases can also be used in nested queries that refer to
the same table in an inner and outer query.
USE pubs
SELECT au1.au_lname, au1.au_fname, au1.city
FROM authors AS au1
WHERE au1.city in
(SELECT au2.city
FROM authors AS au2
WHERE au2.au_fname = 'Livia'
AND au2.au_lname = 'Karsen')
Explicit aliases make it clear that reference to authors in the
subquery does not mean the same thing as the reference in the outer query.
Subqueries with IN
The result of a subquery introduced with IN (or with NOT IN) is a list of
zero or more values. After the subquery returns results, the outer query makes
use of them.
This query finds the names of the publishers who have published business
books.
USE pubs
SELECT pub_name
FROM publishers
WHERE pub_id IN
(SELECT pub_id
FROM titles
WHERE type = 'business')
Here is the result set:
pub_name
----------------------------------------
Algodata Infosystems
New Moon Books
(2 row(s) affected)
This statement is evaluated in two steps. First, the inner query returns the
identification numbers of the publishers that have published business books
(1389 and 0736). Second, these values are substituted into the outer query,
which finds the names that go with the identification numbers in
publishers.
USE pubs
SELECT pub_name
FROM publishers
WHERE pub_id in ('1389', '0736')
One difference in using a join rather than a subquery for this and similar
problems is that the join lets you show columns from more than one table in the
result. For example, if you want to include the titles of the business books in
the result, you must use a join version.
USE pubs
SELECT pub_name, title
FROM publishers INNER JOIN titles ON publishers.pub_id = titles.pub_id
AND type = 'business'
Here is the result set:
pub_name title
---------------------- -------------------------------------------------
Algodata Infosystems The Busy Executive's Database Guide
Algodata Infosystems Cooking with Computers: Surreptitious Balance
Sheets
New Moon Books You Can Combat Computer Stress!
Algodata Infosystems Straight Talk About Computers
(4 row(s) affected)
This query shows the join produces four rows, not two as in the preceding
subquery.
Here is another example of a query that can be formulated with either a
subquery or a join. This query finds the names of all second authors who live in
California and who receive less than 30 percent of the royalties for a book.
USE pubs
SELECT au_lname, au_fname
FROM authors
WHERE state = 'CA'
AND au_id IN
(SELECT au_id
FROM titleauthor
WHERE royaltyper < 30
AND au_ord = 2)
Here is the result set:
au_lname au_fname
---------------------------------------- --------------------
MacFeather Stearns
(1 row(s) affected)
The inner query is evaluated, producing the ID numbers of the three authors
who meet the subquery qualifications. The outer query is then evaluated. Notice
that you can include more than one condition in the WHERE clause of both the
inner and the outer query.
Using a join, the same query is expressed like this:
USE pubs
SELECT au_lname, au_fname
FROM authors INNER JOIN titleauthor ON authors.au_id = titleauthor.au_id
WHERE state = 'CA'
AND royaltyper < 30
AND au_ord = 2
A join can always be expressed as a subquery. A subquery can often, but not
always, be expressed as a join. This is because joins are symmetric: you can
join table A to B in either order and get the same answer. The same is not true
if a subquery is involved.
Subqueries with NOT IN
Subqueries introduced with the keyword NOT IN also return a list of zero or
more values.
This query finds the names of the publishers who have not published business
books.
USE pubs
SELECT pub_name
FROM publishers
WHERE pub_id NOT IN
(SELECT pub_id
FROM titles
WHERE type = 'business')
The query is exactly the same as the one in Subqueries with IN, except that
NOT IN is substituted for IN. However, this statement cannot be converted to a
join. The analogous not-equal join has a different meaning: It finds the names
of publishers who have published some book that is not a business book. For
information about interpreting the meaning of joins not based on equality, see
Joining
Three or More Tables.
Subqueries
in UPDATE, DELETE, and INSERT Statements
Subqueries can be nested in UPDATE, DELETE, and INSERT statements, as well as
in SELECT statements.
The following query doubles the price of all books published by New Moon
Books. The query updates the titles table; its subquery references the
publishers table.
UPDATE titles
SET price = price * 2
WHERE pub_id IN
(SELECT pub_id
FROM publishers
WHERE pub_name = 'New Moon Books')
Here's an equivalent UPDATE statement using a join:
UPDATE titles
SET price = price * 2
FROM titles INNER JOIN publishers ON titles.pub_id = publishers.pub_id
AND pub_name = 'New Moon Books'
You can remove all sales records of business books with this nested
query:
DELETE sales
WHERE title_id IN
(SELECT title_id
FROM titles
WHERE type = 'business')
Here's an equivalent DELETE statement using a join:
DELETE sales
FROM sales INNER JOIN titles ON sales.title_id = titles.title_id
AND type = 'business'
Subqueries with Comparison
Operators
Subqueries can be introduced with one of the comparison operators (=, <
>, >, > =, <, ! >, ! <, or < =).
A subquery introduced with an unmodified comparison operator (a comparison
operator not followed by ANY or ALL) must return a single value rather than a
list of values, like subqueries introduced with IN. If such a subquery returns
more than one value, Microsoft® SQL Server™ displays an error message.
To use a subquery introduced with an unmodified comparison operator, you must
be familiar enough with your data and with the nature of the problem to know
that the subquery will return exactly one value.
For example, if you assume each publisher is located in only one city, and
you want to find the names of authors who live in the city in which Algodata
Infosystems is located, you can write a statement with a subquery introduced
with the simple = comparison operator.
USE pubs
SELECT au_lname, au_fname
FROM authors
WHERE city =
(SELECT city
FROM publishers
WHERE pub_name = 'Algodata Infosystems')
Here is the result set:
au_lname au_fname
-------- --------
Carson Cheryl
Bennet Abraham
(2 row(s) affected)
If, however, Algodata Infosystems was located in multiple cities, then an
error message would result. Instead of the = comparison operator, an IN
formulation could be used (= ANY also works).
Subqueries introduced with unmodified comparison operators often include
aggregate functions, because these return a single value. For example, this
statement finds the names of all books priced higher than the current minimum
price.
USE pubs
SELECT DISTINCT title
FROM titles
WHERE price >
(SELECT MIN(price)
FROM titles)
Here is the result set:
title
----------------------------------------------------------
But Is It User Friendly?
Computer Phobic and Non-Phobic Individuals: Behavior Variations
Cooking with Computers: Surreptitious Balance Sheets
Emotional Security: A New Algorithm
Fifty Years in Buckingham Palace Kitchens
Is Anger the Enemy?
Life Without Fear
Onions, Leeks, and Garlic: Cooking Secrets of the Mediterranean
Prolonged Data Deprivation: Four Case Studies
Secrets of Silicon Valley
Silicon Valley Gastronomic Treats
Straight Talk About Computers
Sushi, Anyone?
The Busy Executive's Database Guide
(14 row(s) affected)
Because subqueries introduced with unmodified comparison operators must
return a single value, they cannot include GROUP BY or HAVING clauses unless you
know the GROUP BY or HAVING clause itself returns a single value. For example,
this query finds the books priced higher than the lowest priced book that has a
type 'trad_cook'.
USE pubs
SELECT DISTINCT title
FROM titles
WHERE price >
(SELECT MIN(price)
FROM titles
GROUP BY type
HAVING type = 'trad_cook')
Here is the result set:
title
------------------------------------------------------------------------
But Is It User Friendly?
Computer Phobic AND Non-Phobic Individuals: Behavior Variations
Onions, Leeks, and Garlic: Cooking Secrets of the Mediterranean
Prolonged Data Deprivation: Four Case Studies
Secrets of Silicon Valley
Silicon Valley Gastronomic Treats
Straight Talk About Computers
Sushi, Anyone?
The Busy Executive's Database Guide
(9 row(s) affected)
Comparison
Operators Modified by ANY, SOME, or ALL
Comparison operators that introduce a subquery can be modified by the
keywords ALL or ANY. SOME is an SQL-92 standard equivalent for ANY.
Subqueries introduced with a modified comparison operator return a list of
zero or more values and can include a GROUP BY or HAVING clause. These
subqueries can be restated with EXISTS.
Using the > comparison operator as an example, >ALL means greater than
every value--in other words, greater than the maximum value. For example,
>ALL (1, 2, 3) means greater than 3. >ANY means greater than at least one
value, that is, greater than the minimum. So >ANY (1, 2, 3) means greater
than 1.
For a row in a subquery with >ALL to satisfy the condition specified in
the outer query, the value in the column introducing the subquery must be
greater than each value in the list of values returned by the subquery.
Similarly, >ANY means that for a row to satisfy the condition specified in
the outer query, the value in the column that introduces the subquery must be
greater than at least one of the values in the list of values returned by the
subquery.
Note This example can be run many different
ways, as long as the inner query returns only one value.
USE pubs
-- Option 1 using MAX in the inner query
SELECT title
FROM titles
HAVING MAX(advance) > ALL
WHERE advance > ALL
(
SELECT MAX(advance)
FROM publishers INNER JOIN titles ON
titles.pub_id = publishers.pub_id
WHERE pub_name = 'Algodata Infosystems'
)
-- Option 2 using GROUP BY and HAVING and no ALL
USE pubs
SELECT title
FROM titles
GROUP BY title
HAVING MAX(advance) >
(
SELECT MAX(advance)
FROM publishers INNER JOIN titles ON
titles.pub_id = publishers.pub_id
WHERE pub_name = 'Algodata Infosystems'
)
The following query provides an example of a subquery introduced with a
comparison operator modified by ANY. It finds the titles that received an
advance larger than the minimum advance amount paid by Algodata Infosystems.
USE pubs
SELECT title
FROM titles
WHERE advance > ANY
(SELECT advance
FROM publishers INNER JOIN titles
ON titles.pub_id = publishers.pub_id
AND pub_name = 'Algodata Infosystems')
Here is the result set:
title
---------------------------------------------------------------
You Can Combat Computer Stress!
The Gourmet Microwave
But Is It User Friendly?
Secrets of Silicon Valley
Computer Phobic and Non-Phobic Individuals: Behavior Variations
Life Without Fear
Onions, Leeks, and Garlic: Cooking Secrets of the Mediterranean
Sushi, Anyone?
(8 row(s) affected)
For each title, the inner query finds a list of advance amounts paid by
Algodata. The outer query looks at all values in the list and determines whether
the title currently being considered has commanded an advance larger than any of
those amounts. In other words, it finds titles with advances as large or larger
than the lowest value paid by Algodata.
If the subquery does not return any values, the entire query fails to return
any values.
The =ANY operator is equivalent to IN. For example, to find authors who live
in the same city as a publisher, you can use either IN or =ANY.
USE pubs
SELECT au_lname, au_fname
FROM authors
WHERE city IN
(SELECT city
FROM publishers)
USE pubs
SELECT au_lname, au_fname
FROM authors
WHERE city = ANY
(SELECT city
FROM publishers)
Here is the result set for either query:
au_lname au_fname
-------- ---------
Carson Cheryl
Bennet Abraham
(2 row(s) affected)
The < >ANY operator, however, differs from NOT IN: < >ANY means
not = a, or not = b, or not = c. NOT IN means not = a, and not = b, and not = c.
<>ALL means the same as NOT IN.
For example, this query finds the authors who live in a city in which no
publisher is located.
USE pubs
SELECT au_lname, au_fname
FROM authors
WHERE city <> ANY
(SELECT city
FROM publishers)
Here is the result set:
au_lname au_fname
---------------------------------------- --------------------
White Johnson
Green Marjorie
Carson Cheryl
O'Leary Michael
Straight Dean
Smith Meander
Bennet Abraham
Della Buena Ann
Gringlesby Burt
Locksley Charlene
Greene Morningstar
Blotchet-Halls Reginald
Yokomoto Akiko
del Covello Innes
DeFrance Michel
Stringer Dirk
MacFeather Stearns
Karsen Livia
Panteley Sylvia
Hunter Sheryl
McBadden Heather
Ringer Anne
Ringer Albert
(23 row(s) affected)
The results include all 23 authors because every author lives in a city in
which one or more of the publishers is not located. The inner query finds all
the cities in which publishers are located, and then, for each city, the outer
query finds the authors who don't live there.
However, when you use NOT IN in this query, the results include all the
authors except Cheryl Carson and Abraham Bennet, who live in Berkeley, where
Algodata Infosystems is located.
USE pubs
SELECT au_lname, au_fname
FROM authors
WHERE city NOT IN
(SELECT city
FROM publishers)
Here is the result set:
au_lname au_fname
---------------------------------------- --------------------
White Johnson
Green Marjorie
O'Leary Michael
Straight Dean
Smith Meander
Della Buena Ann
Gringlesby Burt
Locksley Charlene
Greene Morningstar
Blotchet-Halls Reginald
Yokomoto Akiko
del Covello Innes
DeFrance Michel
Stringer Dirk
MacFeather Stearns
Karsen Livia
Panteley Sylvia
Hunter Sheryl
McBadden Heather
Ringer Anne
Ringer Albert
(21 row(s) affected)
You can get the same results with the < >ALL operator, which is
equivalent to NOT IN.
USE pubs
SELECT au_lname, au_fname
FROM authors
WHERE city <> ALL
(SELECT city
FROM publishers)
Subqueries with EXISTS
When a subquery is introduced with the keyword EXISTS, it functions as an
existence test. The WHERE clause of the outer query tests for the existence of
rows returned by the subquery. The subquery does not actually produce any data;
it returns a value of TRUE or FALSE.
A subquery introduced with EXISTS has the following syntax:
WHERE [NOT] EXISTS (subquery)
This query finds the names of all publishers who publish business books:
USE pubs
SELECT pub_name
FROM publishers
WHERE EXISTS
(SELECT *
FROM titles
WHERE pub_id = publishers.pub_id
AND type = 'business')
Here is the result set:
pub_name
--------------------
New Moon Books
Algodata Infosystems
(2 row(s) affected)
To determine the results of this query, consider each publisher's name in
turn. Does this value cause the subquery to return at least one row? In other
words, does it cause the existence test to evaluate to TRUE?
In this case, the first publisher name is Algodata Infosystems, with
identification number 1389. Are there any rows in the titles table in
which pub_id is 1389 and type is business? If so, Algodata
Infosystems should be one of the values selected. The same process is repeated
for each of the other publisher names.
Notice that subqueries introduced with EXISTS are a bit different from other
subqueries in these ways:
- The keyword EXISTS is not preceded by a column name, constant, or other
expression.
- The select list of a subquery introduced by EXISTS almost always consists
of an asterisk (*). There is no reason to list column names because you are
simply testing for the existence of rows that meet the conditions specified in
the subquery.
The EXISTS keyword is important because often there is no alternative,
nonsubquery formulation. Although some queries formulated with EXISTS cannot be
expressed any other way, all queries that use IN or a comparison operator
modified by ANY or ALL can be expressed with EXISTS.
Examples of queries using EXISTS and equivalent alternatives follow.
Here are two ways to find authors who live in the same city as a
publisher:
USE pubs
SELECT au_lname, au_fname
FROM authors
WHERE city =ANY
(SELECT city
FROM publishers)
-- Or
USE pubs
SELECT au_lname, au_fname
FROM authors
WHERE exists
(SELECT *
FROM publishers
WHERE authors.city = publishers.city)
Here is the result set for either query:
au_lname au_fname
-------- --------
Carson Cheryl
Bennet Abraham
(2 row(s) affected)
These two queries find titles of books published by any publisher located in
a city that begins with the letter B:
USE pubs
SELECT title
FROM titles
WHERE pub_id IN
(SELECT pub_id
FROM publishers
WHERE city LIKE 'B%')
-- Or
USE pubs
SELECT title
FROM titles
WHERE EXISTS
(SELECT *
FROM publishers
WHERE pub_id = titles.pub_id
AND city LIKE 'B%')
Here is the result set for either query:
title
----------------------------------------------------
The Busy Executive's Database Guide
Cooking with Computers: Surreptitious Balance Sheets
You Can Combat Computer Stress!
Straight Talk About Computers
But Is It User Friendly?
Secrets of Silicon Valley
Net Etiquette
Is Anger the Enemy?
Life Without Fear
Prolonged Data Deprivation: Four Case Studies
Emotional Security: A New Algorithm
(11 row(s) affected)
Subqueries with NOT EXISTS
NOT EXISTS works like EXISTS, except the WHERE clause in which it is used is
satisfied if no rows are returned by the subquery.
For example, to find the names of publishers who do not publish business
books:
USE pubs
SELECT pub_name
FROM publishers
WHERE NOT EXISTS
(SELECT *
FROM titles
WHERE pub_id = publishers.pub_id
AND type = 'business')
Here is the result set:
pub_name
----------------------------------------
Binnet & Hardley
Five Lakes Publishing
Ramona Publishers
GGG&G
Scootney Books
Lucerne Publishing
(6 row(s) affected)
This query finds the titles for which there have been no sales.
USE pubs
SELECT title
FROM titles
WHERE NOT EXISTS
(SELECT title_id
FROM sales
WHERE title_id = titles.title_id)
Here is the result set:
title
----------------------------------
The Psychology of Computer Cooking
Net Etiquette
(2 row(s) affected)
Using
EXISTS and NOT EXISTS to Find Intersection and Difference
Subqueries introduced with EXISTS and NOT EXISTS can be used for two
set-theory operations: intersection and difference. The intersection of two sets
contains all elements that belong to both of the original sets. The difference
contains elements that belong only to the first of the two sets.
The intersection of authors and publishers over the city
column is the set of cities in which both an author and a publisher are
located.
USE pubs
SELECT DISTINCT city
FROM authors
WHERE EXISTS
(SELECT *
FROM publishers
WHERE authors.city = publishers.city)
Here is the result set:
city
--------
Berkeley
(1 row(s) affected)
Of course, this query could be written as a simple join.
USE pubs
SELECT DISTINCT authors.city
FROM authors INNER JOIN publishers
ON authors.city = publishers.city
The difference between authors and publishers over the
city column is the set of cities where an author lives but no publisher
is located, that is, all the cities except Berkeley.
USE pubs
SELECT DISTINCT city
FROM authors
WHERE NOT EXISTS
(SELECT *
FROM publishers
WHERE authors.city = publishers.city)
This query could also be written as:
USE pubs
SELECT DISTINCT city
FROM authors
WHERE city NOT IN
(SELECT city
FROM publishers)
Subqueries Used in
Place of an Expression
In Transact-SQL, a subquery can be substituted anywhere an expression can be
used in SELECT, UPDATE, INSERT, and DELETE statements, except in an ORDER BY
list.
The following example illustrates how you might use this enhancement. This
query finds the price of a popular computer book, the average price of all
books, and the difference between the price of the book and the average price of
all books.
USE pubs
SELECT title, price,
(SELECT AVG(price) FROM titles) AS average,
price-(SELECT AVG(price) FROM titles) AS difference
FROM titles
WHERE type='popular_comp'
Here is the result set:
title price average difference
------------------------ -------------- -------------- ----------------
But Is It User Friendly? 22.95 14.77 8.18
Secrets of Silicon Valley 20.00 14.77 5.23
Net Etiquette (null) 14.77 (null)
(3 row(s) affected)
Multiple Levels of Nesting
A subquery can itself include one or more subqueries. Any number of
subqueries can be nested in a statement.
This query finds the names of authors who have participated in writing at
least one popular computer book.
USE pubs
SELECT au_lname, au_fname
FROM authors
WHERE au_id IN
(SELECT au_id
FROM titleauthor
WHERE title_id IN
(SELECT title_id
FROM titles
WHERE type = 'popular_comp'))
Here is the result set:
au_lname au_fname
---------------------------------------- --------------------
Carson Cheryl
Dull Ann
Locksley Charlene
Hunter Sheryl
(4 row(s) affected)
The innermost query returns the title ID numbers PC1035, PC8888, and PC9999.
The query at the next higher level is evaluated with these title IDs and returns
the author ID numbers. Finally, the outer query uses the author IDs to find the
names of the authors.
You can also express this query as a join:
USE pubs
SELECT au_lname, au_fname
FROM authors INNER JOIN titleauthor ON authors.au_id = titleauthor.au_id
JOIN titles ON titleauthor.title_id = titles.title_id
WHERE type = 'popular_comp'
Correlated Subqueries
Many queries can be evaluated by executing the subquery once and substituting
the resulting value or values into the WHERE clause of the outer query. In
queries that include a (also known as a repeating subquery),
the subquery depends on the outer query for its values. This means that the
subquery is executed repeatedly, once for each row that might be selected by the
outer query.
This query finds the names of all authors who earn 100 percent of the shared
royalty (royaltyper) on a book.
USE pubs
SELECT au_lname, au_fname
FROM authors
WHERE 100 IN
(SELECT royaltyper
FROM titleauthor
WHERE titleauthor.au_ID = authors.au_id)
Here is the result set:
au_lname au_fname
---------------------------------------- --------------------
White Johnson
Green Marjorie
Carson Cheryl
Straight Dean
Locksley Charlene
Blotchet-Halls Reginald
del Castillo Innes
Panteley Sylvia
Ringer Albert
(9 row(s) affected)
Unlike most of the subqueries shown earlier, the subquery in this statement
cannot be resolved independently of the main query. It needs a value for
authors.au_id, but this value is a variable. It changes as
Microsoft® SQL Server™ examines different rows of the authors table.
That is exactly how this query is evaluated: SQL Server considers each row of
the authors table for inclusion in the results by substituting the value
in each row into the inner query. For example, if SQL Server first examines the
row for Cheryl Carson, the variable authors.au_id takes the value
238-95-7766, which SQL Server substitutes into the inner query.
USE pubs
SELECT royaltyper
FROM titleauthor
WHERE au_id = '238-95-7766'
The result is 100, so the outer query evaluates to:
USE pubs
SELECT au_lname, au_fname
FROM authors
WHERE 100 IN (100)
Because this is true, the row for Cheryl Carson is included in the results.
Go through the same procedure with the row for Abraham Bennet; you'll see that
this row is not included in the results.
Correlated Subqueries with
Aliases
Correlated subqueries can be used in operations such as selecting data from a
table referenced in the outer query. In this case a table alias (also called a
correlation name) must be used to specify unambiguously which table reference to
use. For example, you can use a correlated subquery to find the types of books
published by more than one publisher. Aliases are required to distinguish the
two different roles in which the titles table appears.
USE pubs
SELECT DISTINCT t1.type
FROM titles t1
WHERE t1.type IN
(SELECT t2.type
FROM titles t2
WHERE t1.pub_id <> t2.pub_id)
Here is the result set:
type
----------
business
psychology
(2 row(s) affected)
The preceding nested query is equivalent to this self-join:
USE pubs
SELECT DISTINCT t1.type
FROM titles t1 INNER JOIN titles t2 ON t1.type = t2.type
AND t1.pub_id <> t2.pub_id
Correlated
Subqueries with Comparison Operators
Use a correlated subquery with a comparison operator to find sales where the
quantity is less than the average quantity for sales of that title.
USE pubs
SELECT s1.ord_num, s1.title_id, s1.qty
FROM sales s1
WHERE s1.qty <
(SELECT AVG(s2.qty)
FROM sales s2
WHERE s2.title_id = s1.title_id)
Here is the result set:
ord_num title_id qty
-------------------- -------- ------
6871 BU1032 5
722a PS2091 3
D4482 PS2091 10
N914008 PS2091 20
423LL922 MC3021 15
(5 row(s) affected)
The outer query selects the rows of sales (that is, of s1) one
by one. The subquery calculates the average quantity for each sale being
considered for selection in the outer query. For each possible value of
s1, Microsoft® SQL Server™ evaluates the subquery and puts the record
being considered in the results if the quantity is less than the calculated
average.
Sometimes a correlated subquery mimics a GROUP BY clause. This example finds
all titles that have a price greater than the average for books of its type.
USE pubs
SELECT t1.type, t1.title
FROM titles t1
WHERE t1.price >
(SELECT AVG(t2.price)
FROM titles t2
WHERE t1.type = t2.type)
Here is the result set:
type title
------------ -----------------------------------------------------------
business The Busy Executive's Database Guide
business Straight Talk About Computers
mod_cook Silicon Valley Gastronomic Treats
popular_comp But Is It User Friendly?
psychology Computer Phobic AND Non-Phobic Individuals: Behavior
Variations
psychology Prolonged Data Deprivation: Four Case Studies
trad_cook Onions, Leeks, and Garlic: Cooking Secrets of the
Mediterranean
(7 row(s) affected)
For each possible value of t1, SQL Server evaluates the subquery and
includes the row in the results if the price value of that row is greater
than the calculated average. It is not necessary to group by type explicitly,
because the rows for which average price is calculated are restricted by the
WHERE clause in the subquery.
Correlated Subqueries
in a HAVING Clause
A correlated subquery can also be used in the HAVING clause of an outer
query. This construction can be used to find the types of books for which the
maximum advance is more than twice the average within a given group.
In this case, the subquery is evaluated once for each group defined in the
outer query (once for each type of book).
USE pubs
SELECT t1.type
FROM titles t1
GROUP BY t1.type
HAVING MAX(t1.advance) >=ALL
(SELECT 2 * AVG(t2.advance)
FROM titles t2
WHERE t1.type = t2.type)
Here is the result set:
type
--------
mod_cook
(1 row(s) affected)
Conditional Data Processing
Using CASE
The CASE function is used to evaluate several conditions and return a single
value for each condition. A common use of the CASE function is to replace codes
or abbreviations with more readable values. The following query uses the CASE
function to rename book categories so that they are more understandable.
USE pubs
SELECT
CASE type
WHEN 'popular_comp' THEN 'Popular Computing'
WHEN 'mod_cook' THEN 'Modern Cooking'
WHEN 'business' THEN 'Business'
WHEN 'psychology' THEN 'Psychology'
WHEN 'trad_cook' THEN 'Traditional Cooking'
ELSE 'Not yet categorized'
END AS Category,
CONVERT(varchar(30), title) AS "Shortened Title",
price AS Price
FROM titles
WHERE price IS NOT NULL
ORDER BY 1
Here is the result set:
category shortened title Price
------------------- ------------------------------ -------
Business Cooking with Computers: Surrep 11.95
Business Straight Talk About Computers 19.99
Business The Busy Executive's Database 19.99
Business You Can Combat Computer Stress 2.99
Modern Cooking Silicon Valley Gastronomic Tre 19.99
Modern Cooking The Gourmet Microwave 2.99
Popular Computing But Is It User Friendly? 22.95
Popular Computing Secrets of Silicon Valley 20.00
Psychology Computer Phobic AND Non-Phobic 21.59
Psychology Emotional Security: A New Algo 7.99
Psychology Is Anger the Enemy? 10.95
Psychology Life Without Fear 7.00
Psychology Prolonged Data Deprivation: Fo 19.99
Traditional Cooking Fifty Years in Buckingham Pala 11.95
Traditional Cooking Onions, Leeks, and Garlic: Coo 20.95
Traditional Cooking Sushi, Anyone? 14.99
(16 row(s) affected)
Another use of CASE is to categorize data. The following query uses the CASE
function to categorize prices.
SELECT
CASE
WHEN price IS NULL THEN 'Not yet priced'
WHEN price < 10 THEN 'Very Reasonable Title'
WHEN price >= 10 and price < 20 THEN 'Coffee Table Title'
ELSE 'Expensive book!'
END AS "Price Category",
CONVERT(varchar(20), title) AS "Shortened Title"
FROM pubs.dbo.titles
ORDER BY price
Here is the result set:
Price Category Shortened Title
--------------------- --------------------
Not yet priced The Psychology of Co
Not yet priced Net Etiquette
Very Reasonable Title You Can Combat Compu
Very Reasonable Title The Gourmet Microwav
Very Reasonable Title Life Without Fear
Very Reasonable Title Emotional Security:
Coffee Table Title Is Anger the Enemy?
Coffee Table Title Cooking with Compute
Coffee Table Title Fifty Years in Bucki
Coffee Table Title Sushi, Anyone?
Coffee Table Title The Busy Executive's
Coffee Table Title Straight Talk About
Coffee Table Title Silicon Valley Gastr
Coffee Table Title Prolonged Data Depri
Expensive book! Secrets of Silicon V
Expensive book! Onions, Leeks, and G
Expensive book! Computer Phobic AND
Expensive book! But Is It User Frien
(18 row(s) affected)