| Structured Query Language SELECT: Set Operations | SELECT: Case Expression |
Tables, views, and results of SELECT commands are in somewhat similar to sets of set theory. In this comparison the elements of sets correspond to rows of tables, views, and SELECT results. The differences between set theory and the itemized SQL constructs are:
The comparison between set theory and SQL goes even further. In SQL, we have operations which act onmultisets in the sense of set theory: The SQL operations UNION, INTERSECT, and EXCEPT (some name it MINUS) process intermediatemultisets generated by differents SELECT commands. The operations expect themultisets are of the same type. This means mainly that theymust have the same number of columns. Also, their data type should correlate, but this is not mandatory. If they do not, the DBMS will cast them to a common data type - if possible.
The UNION operation pushes the result of several SELECT commands together. The result of the UNION contains those values, which are in the first or the second intermediate result.
-- Please consider that this is only one command (only ONE semicolon at the very end)SELECTfirstname-- first SELECT commandFROMpersonUNION-- push both intermediate results together to one resultSELECTlastname-- second SELECT commandFROMperson;

This is a single SQL command. It consists of two SELECTs and one UNION operation. The SELECTs are evaluated first. Afterward, their results are pushed together to one single result. In our example, the result contains all lastnames and firstnames in a single column (our example may be of limited help in praxis, it's only a demonstration for the UNION).
DISTINCT / ALL
If we examine the result carefully, we will notice that it consists only of 17 values. The tableperson contains ten rows so that we probably expect twenty values in the result. If we perform the 'SELECT firstname ...' and 'SELECT lastname ...' as separate commands without the UNION, we receive for both commands ten values. The explanation for the three missing values is the UNION command. By default, UNION removes duplicates. Therefore some of the intermediate values are skipped. If we want to obtain the duplicate values we have to modify the UNION operation. Its behavior can be changed with one of the two keywords DISTINCT or ALL. DISTINCT is the default, and it removes duplicate values as we have seen before. ALL will retain all values, including duplicates.
-- remove (that's the default) or keep duplicatesSELECT...UNION[DISTINCT|ALL]SELECT...[-- it is possible to 'UNION' more than 2 intermediate resultsUNION[DISTINCT|ALL]SELECT...];
A hint for Oracle users: The use of the keyword DISTINCT, which is the default, is not accepted by Oracle. Omit it.
General hint
In most cases, the UNION combines SELECT commands on different tables or different columns of the same table. SELECT commands on the same column of a single table usually use the WHERE clause in combination with boolean logic.
-- A very unusual example. People apply such queries on the same table only in combination with very complex WHERE conditions.-- This example would normally be expressed with a phrasing similar to: WHERE lastname IN ('de Winter', 'Goldstein');SELECT*FROMpersonWHERElastname='de Winter'UNIONALLSELECT*FROMpersonWHERElastname='Goldstein';

The INTERSECT operation evaluates to those values, which are in both intermediate results, in the first as well as in the second.
-- As in our example database, there is no example for the INTERSECT we insert a new person.-- This person has the same last name 'Victor' as the first name of another person.INSERTINTOpersonVALUES(21,'Paul','Victor',DATE'1966-04-02','Washington','078-05-1121',66);COMMIT;-- All firstnames which are used as lastname.SELECTfirstname-- first SELECT commandFROMpersonINTERSECT-- looking for common valuesSELECTlastname-- second SELECT commandFROMperson;
A hint to MySQL users: MySQL (5.5) does not support INTERSECT operation. But as it is not an elementary operation, there are workarounds.

The EXCEPT operation evaluates to those values, which are in the first intermediate result but not in the second.
-- All firstname except for 'Victor', because there is a lastname with this value.SELECTfirstname-- first SELECT commandFROMpersonEXCEPT-- are there values in the result of the first SELECT but not of second?SELECTlastname-- second SELECT commandFROMperson;
A hint to MySQL users: MySQL (5.5) does not support the EXCEPT operation. But as it is not an elementary operation, there are workarounds.
A hint to Oracle users: Oracle uses the keyword MINUS instead of EXCEPT.
-- Clean up the example databaseDELETEFROMpersonWHEREid>10;COMMIT;
We can combine set operations with all other elements of SELECT command, in particular with ORDER BY and GROUP BY. But this may lead to some uncertainties. Therefore, we would like to explain some of the details below.
SELECTfirstname-- first SELECT commandFROMpersonUNION-- push both intermediate results together to one resultSELECTlastname-- second SELECT commandFROMpersonORDERBYfirstname;
To which part of the command belongs the ORDER BY? To the first SELECT, the second SELECT, or the result of the UNION? The SQL rules determine that set operations are evaluated before ORDER BY clauses (as always parenthesis can change the order of evaluation). Therefore the ORDER BY sorts the final result and not any of the intermediate results.
We rearrange the example in the hope that things get clear.
-- Equivalent semanticSELECT*FROM(SELECTfirstname-- first SELECT commandFROMpersonUNION-- push both intermediate (unnamed) results together to the next intermediate result 't'SELECTlastname-- second SELECT commandFROMperson)t-- 't' is the name for the intermediate result generated by UNIONORDERBYt.firstname;
First, the two SELECTS are evaluated, afterward the UNION. This intermediate result gets the name 't'. 't' is ordered.
Often one would like the rows from the first SELECT to be ordered independent from the rows of the second SELECT. We can do this by adding a virtual column to the result of each SELECT statement and using the virtual columns in the ORDER BY.
SELECT'1'asdummy,firstnameFROMpersonUNIONSELECT'2',lastnameFROMpersonORDERBYdummy,firstname;
With the GROUP BY clause, things are a little more complicated than with ORDER BY. The GROUP BY refers to the last SELECT or - to say it the other way round - to the SELECT of its direct level.
-- Will not work because the GROUP BY belongs to the second SELECT and not to the UNION!SELECTfirstnameFROMpersonUNIONSELECTlastnameFROMpersonGROUPBYfirstname;---- Works, but possibly not what you want to do.-- The alias name for the (only) column of the UNION is 'firstname'.SELECTfirstnameFROMpersonUNION-- We group over the (only) column of the second SELECT, which is 'lastname' and results in 7 valuesSELECTlastnameFROMpersonGROUPBYlastname;---- Make things clear: rearrange the query to group over the final resultSELECT*FROM(SELECTfirstname-- columnnames of the first SELECT determins the columnnames of the UNIONFROMpersonUNIONSELECTlastnameFROMperson)tGROUPBYt.firstname;-- now we can group over the complete result
Show the lowest, highest and mean weight as a) 3 values of 1 row and b) 1 value in 3 rows.
-- 1 rowSELECTmin(weight),max(weight),avg(weight)FROMperson;-- 3 rowsSELECTmin(weight)FROMpersonUNIONSELECTmax(weight)FROMpersonUNIONSELECTavg(weight)FROMperson;
Extend the previous 3-rows-solution to meet two additional criteria: a) consider only persons born in San Francisco and
b) add a virtual column to show 'Min', 'Max' and 'Avg' according to the correlating numeric values.
SELECT'Min',min(weight)FROMpersonWHEREplace_of_birth='San Francisco'UNIONSELECT'Max',max(weight)FROMpersonWHEREplace_of_birth='San Francisco'UNIONSELECT'Avg',avg(weight)FROMpersonWHEREplace_of_birth='San Francisco';
Extend the previous solution to order the result: the minimum value first, followed by the average and then the highest value.
-- 'ugly' solutionSELECT'1 Min'ASnote,min(weight)FROMpersonWHEREplace_of_birth='San Francisco'UNIONSELECT'3 Max'ASnote,max(weight)FROMpersonWHEREplace_of_birth='San Francisco'UNIONSELECT'2 Avg'ASnote,avg(weight)FROMpersonWHEREplace_of_birth='San Francisco'ORDERBYnote;-- 'clean' solutionSELECT1ASnote,'Min',min(weight)FROMpersonWHEREplace_of_birth='San Francisco'UNIONSELECT3ASnote,'Max',max(weight)FROMpersonWHEREplace_of_birth='San Francisco'UNIONSELECT2ASnote,'Avg',avg(weight)FROMpersonWHEREplace_of_birth='San Francisco'ORDERBYnote;
Create a list of lastnames for persons with a weight greater than 70 kg together with
all e-mail values (one value per row). There is no concordance between lastnames and e-mails.
(This example is not very helpfull for praxis, but instructive.)
SELECTlastnameFROMpersonWHEREweight>70UNIONSELECTcontact_valueFROMcontactWHEREcontact_type='email';
In the previous example the lastname 'de Winter' is shown only once. But there are more than one persons of the family with a weight greater than 70 kg.
Why?
Extend the previous solution to show as much resulting rows as hits to the criteria.
-- Extend 'UNION' to 'UNION ALL'. The default is 'UNION DISTINCT'SELECTlastnameFROMpersonWHEREweight>70UNIONALLSELECTcontact_valueFROMcontactWHEREcontact_type='email';
| Structured Query Language SELECT: Set Operations | SELECT: Case Expression |