Movatterモバイル変換

[0]ホーム

Jump to content

Structured Query Language/SELECT: Fundamentals

Add links

From Wikibooks, open books for an open world

<Structured Query Language

Example Database Data

Structured Query Language
SELECT: Fundamentals

SELECT: Join Operation

The SELECT command retrieves data from one or more tables or views. It generally consists of the following language elements:

SELECT<things_to_be_displayed>-- the so called 'Projection' - mostly a list of columnnamesFROM<tablename>-- table or view names and their aliasesWHERE<where_clause>-- the so called 'Restriction' or 'search condition'GROUPBY<group_by_clause>HAVING<having_clause>ORDERBY<order_by_clause>OFFSET<offset_clause>FETCH<fetch_first_or_next_clause>;

With the exception of the first two elements all others are optional. The sequence of language elements is mandatory. At certain places within the command there may start new SELECT commands - in a recursive manner.

Projection (specify resulting columns)

[edit |edit source]

In the projection part of the SELECT command, you specify a list of columns, operations working on columns, functions, fixed values, or new SELECT commands.

-- C/Java style comments are possible within SQL commandsSELECTid,/* the name of a column   */concat(firstname,lastname),/* the concat() function  */weight+5,/* the add operation      */'kg'/* a value                */FROMperson;

The DBMS will retrieve ten rows, each of which consists of four columns.

We can mix the sequence of columns in any order or retrieve them several times.

SELECTid,lastname,lastname,'weighs',weight,'kg'FROMperson;

The asterisk '*' is an abbreviation for the list of all columns.

SELECT*FROMperson;

For numeric columns, we can apply the usual numeric operators +, -, * and /. There are also many predefined functions depending on the data type: power, sqrt, modulo, string functions, date functions.

Uniqueness via keyword DISTINCT

[edit |edit source]

It is possible to compact the result in the sense of unique values by using the keyword DISTINCT. In this case, all resultingrows, which would be identical, will be compressed to onerow. In other words: duplicates are eliminated - just like in set theory.

-- retrieves ten rowsSELECTlastnameFROMperson;-- retrieves only seven rows. Duplicate values are thrown away.SELECTDISTINCTlastnameFROMperson;-- Hint:-- The keyword 'DISTINCT' refers to the entirety of the resulting rows, which you can imagine as-- the concatenation of all columns. It follows directly behind the SELECT keyword.-- The following query leads to ten rows, although three persons have the same lastname.SELECTDISTINCTlastname,firstnameFROMperson;-- again only seven rowsSELECTDISTINCTlastname,lastnameFROMperson;

Aliases for Column names

[edit |edit source]

Sometimes we want to give resulting columns more descriptive names. We can do so by choosing an alias within the projection. This alias is the new name within the result set. GUIs show the alias as the column label.

-- The keyword 'AS' is optionalSELECTlastnameASfamily_name,weightASweight_in_kgFROMperson;

Functions

[edit |edit source]

There are predefined functions for use in projections (and at some other positions). The most frequently used are:

count(<columnname>|'*'): Counts the number of resulting rows.
max(<columnname>): The highest value in <column> of the resultset. Also applicable on strings.
min(<columnname>): The lowest value in <column> of the resultset. Also applicable on strings.
sum(<columnname>): The sum of all values in a numeric column.
avg(<columnname>): The average of a numeric column.
concat(<columnname_1>, <columnname_2>): The concatenation of two columns. Alternatively the function may be expressed by the '||' operator: <columnname_1> || <columnname_2>

Standard SQL and every DBMS offers many more functions.

We must differentiate between those functions which return one value per row like concat() and those which return only one row per complete resultset like max(). The former one may be mixed in any combination with column names, as shown in the very first example of this page. With the later ones, there exists a problem: If we mix them with a regular column name, the DBMS recognises a contradiction in the query. On the one hand it should retrieve precisely one value (in one row), and on the other hand, it should retrieve a lot of values (in a lot of rows). The reaction of DBMS differs from vendor to vendor. Some throw an error message at runtime - in accordance with the SQL standard -, others deliver suspicious results.

-- works fineSELECTlastname,concat(weight,' kg')FROMperson;-- check the reaction of your DBMS. It should throw an error message.SELECTlastname,avg(weight)FROMperson;

-- a legal mixture of functions resulting in one row with 4 columnsSELECTmin(weight),max(weight),avg(weight)asaverage_1,sum(weight)/count(*)asaverage_2FROMperson;

SELECT within SELECT

[edit |edit source]

If wereally want to see the result of a result-set-oriented-function in combination with columns of more than one row, we can start a very new SELECT on a location where - in simple cases - a column name occurs. This second SELECT is an absolutely independent command. Be careful: It will be executed forevery resulting row of the first SELECT!

-- retrieves 10 rows; notice the additional parenthesis to delimit the two SELECTs from each other.SELECTlastname,(SELECTavg(weight)FROMperson)FROMperson;-- Compute the percentage of each persons weight in relation to the average weight of all personsSELECTlastname,weight,weight*100/(SELECTavg(weight)FROMperson)ASpercentage_of_averageFROMperson;

Table Names

[edit |edit source]

The Keyword FROM is used to specify the table on which the command will work. This table name can be used as an identifier. In the first simple examples prefixing column names with the table name identifier can be used but isn't required. In the later more complex command, the table name identifier is a needed feature.

SELECTperson.firstname,person.lastnameFROMperson;-- Define an alias for the table name (analog to column names). To retain overview we usually-- abbreviate tables by the first character of their name.SELECTp.firstname,p.lastnameFROMpersonASp;-- Hint: not all systems accept keyword 'AS' with table aliases. Omit it in these cases!-- The keyword 'AS' is optional again.SELECTp.firstname,p.lastnameFROMpersonp;

Restriction (specify resulting rows)

[edit |edit source]

In the WHERE clause, we specify some 'search conditions' which are among the named table(s) or view(s). The evaluation of this criteria is - mostly - one of the first things during the execution of a SELECT command. Before any row can be sorted or displayed, it must meet the conditions in the clause.

If we omit the clause, all rows of the table are retrieved. Else the number of rows will be reduced according to the specified criteria. If we specify 'weight < 70', for example, only those rows are retrieved where the weight column stores a value less than 70. The restrictions act onrows oftables by evaluatingcolumn values (sometimes they act on other things like the existence of rows, but for the moment we focus on basic principles). As a result, we can imagine that the evaluation of the 'where clause' produces a list of rows. This list of rows will be processed in further steps like sorting, grouping, or displaying certain columns (projection).

Comparisons

[edit |edit source]

We compare variables, constant values, and results of function calls with each other in the same way as we would do in different programming languages. The only difference is that we use column names instead of variables. The comparison operators must match the given data types they have to operate on. The result of the comparison is a boolean value. If it is 'true', the according row will be processed furthermore. Some examples:

'weight = 70' compares the column 'weight' with the constant value '70' whether the column is equal to the constant value.
'70 = weight': same as before.
'firstname = lastname' compares two columns - each of thesame row - for equality. Names like 'Frederic Frederic' evaluate to true.
'firstname < lastname' is a fair comparison of two columns according to the lexical order of strings.
'LENGTH(firstname) < 5' compares the result of a function call to the constant value '5'. The function LENGTH() operates on strings and returns a number.

Boolean Logic

[edit |edit source]

Often we want to specify more than a single search criterion, e.g., are there people born in San Francisco with lastname Baker? To do this, we specify every necessary comparison independent from the next one and join them together with the boolean operators AND respectively OR.

SELECT*FROMpersonWHEREplace_of_birth='San Francisco'ANDlastname='Baker';

The result of a comparison is a boolean. It may be toggled between 'true' and 'false' by the unary operator NOT.

SELECT*FROMpersonWHEREplace_of_birth='San Francisco'ANDNOTlastname='Baker';-- all except 'Baker'-- for clarification: The NOT in the previous example is a 'unary operation' on the result of the--                    comparison. It's not an addition to the AND.SELECT*FROMpersonWHEREplace_of_birth='San Francisco'AND(NOT(lastname='Baker'));--  same as before, but explicit notated with parenthesis

Theprecedence of comparisons and boolean logic is as follows:

all comparisons
NOT operator
AND operator
OR operator

-- AND (born in SF and lastname Baker; 1 hit as an intermediate result) will be processed before-- OR  (person Yorgos; 1 hit)-- 1 + 1 ==> 2 rowsSELECT*FROMpersonWHEREplace_of_birth='San Francisco'-- 4 hits SFANDlastname='Baker'-- 1 hit BakerORfirstname='Yorgos'-- 1 hit Yorgos;-- Same example with parentheses added to make the precedence explicit.-- AND gets processed before OR.-- results ==> same 2 rows as aboveSELECT*FROMpersonWHERE(place_of_birth='San Francisco'-- 4 hits SFANDlastname='Baker')-- 1 hit BakerORfirstname='Yorgos'-- 1 hit Yorgos;-- AND (person Yorgos Baker; no hit as an intermediate result) will be processed before-- OR  (born in SF; 4 hits)-- 0 + 4 ==> 4 rowsSELECT*FROMpersonWHEREplace_of_birth='San Francisco'-- 4 hits SFORfirstname='Yorgos'-- 1 hit YorgosANDlastname='Baker'-- 1 hit Baker;-- Same example with parentheses added to make the precedence explicit.-- AND gets processed before OR.-- results ==> same 4 rows as aboveSELECT*FROMpersonWHEREplace_of_birth='San Francisco'-- 4 hits SFOR(firstname='Yorgos'-- 1 hit YorgosANDlastname='Baker')-- 1 hit Baker;-- We can modify the sequence of evaluations by specifying parentheses.-- Same as the first example, adding parentheses, one row.SELECT*FROMpersonWHEREplace_of_birth='San Francisco'-- 4 hits SFAND(lastname='Baker'-- 1 hit BakerORfirstname='Yorgos')-- 1 hit Yorgos;

Two abbreviations

Sometimes we shorten the syntax by using the BETWEEN keyword. It defines a lower and upper limit and is primarily used for numeric and date values, but also applicable to strings.

SELECT*FROMpersonWHEREweight>=70ANDweight<=90;-- An equivalent shorter and more expressive wordingSELECT*FROMpersonWHEREweightBETWEEN70AND90;-- BETWEEN includes the two cutting edges

For the comparison of a column or function with several values, we can use the short IN expression.

SELECT*FROMpersonWHERElastname='de Winter'ORlastname='Baker';-- An equivalent shorter and more expressive wordingSELECT*FROMpersonWHERElastnameIN('de Winter','Baker');

FETCH: Pick and Choose certain Rows

[edit |edit source]

Sometimes we are not interested in all resulting rows, e.g.: we may want to see only the first 3 or 10 rows. This can be achieved with the OFFSET and FETCH clauses. OFFSET specifies the number of rows to be skipped (counting from the beginning of the result set), and FETCH specifies the number of rows, after which the delivery of rows shall stop.

SELECT*FROMpersonWHEREplace_of_birth='San Francisco'ORDERBYfirstnameFETCHFIRST2ROWSONLY-- only the first 2 rows;SELECT*FROMpersonORDERBYid-- the WHERE clause (and the ORDER BY clause) are optionalOFFSET5ROWSFETCHFIRST2ROWSONLY-- only the 6th and 7th row (according to the ORDER BY);

Please notice that the OFFSET and FETCH clauses are separate parts of the SELECT command. Some implementations handle this functionality as part of the WHERE clause or with different keywords (ROWNUM, START, SKIP, LIMIT).

The functionality of OFFSET and FETCH can be achieved likewise bywindow functions with their more general syntax.

Grouping

[edit |edit source]

We will offer the GROUP BY clause in combination with the HAVING clause in alater chapter.

Sorting

[edit |edit source]

The DBMS is free to deliver the resulting rows in an arbitrary order. Rows may be returned in the order of the Primary Key, in the chronological order they are stored into the database, in the order of a B-tree organized internal key, or even in random order. Concerning the sequence of delivered rows, the DBMS may do what it wants to do. Don't expect anything.

If we expect a particular order of rows, we must express our wishes explicitly. We can do this in the ORDER BY clause. There we specify a list of column names in combination with an option for ascending or descending sorting.

-- all persons in ascending (which is the default) order of their weightSELECT*FROMpersonORDERBYweight;-- all persons in descending order of their weightSELECT*FROMpersonORDERBYweightdesc;

In the above result, there are two rows with identical values in the columnweight. As this situation leads to random results, we have the possibility to specify more columns. These following columns are processed only for those rows with identical values in all previous columns.

-- All persons in descending order of their weight. In ambiguous cases order the-- additional column place_of_birth ascending: Birmingham before San Francisco.SELECT*FROMpersonORDERBYweightdesc,place_of_birth;

In the ORDER BY clause, we can specify any column of the processed table. We are not limited to the ones which are returned by the projection.

-- same ordering as aboveSELECTfirstname,lastnameFROMpersonORDERBYweightdesc,place_of_birth;

Combine the Language Elements

[edit |edit source]

Only the first two elements of the SELECT command are mandatory: the part up to the first table (or view) name. All others are optional. If we also specify the optional ones, their predetermined sequence must be kept in mind. But they are combinable according to our needs.

-- We have seen on this page: SELECT / FROM / WHERE / ORDER BYSELECTp.lastname,p.weight,p.weight*100/(SELECTavg(p2.weight)FROMpersonp2)ASpercentage_of_averageFROMpersonpWHEREp.weightBETWEEN70AND90ORDERBYp.weightdesc,p.place_of_birth;

Further Information

[edit |edit source]

There is more information about the additional options for the SELECT command.

Exercises

[edit |edit source]

Show hobbyname and remark from the hobby table.

Click to see solution

SELECThobbyname,remarkFROMhobby;

Show hobbyname and remark from the hobby table. Order the result by hobbyname.

Click to see solution

SELECThobbyname,remarkFROMhobbyORDERBYhobbyname;

Show hobbyname and remark from the hobby table. Choose 'Hobby' as first columnname and 'Short_Description_of_Hobby' as second columnname.

Click to see solution

SELECThobbynameasHobby,remarkasShort_Description_of_HobbyFROMhobby;-- columnname without underscore: Use quotesSELECThobbynameas"Hobby",remarkas"Short Description of Hobby"FROMhobby;

Show firstname and lastname of persons born in San Francisco.

Click to see solution

SELECTfirstname,lastnameFROMpersonWHEREplace_of_birth='San Francisco';

Show all information items of persons with lastname 'de Winter'.

Click to see solution

SELECT*FROMpersonWHERElastname='de Winter';

How many rows are stored in the contact table?

Click to see solution

SELECTcount(*)FROMcontact;9

How many E-Mails are stored in the contact table?

Click to see solution

SELECTcount(*)FROMcontactWHEREcontact_type='email';3

What is the mean weight of persons born in San Francisco?

Click to see solution

SELECTavg(weight)FROMpersonWHEREplace_of_birth='San Francisco';71.25

Find persons born after 1979-12-31, which weigh more than / less than 50 kg.

Click to see solution

SELECT*FROMpersonWHEREdate_of_birth>DATE'1979-12-31'ANDweight>50;--SELECT*FROMpersonWHEREdate_of_birth>DATE'1979-12-31'ANDweight<50;

Find persons born in Birmingham, Mumbai, Shanghai or Athens in the order of their firstname.

Click to see solution

SELECT*FROMpersonWHEREplace_of_birth='Birmingham'ORplace_of_birth='Mumbai'ORplace_of_birth='Shanghai'ORplace_of_birth='Athens'ORDERBYfirstname;-- equivalent:SELECT*FROMpersonWHEREplace_of_birthIN('Birmingham','Mumbai','Shanghai','Athens')ORDERBYfirstname;

Find persons born in Birmingham, Mumbai, Shanghai or Athens within the 21. century.

Click to see solution

SELECT*FROMpersonWHERE(place_of_birth='Birmingham'ORplace_of_birth='Mumbai'ORplace_of_birth='Shanghai'ORplace_of_birth='Athens')ANDdate_of_birth>=DATE'2000-01-01';-- equivalent:SELECT*FROMpersonWHEREplace_of_birthIN('Birmingham','Mumbai','Shanghai','Athens')ANDdate_of_birth>=DATE'2000-01-01';

Find persons born between Dallas and Richland ('between' not in the sense of a geographic area but of the lexical order of citynames)

Click to see solution

-- strings have a lexical order. So we can use some operators known-- from numeric data types.SELECT*FROMpersonWHEREplace_of_birth>='Dallas'ANDplace_of_birth<='Richland'ORDERBYplace_of_birth;-- equivalent:SELECT*FROMpersonWHEREplace_of_birthBETWEEN'Dallas'AND'Richland'ORDERBYplace_of_birth;

Which kind of contacts are stored in the contact table? (Only one row per value.)

Click to see solution

SELECTDISTINCTcontact_typeFROMcontact;fixedlineemailicqmobile

How many different kind of contacts are stored in the contact table? (Hint: Count the rows of above query.)

Click to see solution

SELECTcount(DISTINCTcontact_type)FROMcontact;4

Show contact_type, contact_value and a string of the form 'total number of contacts: <x>', where <x> is the quantity of all existing contacts.

Click to see solution

SELECTcontact_type,contact_value,(SELECTconcat('total number of contacts: ',count(*))FROMcontact)FROMcontact;-- Some systems need explicit type casting from numeric to stringSELECTcontact_type,contact_value,(SELECTconcat('total number of contacts: ',cast(count(*)aschar))FROMcontact)FROMcontact;-- The '||' operator is some kind of 'syntactical sugar'. It's an abbreviation for the concat() function.-- The operator is part of the SQL standard, but not implemented by all vendors.SELECTcontact_type,contact_value,(SELECT'total number of contacts: '||count(*)FROMcontact)FROMcontact;

Example Database Data

Structured Query Language
SELECT: Fundamentals

SELECT: Join Operation

Retrieved from "https://en.wikibooks.org/w/index.php?title=Structured_Query_Language/SELECT:_Fundamentals&oldid=4355264"

Category:

Book:Structured Query Language

[8]ページ先頭