SQL: SELECT


Note: All the examples in this article use the Classic Models sample database.

SELECT, retrieves data from a database.

Probably, the simplest example is:

This statement retrieves the Customer’s name of all distributors in Customer Table.

Syntax:

 

Each select_expr indicates a column that you want to retrieve. For instance:

Table_references indicates the table or tables from which to retrieve rows. For example:

The WHERE clause indicates the condition or conditions that rows must satisfy to be selected. where_condition is an expression that evaluates to true for each row to be selected.

In the WHERE clause, you can use any of the functions and operators that SQL supports, except for aggregate (summary) functions.

Calculated Columns

In addition to columns, select_expr can include calculated columns.

To request a calculated column, we specify a SQL expression in the select list.

SQL expressions

The ANSI/ISO SQL standard specifies four arithmetic operations that can be used in expressions:

addition (X + Y) subtraction (X – Y)
multiplication (X * Y) division (X / Y)

 

Parentheses can also be used to form more complicated expressions, or to change the precedence order. For instance:

(SALES * 1.05) – (TARGET * .95)

 

select_expr also could be a built-in function:

SQL Functions

 

SQL has many built-in functions for performing calculations on data. The most important are Aggregate Functions and Scalar Functions.

 

SQL Aggregate Functions

SQL aggregate functions return a single value, calculated from values in a column. For instance:

AVG() – Returns the average value

COUNT() – Returns the number of rows

FIRST() – Returns the first value

LAST() – Returns the last value

MAX() – Returns the largest value

MIN() – Returns the smallest value

SUM() – Returns the sum

How many orders does the OrderDetails table have?

If you want to go further in built-in functions, please visit: Summary queries

Aliases

SQL aliases are used to temporarily rename a table (after the FROM clause) or a column heading.

Table aliases are required in queries involving self-joins (we will see it later). However, you can use an alias in any query. For example, if the name of a table is very long, the table name can become tedious to type as a column qualifier.

The syntax is:

The keyword AS is optional. It means

TIP for COLUMN NAME: It requires double quotation marks or square brackets if the column name contains spaces or is a reserved keyword.

For instance, the following statement is wrong:

We can observe two problems: on is a reserved keyword, and Order Date contains white spaces. A corrected statement could be:

or

 

Selecting All Columns (SELECT *)

Sometimes it’s convenient to display the contents of all the columns of a table, especially when we are exploring a database for first time.

Duplicate Rows (DISTINCT)

Scenario A

If a query includes the primary key of a table in its select list, then every row of query results will be unique (because the primary key has a different value in each row).

Scenario B

If the primary key is not included in the query results, duplicate rows can occur. For example, suppose you made this request:

List the customer numbers of all distributors that have orders.

 

The query results have 326 rows (one for each order).

As we can guess, a customer could have placed many orders. Therefore, our list will contain duplicated customer numbers. These query results are probably not exactly what we had in mind.

If we want to list the customer numbers (that have orders) only once, we can use the DISTINCT keyword.

The query results have 98 rows.

 

The FROM Clause

The keyword FROM is followed by a list of table specifications separated by commas.

These tables are called the source tables of the query.

We have already used the FROM clause in its simplest form. But a table reference is also known as a JOIN expression.

To go further with Join expressions, please visit the article: JOINS.

Row Selection (WHERE Clause)

Usually we want to select only some of the rows in a table. The WHERE clause allows us to specify the rows you want to retrieve. For instance:

List the customer numbers that placed an order after “2005-01-01”.

The WHERE clause consists of the keyword WHERE followed by a search condition that specifies the rows to be retrieved.

Search Conditions

SQL offers a rich set of search conditions that allow us to specify many different kinds of queries efficiently and naturally.

The five basic search conditions (a.k.a. predicates) are:

  1. Comparison test

Compares the value of one expression to the value of another expression:

The example List the customer numbers that placed an order after “2005-01-01” was a search condition of Comparison test type.

  1. Range test (BETWEEN)

Tests whether the value of an expression falls within a specified range of values:

For example:

TIP:

is completely equivalent to:

 

  1. Set membership test (IN)

Checks whether the value of an expression matches one of a set of values.

For instance:

List the distributors that are based in Las Vegas and San Francisco.

TIP:

is completely equivalent to:

 

  1. Pattern matching test (LIKE)

Checks whether the value of a string column matches a specified pattern.

For instance:

In this example, LIKE has the same effect as = (equal symbol). But it becomes powerful when we need to use wildcard characters.

Wildcard Characters

The percent sign (%)

This wildcard character matches any sequence of zero or more characters. For instance:

List the distributors that are based in cities that start with “N”.

The LIKE keyword tells SQL to compare the CITY column to the pattern “N%” Any of the following names would match the pattern:

NYC, Nantes, New Haven…

 

The underscore (_)

This wildcard character matches any single character. It means, this character will be replaced for only one character (not many).

For instance, the following statement will list all costumers whose state starts with N and has two characters as maximum.

In this case, the following state will not match the pattern:

NSW

 

Escape Characters

Here is THE question: how to match the wildcard characters themselves as literal characters?

Let’s assume we have a product with a special feature called: Special_ID. Let’s assume also that this singular feature has the format: X%XXX. For instance: 1%3A3, 1%4A3, A%BB, 9%C7…

If we run the following statement:

SQL will show us ALL of the products whose Special_ID starts with 1 and ends with 3A3, no matter the length. And, of course, our product will remain out of the list.

To solve this problem, we need to use an escape character.

When the escape character appears in the pattern, the character immediately following it is treated as a literal character rather than as a wildcard character.

 

The first percent sign in the pattern, which follows an escape character, is treated as a literal percent sign.

 

  1. Null value test (NULL)

Checks whether a column has a NULL (unknown) value.

Find the customers who don’t have an addressLine2 assigned.

 

Compound Search Conditions (AND, OR, and NOT)

Finally, we can use compound search conditions to build more complex queries.

The AND & OR operators are used to filter rows based on more than one condition.

These operations are logical operations; therefore we need to be careful with precedence order.

Sorting Query Results (ORDER BY Clause)

Like the rows of a table, the rows of query results are not arranged in any particular order.

SQL provides a way to sort the results of a query: by including the ORDER BY clause.

The ORDER BY clause must be followed by a list of sort specifications separated by commas. For instance:

List the customer names, state and city, sorted in alphabetical order by state, and within each state by city.

If we don’t specify the keyword ASC or DESC, ASC will be taken as default.

The first sort specification (STATE) is the major sort key; those that follow (CITY, in this case) are progressively more minor sort keys, used as “tie breakers” when two rows of query results have the same values for the more major keys.

If the column of query results to be used for sorting is a calculated column, it has no column name. In this case, we must specify a column number instead of a column name, as in this example:

The calculated column “Total price” (2) is the sort key.

Grouped Queries (GROUP BY Clause)

The GROUP BY statement is used in conjunction with the aggregate functions to group the result-set by one or more columns.

Just as subtotals are useful in printed reports, it’s often convenient to summarize query results at a “subtotal” level.

Syntax:

The function of the GROUP BY clause is most easily understood by example.

What is the total credit limit for all customers in each country?

 

When The GROUP BY clause tells SQL to divide the detailed query results into groups (e.g. by country) and to apply the aggregate_function separately to each group, producing a single result for each group.

We can use several grouping columns. If you are interested, go to: Multiple grouping columns.

NULL Values in Grouping Columns

If a value of the (group by) column is unknown, into which group should the row be placed?

This topic is described in the advanced article: Multiple grouping columns.

Group Search Conditions (HAVING Clause)

The HAVING clause can be used to select and reject row groups.

The HAVING clause is the WHERE clause for grouping. Consequently, it must be followed by a search condition.

For example:

List the total credit limit that doesn´t exceed 10.000, for all customers in each country.

The GROUPBY clause first arranges the Customers into groups by Country.

The HAVING clause then eliminates any group where the total of the credit limit in the group does not exceed $10.000.

Finally, the SELECT clause calculates (again) the total credit limit for each of the remaining groups and generates the query results.

HAVING Without GROUP BY

Generally, the HAVING clause is used in conjunction with the GROUP BY clause, but the syntax of the SELECT statement does not require it.

If a HAVING clause appears without a GROUP BY clause, SQL considers the entire set of detailed query results to be a single group.

Summary

According to “SQL: The Complete Reference“, the process to generate the query results for a SELECT statement is:

  1. If the statement is a UNION of SELECT statements, apply Steps 2 through 7 to 
each of the statements to generate their individual query results.
  2. Form the product of the tables named in the FROM If the FROM clause names a single table, the product is that table.
  3. If there is a WHERE clause, apply its search condition to each row of the product table, retaining those rows for which the search condition is TRUE (and discarding those for which it is FALSE or NULL).
  4. If there is a GROUP BY clause, arrange the remaining rows of the product table into row groups, so that the rows in each group have identical values in all of the grouping columns.
  5. If there is a HAVING clause, apply its search condition to each row group, retaining those groups for which the search condition is TRUE (and discarding those for which it is FALSE or NULL).
  6. For each remaining row (or row group), calculate the value of each item in the select list to produce a single row of query results. For a simple column reference, use the value of the column in the current row (or row group). For a column function, use the current row group as its argument if GROUP BY is specified; otherwise, use the entire set of rows.
  7. If SELECT DISTINCT is specified, eliminate any duplicate rows of query results that were produced.
  8. If the statement is a UNION of SELECT statements, merge the query results for the individual statements into a single table of query results. Eliminate duplicate rows unless UNION ALL is specified.
  9. If there is an ORDER BY clause, sort the query results as specified.

 

 

Share and Enjoy

  • Facebook
  • Twitter
  • Delicious
  • LinkedIn
  • StumbleUpon
  • Add to favorites
  • Email
  • RSS