by Anthony Molinaro
---Preface---
平台与版本
- DB2 v.8
- Oracle Database 10g (with the exception of a handful of recipes, the solutions will work for Oracle8i Database and Oracle9i Database as well)
- PostgreSQL 8
- SQL Server 2005
- MySQL 5
Supplemental files and examples for this book can be found at
http://examples.oreilly.com/9780596009762/
Tables Used in This Book
select * from emp;
select * from dept;
---Chapter 1 Retrieving Records---
Retrieving Records, introduces very simple queries. Examples include how to use a WHERE clause to restrict rows from your result set, providing aliases for columns in your result set, using an inline view to reference aliased columns, using simple conditional logic, limiting the number of rows returned by a query, returning random records, and finding NULL values. Most of the examples are very simple, but some of them appear in more complex recipes, so it’s a good idea to read this chapter if you’re relatively new to SQL or aren’t familiar with any of the examples listed for this chapter.
1.6. Referencing an Aliased Column in the WHERE Clause
Problem
You have used aliases to provide more meaningful column names for your result set and would like to exclude some of the rows using the WHERE clause. However, your attempt to reference alias names in the WHERE clause fails:
1 select sal as salary, comm as commission2 from emp3 where salary < 5000
Solution
By wrapping your query as an inline view you can reference the aliased columns:
1 select * 2 from (3 select sal as salary, comm as commission4 from emp) x5 where salary < 5000
Why do you need to do this? The WHERE clause is evaluated before the SELECT, thus, SALARY and COMMISSION do not yet exist when the “Problem” query’s WHERE clause is evaluated. Those aliases are not applied until after the WHERE clause processing is complete. However, the FROM clause is evaluated before the WHERE. By placing the original query in a FROM clause, the results from that query are generated before the outermost WHERE clause, and your outermost WHERE clause "sees" the alias names. This technique is particularly useful when the columns in a table are not named particularly well.
1.7. Concatenating Column Values
Problem
You want to return values in multiple columns as one column. For example, you would like to produce this result set from a query against the EMP table:
CLARK WORKS AS A MANAGERKING WORKS AS A PRESIDENTMILLER WORKS AS A CLERK
DB2, Oracle, PostgreSQL
These databases use the double vertical bar as the concatenation operator:
1 select ename ||' WORKS AS A '||job as msg2 from emp3 where deptno=10
MySQL
This database supports a function called CONCAT:
1 select concat(ename, ' WORKS AS A ',job) as msg2 from emp3 where deptno=10
SQL Server
Use the "+” operator for concatenation:
1 select ename + ' WORKS AS A ' + job as msg2 from emp3 where deptno=10
1.8. Using Conditional Logic in a SELECT Statement
Problem
You want to perform IF-ELSE operations on values in your SELECT statement. For example, you would like to produce a result set such that, if an employee is paid \$2000 or less, a message of “UNDERPAID” is returned, if an employee is paid \$4000 or more, a message of “OVERPAID” is returned, if they make somewhere in between, then “OK” is returned. The result set should look like this
Solution
Use the CASE expression to perform conditional logic directly in your SELECT statement:
1 select ename, sal,2 case when sal <= 2000 then 'UNDERPAID'3 when sal >= 4000 then 'OVERPAID'4 else 'OK'5 end as status6 from emp
1.12. Transforming Nulls into Real Values
Problem
You have rows that contain nulls and would like to return non-null values in place of those nulls.
Solution
Use the function COALESCE to substitute real values for nulls:
1 select coalesce(comm,0)2 from emp
1.13. Searching for Patterns
Problem
Of the employees in departments 10 and 20, you want to return only those that have either an “I” somewhere in their name or a job title ending with “ER”:
Solution
Use the LIKE operator in conjunction with the SQL wildcard operator (”%”):
select ename, jobfrom empwhere deptno in (10,20)and (ename like '%I%' or job like '%ER')
---Chapter 2 Sorting Query Results---
Sorting Query Results, introduces recipes for sorting query results. The ORDER BY clause is introduced and is used to sort query results. Examples increase in complexity ranging from simple, single-column ordering, to ordering by substrings, to ordering based on conditional expressions.
2.3. Sorting by Substrings
Problem
You want to sort the results of a query by specific parts of a string. For example, you want to return employee names and jobs from table EMP and sort by the last two characters in the job field.
Solution
DB2, MySQL, Oracle, and PostgreSQL
Use the SUBSTR function in the ORDER BY clause:
1 select ename,job2 from emp3 order by substr(job,length(job)-2)
SQL Server
Use the SUBSTRING function in the ORDER BY clause:
1 select ename,job2 from emp3 order by substring(job,len(job)-2, 2)
2.4. Sorting Mixed Alphanumeric Data
Problem
You have mixed alphanumeric data and want to sort by either the numeric or character portion of the data. Consider this view:
Solution
Oracle and PostgreSQL
Use the functions REPLACE and TRANSLATE to modify the string for sorting:
1 /* ORDER BY DEPTNO */2 select data3 from V4 order by replace(data, replace(translate(data,'0123456789','##########'),'#',''),'')
---Chapter 3 Working with Multiple Tables---
Working with Multiple Tables. Examples in this chapter include performing both inner and outer joins, identifying Cartesian productions, basic set operations (set difference, union, intersection), and the effects of joins on aggregate functions.
3.1. Stacking One Rowset atop Another
Problem
You want to return data stored in more than one table, conceptually stacking one result set atop the other. The tables do not necessarily have a common key, but their columns do have the same data types. For example, you want the result set to look like the following:
Solution
Use the set operation UNION ALL to combine rows from multiple tables:
1 select ename as ename_and_dname, deptno2 from emp3 where deptno = 104 union all5 select '----------', null6 from t17 union all8 select dname, deptno9 from dept
3.4. Retrieving Values from One Table That Do Not Exist in Another
Problem
You wish to find those values in one table, call it the source table, that do not also exist in some target table. For example, you want to find which departments (if any) in table DEPT do not exist in table EMP. In the example data, DEPTNO 40 from table DEPT does not exist in table EMP, so the result set should be the following:
DEPTNO
----------
40
Solution
Having functions that perform set difference is particularly useful for this problem. DB2, PostgreSQL, and Oracle support set difference operations. If your DBMS does not support a set difference function, use a subquery as shown for MySQL and SQL Server.
DB2 and PostgreSQL
Use the set operation EXCEPT:
select deptno from deptexceptselect deptno from emp
Oracle
Use the set operation MINUS:
1 Use the set operation MINUS:2 select deptno from dept3 minus4 select deptno from emp
MySQL and SQL Server
Use a subquery to return all DEPTNOs from table EMP into an outer query that searches table DEPT for rows that are not amongst the rows returned from the subquery:
1 select deptno2 from dept3 where deptno not in (select deptno from emp)
3.6. Adding Joins to a Query Without Interfering with Other Joins
Problem
You have a query that returns the results you want. You need additional information, but when trying to get it, you lose data from the original result set. For example, you want to return all employees, the location of the department in which they work, and the date they received a bonus. For this problem, the EMP_BONUS table contains the following data:
Your desired result set is the following:
Solution
You can use an outer join to obtain the additional information without losing the data from the original query. First join table EMP to table DEPT to get all employees and the location of the department they work, then outer join to table EMP_ BONUS to return the date of the bonus if there is one. Following is the DB2, MySQL, PostgreSQL, and SQL Server syntax:
1 select e.ename, d.loc, eb.received2 from emp e join dept d3 on (e.deptno=d.deptno)4 left join emp_bonus eb5 on (e.empno=eb.empno)6 order by 2
You can also use a scalar subquery (a subquery placed in the SELECT list) to mimic an outer join:
select e.ename, d.loc,(select eb.received from emp_bonus ebwhere eb.empno=e.empno) as receivedfrom emp e, dept dwhere e.deptno=d.deptno6 order by 2
---Chapter 4 Inserting, Updating, Deleting---
Inserting, Updating, Deleting, introduces recipes for inserting, updating, and deleting data, respectively. Most of the examples are very straightforward (perhaps even pedestrian). Nevertheless, operations such as inserting rows into one table from another table, the use of correlated subqueries in updates, an understanding of the effects of NULLs, and knowledge of new features such as multi-table inserts and the MERGE command are extremely useful for your toolbox.
4.1. Inserting a New Record
Problem
You want to insert a new record into a table. For example, you want to insert a new record into the DEPT table. The value for DEPTNO should be 50, DNAME should be “PROGRAMMING”, and LOC should be “BALTIMORE”.
Solution
Use the INSERT statement with the VALUES clause to insert one row at a time:
1 insert into dept (deptno,dname,loc)2 values (50,'PROGRAMMING','BALTIMORE')
For DB2 and MySQL you have the option of inserting one row at a time or multiple rows at a time by including multiple VALUES lists:
1 /* multi row insert */2 insert into dept (deptno,dname,loc)3 values (1,'A','B'), (2,'B','C')
As a shortcut, you can omit the column list in an INSERT statement:
1 insert into dept2 values (50,'PROGRAMMING','BALTIMORE')
4.7. Blocking Inserts to Certain Columns
Problem
You wish to prevent users, or an errant software application, from inserting values into certain table columns. For example, you wish to allow a program to insert into EMP, but only into the EMPNO, ENAME, and JOB columns.
Solution
Create a view on the table exposing only those columns you wish to expose. Then force all inserts to go through that view. For example, to create a view exposing the three columns in EMP:
1 create view new_emps as2 select empno, ename, job3 from emp
Grant access to this view to those users and programs allowed to populate only the three fields in the view. Do not grant those users insert access to the EMP table. Users may then create new EMP records by inserting into the NEW_EMPS view, but they will not be able to provide values for columns other than the three that are specified in the view definition.
4.8. Modifying Records in a Table
Problem
You want to modify values for some or all rows in a table. For example, you might want to increase the salaries of everyone in department 20 by 10%.
Solution
Use the UPDATE statement to modify existing rows in a database table. For example:
1 update emp2 set sal = sal*1.103 where deptno = 20
4.16. Deleting Duplicate Records
Problem
You want to delete duplicate records from a table. Consider the following table dupes:
Solution
Use a subquery with an aggregate function such as MIN to arbitrarily choose the ID to retain (in this case only the NAME with the smallest value for ID is not deleted):
delete from dupeswhere id not in ( select min(id)from dupesgroup by name )
---Chapter 5 Metadata Queries---
Metadata Queries, introduces recipes for getting at your database metadata. It’s often very useful to find the indexes, constraints, and tables in your schema. The simple recipes here allow you to gain information about your schema. Additionally, “dynamic” SQL examples are shown here as well, i.e., SQL generated by SQL.
5.1. Listing Tables in a Schema
Problem
You want to see a list all the tables you’ve created in a given schema.
Solution
The solutions that follow all assume you are working with the SMEAGOL schema. The basic approach to a solution is the same for all RDBMSs: you query a system table (or view) containing a row for each table in the database.
DB2
Query SYSCAT.TABLES:
1 select tabname2 from syscat.tables3 where tabschema = 'SMEAGOL'
Oracle
Query SYS.ALL_TABLES:
select table_namefrom all_tableswhere owner = 'SMEAGOL'
PostgreSQL, MySQL, and SQL Server
Query INFORMATION_SCHEMA.TABLES:
1 select table_name2 from information_schema.tables3 where table_schema = 'SMEAGOL'
5.6. Using SQL to Generate SQL
Problem
You want to create dynamic SQL statements, perhaps to automate maintenance tasks. You want to accomplish three tasks in particular: count the number of rows in your tables, disable foreign key constraints defined on your tables, and generate insert scripts from the data in your tables.
Solution
The concept is to use strings to build SQL statements, and the values that need to be filled in (such as the object name the command acts upon) will be supplied by data from the tables you are selecting from. Keep in mind, the queries only generate the statements; you must then run these statements via script, manually, or however you execute your SQL statements. The examples below are queries that would work on an Oracle system. For other RDBMSs the technique is exactly the same, the only difference being things like the names of the data dictionary tables and date formatting.
1 /* generate SQL to count all the rows in all your tables */2 select 'select count(*) from '||table_name||';' cnts3 from user_tables;
1 /* disable foreign keys from all tables */2 select 'alter table '||table_name|| ' disable constraint '||constraint_name||';' cons3 from user_constraints4 where constraint_type = 'R';
1 /* generate an insert script from some columns in table EMP */ 2 select 'insert into emp(empno,ename,hiredate) '||chr(10)|| 'values( '||empno||','||''''||ename ||''',to_date('||''''||hiredate||''') );' inserts3 from emp4 where deptno = 10;
---Chapter 6 Working with Strings---
Working with Strings, introduces recipes for manipulating strings. SQL is not known for its string parsing capabilities, but with a little creativity (usually involving Cartesian products) along with the vast array of vendorspecific functions, you can accomplish quite a bit. This chapter is where the book begins to get interesting. Some of the more interesting examples include counting the occurrences of a character in a string, creating delimited lists from table rows, converting delimited lists and strings into rows, and separating numeric and character data from a string of alphanumeric characters.
6.1. Walking a String
Problem
You want to traverse a string to return each character as a row, but SQL lacks a loop operation. For example, you want to display the ENAME “KING” from table EMP as four rows, where each row contains just characters from “KING”.
Solution
Use a Cartesian product to generate the number of rows needed to return each character of a string on its own line. Then use your DBMS’s built-in string parsing function to extract the characters you are interested in (SQL Server users will use SUBSTRING instead of SUBSTR and DATALENGTH instead of LENGTH):
1 select substr(e.ename,iter.pos,1) as C2 from (select ename from emp where ename = 'KING') e,3 (select id as pos from t10) iter4 where iter.pos <= length(e.ename)
Discussion
The key to iterating through a string’s characters is to join against a table that has enough rows to produce the required number of iterations. This example uses table T10, which contains 10 rows (it has one column, ID, holding the values 1 through 10). The maximum number of rows that can be returned from this query is 10.
The following example shows the Cartesian product between E and ITER (i.e., between the specific name and the 10 rows from T10) without parsing ENAME:
1 select ename, iter.pos2 from (select ename from emp where ename = 'KING') e,3 (select id as pos from t10) iter
6.3. Counting the Occurrences of a Character in a String
Problem
You want to count the number of times a character or substring occurs within a given string. Consider the following string:
10,CLARK,MANAGER
You want to determine how many commas are in the string.
Solution
Subtract the length of the string without the commas from the original length of the string to determine the number of commas in the string. Each DBMS provides functions for obtaining the length of a string and removing characters from a string. In most cases, these functions are LENGTH and REPLACE, respectively (SQL Server users will use the built-in function LEN rather than LENGTH):
1 select (length('10,CLARK,MANAGER')-length(replace('10,CLARK,MANAGER',',','')))/length(',')2 as cnt3 from t1
6.10. Creating a Delimited List from Table Rows
Problem
You want to return table rows as values in a delimited list, perhaps delimited by commas, rather than in vertical columns as they normally appear. You want to convert a result set from this:
Solution
Each DBMS requires a different approach to this problem. The key is to take advantage of the built-in functions provided by your DBMS. Understanding what is available to you will allow you to exploit your DBMS’s functionality and come up with creative solutions for a problem that is typically not solved in SQL.
MySQL
Use the built-in function GROUP_CONCAT to build the delimited list:
1 select deptno,2 group_concat(ename order by empno separator, ',') as emps3 from emp4 group by deptno
SQL Server
Use recursive WITH to build the delimited list:
1 with x (deptno, cnt, list, empno, len) 2 as ( 3 select deptno, count(*) over (partition by deptno), cast(ename as varchar(100)),empno,1 4 from emp 5 union all 6 select x.deptno, x.cnt, cast(x.list + ',' + e.ename as varchar(100)), e.empno, x.len+1 7 from emp e, x 8 where e.deptno = x.deptno 9 and e.empno > x. empno10 )11 select deptno,list12 from x13 where len = cnt14 order by 1
---Chapter 7 Working with Numbers---
Working with Numbers, introduces recipes for common number crunching. The recipes found here are extremely common and you’ll learn how easily window functions solve problems involving moving calculations and aggregations. Examples include creating running totals; finding mean, median, and mode; calculating percentiles; and accounting for NULL while performing aggregations.
7.9. Calculating a Mode
Problem
You want to find the mode (for those of you who don’t recall, the mode in mathematics is the element that appears most frequently for a given set of data) of the values in a column. For example, you wish to find mode of the salaries in DEPTNO 20. Based on the following salaries:
SAL
----------
800
1100
2975
3000
3000
the mode is 3000.
Solution
DB2 and SQL Server
Use the window function DENSE_RANK to rank the counts of the salaries to facilitate extracting the mode:
1 select sal 2 from ( 3 select sal, dense_rank()over( order by cnt desc) as rnk 4 from ( 5 select sal, count(*) as cnt 6 from emp 7 where deptno = 20 8 group by sal 9 ) x10 ) y11 where rnk = 1
The results are ranked based on the number of times each SAL occurs as is seen below:
MySQL and PostgreSQL
Use a subquery to find the mode:
1 select sal2 from emp3 where deptno = 204 group by sal5 having count(*) >= all ( select count(*)6 from emp7 where deptno = 208 group by sal )
7.10. Calculating a Median
Problem
SAL
----------
800
1100
2975
3000
3000
the median is 2975.
Solution
Other than the Oracle solution (which uses supplied functions to compute a median), all of the solutions are based on the method described by Rozenshtein, Abramovich, and Birger in Optimizing Transact-SQL: Advanced Programming Techniques (SQL Forum Press, 1997). The introduction of window functions allows for a more efficient solution compared to the traditional self join.
MySQL and PostgreSQL
Use a self join to find the median:
1 select avg(sal)2 from (3 select e.sal4 from emp e, emp d5 where e.deptno = d.deptno and e.deptno = 206 group by e.sal7 having sum(case when e.sal = d.sal then 1 else 0 end) >= abs(sum(sign(e.sal - d.sal)))8 )
The median is computed by first self joining table EMP, which returns a Cartesian product for all the salaries (but the GROUP BY on E.SAL will prevent duplicates from being returned). The HAVING clause uses the function SUM to count the number of times E.SAL equals D.SAL; if this count is greater than or equal to the number of times E.SAL is greater than D.SAL then that row is the median.
SQL Server
Use the window functions COUNT(*) OVER and ROW_NUMBER to find the median:
1 select avg(sal) 2 from ( 3 select sal, 4 count(*)over() total, 5 cast(count(*)over() as decimal)/2 mid, 6 ceiling(cast(count(*)over() as decimal)/2) next, 7 row_number()over( order by sal) rn 8 from emp 9 where deptno = 2010 ) x11 where ( total%2 = 0 and rn in ( mid, mid+1 ) )12 or ( total%2 = 1 and rn = next)
7.13. Computing Averages Without High and Low Values
Problem
You want to compute an average, but you wish to exclude the highest and lowest values in order to (hopefully) reduce the effect of skew. For example, you want to compute the average salary of all employees excluding the highest and lowest salaries.
Solution
MySQL and PostgreSQL
Use subqueries to exclude high and low values:
1 select avg(sal)2 from emp3 where sal not in (4 (select min(sal) from emp),5 (select max(sal) from emp)6 )
DB2, Oracle, and SQL Server
Use an inline view with the windowing functions MAX OVER and MIN OVER to generate a result set from which you can easily eliminate the high and low values:
1 select avg(sal)2 from (3 select sal, min(sal)over() min_sal, max(sal)over() max_sal4 from emp5 ) x6 where sal not in (min_sal,max_sal)
Chapter 8, Date Arithmetic, is the first of two chapters dealing with dates. Being able to perform simple date arithmetic is crucial to everyday tasks. Examples include determining the number of business days between two dates, calculating the difference between two dates in different units of time (day, month, year, etc.), and counting occurrences of days in a month.
Chapter 9, Date Manipulation, is the second of the two chapters dealing with dates. In this chapter you will find recipes for some of the most common date operations you will encounter in a typical work day. Examples include returning all days in a year, finding leap years, finding first and last days of a month, creating a calendar, and filling in missing dates for a range of dates.
Chapter 10, Working with Ranges, introduces recipes for identifying values in ranges, and for creating ranges of values. Examples include automatically generating a sequence of rows, filling in missing numeric values for a range of values, locating the beginning and end of a range of values, and locating consecutive values.
I skipped the above three chapters about date manipulation.
See the second part of the notes in <SQL COOKBOOK Ch.11-Appendix B>