Why?
How?
Conclusions
PROC SQL: Why and How
Nate Derby
Stakana Analytics
Seattle, WA
SUCCESS, 4/27/11
Nate Derby PROC SQL: Why and How 1 / 30
Why?
How?
Conclusions
Outline
1
Why?
What?
Why?
2
How?
Introduction, Examples
Working with SAS Data Sets
Other Aspects
3
Conclusions
Nate Derby PROC SQL: Why and How 2 / 30
Why?
How?
Conclusions
What?
Why?
What is SQL?
SQL = Structured Query Language
Pronounced “Sequel” or “S-Q-L”.
Intended for managing data in relational database systems.
Developed by IBM in early 1970s.
Now very robust:
Lots of functionality (not just for queries!).
Implemented in most computing packages (including SAS).
Nate Derby PROC SQL: Why and How 3 / 30
Why?
How?
Conclusions
What?
Why?
Why SQL?
Because of what we just said:
Designed for databases:
Works very well/quickly with large data sets.
Sometimes faster than SAS code.
Sometimes works when SAS code doesn’t.
Easily implements data constraints.
Easily implements views.
Easily implements indexes.
Nate Derby PROC SQL: Why and How 4 / 30
Why?
How?
Conclusions
What?
Why?
Why SQL?
Because of what we just said:
Lots of functionality:
SQL can be used to do many things.
Some things easier in SQL than in base SAS code.
Examples to follow.
Implemented in most computing packages:
Great for program portability.
Can cut and paste between SAS and another language.
Some things aren’t portable.
Nate Derby PROC SQL: Why and How 5 / 30
Why?
How?
Conclusions
Introduction, Examples
Working with SAS Data Sets
Other Aspects
Intro to SQL in SAS
Implemented in SAS via PROC SQL:
PROC SQL Setup
PROC SQL <options>;
[SQL Statements]
QUIT;
Nate Derby PROC SQL: Why and How 6 / 30
Why?
How?
Conclusions
Introduction, Examples
Working with SAS Data Sets
Other Aspects
Easy Examples
PROC SQL Code
PROC SQL;
SELECT name, sex, age, height, weight
FROM sashelp.class;
QUIT;
Nate Derby PROC SQL: Why and How 7 / 30
Why?
How?
Conclusions
Introduction, Examples
Working with SAS Data Sets
Other Aspects
Easy Examples
PROC SQL Code
PROC SQL;
SELECT a.name, a.sex, a.age, a.height, a.weight
FROM sashelp.class a;
QUIT;
Nate Derby PROC SQL: Why and How 8 / 30
Why?
How?
Conclusions
Introduction, Examples
Working with SAS Data Sets
Other Aspects
Easy Examples
PROC SQL Code
PROC SQL;
SELECT boo.name, boo.sex, boo.age, boo.height, boo.weight
FROM sashelp.class boo;
QUIT;
Nate Derby PROC SQL: Why and How 9 / 30
Why?
How?
Conclusions
Introduction, Examples
Working with SAS Data Sets
Other Aspects
Easy Examples
PROC SQL Code
PROC SQL;
SELECT
*
FROM sashelp.class;
QUIT;
Nate Derby PROC SQL: Why and How 10 / 30
Why?
How?
Conclusions
Introduction, Examples
Working with SAS Data Sets
Other Aspects
Easy Examples
PROC SQL Code
PROC SQL;
SELECT a.
*
FROM sashelp.class a;
QUIT;
Nate Derby PROC SQL: Why and How 11 / 30
Why?
How?
Conclusions
Introduction, Examples
Working with SAS Data Sets
Other Aspects
Easy Examples
PROC SQL Code
PROC SQL;
SELECT name, sex, age, height format=6.2, weight
format=6.2
FROM sashelp.class;
QUIT;
Nate Derby PROC SQL: Why and How 12 / 30
Why?
How?
Conclusions
Introduction, Examples
Working with SAS Data Sets
Other Aspects
Easy Examples
PROC SQL Code
PROC SQL;
SELECT name, sex, age, height format=6.2, weight
format=6.2
FROM sashelp.class
WHERE sex = 'M';
QUIT;
Nate Derby PROC SQL: Why and How 13 / 30
Why?
How?
Conclusions
Introduction, Examples
Working with SAS Data Sets
Other Aspects
Easy Examples
PROC SQL Code
PROC SQL;
SELECT name, sex, age, height format=6.2, weight
format=6.2
FROM sashelp.class
WHERE sex = 'M'
ORDER BY age;
QUIT;
Nate Derby PROC SQL: Why and How 14 / 30
Why?
How?
Conclusions
Introduction, Examples
Working with SAS Data Sets
Other Aspects
More Interesting Example
PROC SQL Code
PROC SQL;
SELECT sex,
mean( age ) as mage label='Mean Age' format=6.2,
min( height ) as minh label='Min Height' format=6.2,
max( height ) as maxh label='Max Height' format=6.2
FROM sashelp.class
GROUP BY sex;
QUIT;
Nate Derby PROC SQL: Why and How 15 / 30
Why?
How?
Conclusions
Introduction, Examples
Working with SAS Data Sets
Other Aspects
Much More Interesting Example
Nate Derby PROC SQL: Why and How 16 / 30
Why?
How?
Conclusions
Introduction, Examples
Working with SAS Data Sets
Other Aspects
Much More Interesting Example
PROC SQL Code
PROC SQL;
SELECT a.name, a.sex, a.age, b.mage
FROM sashelp.class a JOIN (
SELECT sex, mean( age ) as mage
label='Mean Age by Sex' format=6.2
FROM sashelp.class
GROUP BY sex
) b
ON a.sex = b.sex
ORDER BY age, name;
QUIT;
Nate Derby PROC SQL: Why and How 17 / 30
Why?
How?
Conclusions
Introduction, Examples
Working with SAS Data Sets
Other Aspects
Creating Tables
PROC SQL Code
PROC SQL;NOPRINT;
CREATE TABLE work.blah AS
SELECT name, sex, age, height, weight
FROM sashelp.class;
QUIT;
Nate Derby PROC SQL: Why and How 18 / 30
Why?
How?
Conclusions
Introduction, Examples
Working with SAS Data Sets
Other Aspects
Creating Tables
PROC SQL Code
PROC SQL NOPRINT;
CREATE TABLE blah AS
SELECT sex, mean( age ) as mage
label='Mean Age by Sex' format=6.2
FROM sashelp.class
GROUP BY sex;
CREATE TABLE final AS
SELECT a.name, a.sex, a.age, b.mage
FROM sashelp.class a join blah b
ON a.sex = b.sex
ORDER BY age, name;
QUIT;
Nate Derby PROC SQL: Why and How 19 / 30
Why?
How?
Conclusions
Introduction, Examples
Working with SAS Data Sets
Other Aspects
Log Output
Since all within one PROC, we get timing for entire block of code:
Nate Derby PROC SQL: Why and How 20 / 30
Why?
How?
Conclusions
Introduction, Examples
Working with SAS Data Sets
Other Aspects
The STIMER Option
The STIMER gives times for each step within PROC SQL:
PROC SQL Code
PROC SQL NOPRINT STIMER;
CREATE TABLE blah AS
SELECT sex, mean( age ) as mage
label='Mean Age by Sex' format=6.2
FROM sashelp.class
GROUP BY sex;
CREATE TABLE final AS
SELECT a.name, a.sex, a.age, b.mage
FROM sashelp.class a join blah b
ON a.sex = b.sex
ORDER BY age, name;
QUIT;
Nate Derby PROC SQL: Why and How 21 / 30
Why?
How?
Conclusions
Introduction, Examples
Working with SAS Data Sets
Other Aspects
Log Output
Now we get listings for each step:
Nate Derby PROC SQL: Why and How 22 / 30
Why?
How?
Conclusions
Introduction, Examples
Working with SAS Data Sets
Other Aspects
Use Different Table Names
Valid in SAS code, but PROC SQL complains:
PROC SQL Code
PROC SQL NOPRINT STIMER;
CREATE TABLE blah AS
...;
CREATE TABLE blah AS
SELECT a.name, a.sex, a.age, b.mage
FROM sashelp.class a join blah b
ON a.sex = b.sex
ORDER BY age, name;
QUIT;
Nate Derby PROC SQL: Why and How 23 / 30
Why?
How?
Conclusions
Introduction, Examples
Working with SAS Data Sets
Other Aspects
ALTER, UPDATE, DELETE Statements
ALTER TABLE statement changes columns:
PROC SQL Code
PROC SQL NOPRINT STIMER;
CREATE TABLE blah AS
SELECT
*
FROM sashelp.class;
ALTER TABLE blah
DROP age
MODIFY height FORMAT=6.2, weight FORMAT=6.2;
QUIT;
The ALTER TABLE statement cannot be used to change the data!
Nate Derby PROC SQL: Why and How 24 / 30
Why?
How?
Conclusions
Introduction, Examples
Working with SAS Data Sets
Other Aspects
ALTER, UPDATE, DELETE Statements
UPDATE statement changes data, DELETE FROM deletes rows:
PROC SQL Code
PROC SQL NOPRINT STIMER;
CREATE TABLE blah AS
SELECT
*
FROM sashelp.class;
UPDATE blah
SET weight
WHERE name = 'Robert';
DELETE FROM blah
WHERE name = 'Judy';
QUIT;
UPDATE, DELETE FROM cannot be used to modify the columns!
Nate Derby PROC SQL: Why and How 25 / 30
Why?
How?
Conclusions
Introduction, Examples
Working with SAS Data Sets
Other Aspects
Views and Indexes
SQL was developed for databases, so PROC SQL works really well
with views and indexes:
A view a “virtual table” = executable instructions for a table,
but holds no data.
Great for getting updated info, restricting access.
An index is one or more columns used to identify each row
within a table
Great for cutting down on time for processing data in (very)
large tables.
More info: See Kirk Lafler’s book!
Nate Derby PROC SQL: Why and How 26 / 30
Why?
How?
Conclusions
Introduction, Examples
Working with SAS Data Sets
Other Aspects
Accessing a Database
Surprise! PROC SQL can access a database really well.
Weird: No one writes about this ... except Katherine Prairie.
Two ways:
Pass-Through facility:
Uses ODBC (external to SAS).
Nests non-SAS SQL code within SAS code.
Thus, much SAS functionality can’t be used. Sometimes have
to be creative with SAS code to make non-SAS code.
Also: Messy code!
SAS/ACCESS and the LIBNAME statement: Much cleaner
code, allows using SAS functionality with the database.
Nate Derby PROC SQL: Why and How 27 / 30
Why?
How?
Conclusions
Conclusions
PROC SQL can be a powerful tool!
Very flexibly queries (often more flexible than the standard
SAS DATA steps).
Sometimes easier than in standard SAS code.
Facilitates program portability.
Allows for views, indexes and integrity constraints.
Quicker than SAS code for large data sets.
For very large data sets, sometimes SAS code crashes, SQL
code doesn’t.
Nate Derby PROC SQL: Why and How 28 / 30
Appendix
Further Resources
Pete Lund.
An Introduction to SQL in SAS.
http://pugsug.org (Presentations), 2010.
Howard Schreier.
PROC SQL by Example: Using SQL within SAS.
SAS Press, 2008.
Kirk Lafler.
PROC SQL: Beyond the Basics Using SAS.
SAS Press, 2004.
Katherine Prairie.
The Essential PROC SQL Handbook.
SAS Press, 2005.
Nate Derby PROC SQL: Why and How 29 / 30
Appendix
Further Resources
Nate Derby: http://nderby.org
Nate Derby PROC SQL: Why and How 30 / 30