Delphi Magazine-Getting Started With SQL Part 2.pdf

(83 KB) Pobierz
Surviving Client/Server: Getting Started With SQL Part 2
Surviving Client/Server:
Getting Started With SQL Part 2
by Steve Troxell
issue, we took the plunge into
SQL and got a crash course in the
basic operations of SELECT,
INSERT, UPDATE and DELETE . You
may have noticed that SELECT was a
pretty powerful statement since
over half of the material covered
that statement alone. Well, we’re
going to cover even more about
SELECT and we still won’t have got-
ten to all of it. This time around
we’re going to learn about totalling
and subtotalling, removing dupli-
cates, grouping data, conditional
calculations, and joining unrelated
tables. These are some of the more
practical tools used to create
useful reports out of SQL data.
As before, we will use the sample
employee database that ships with
Delphi for all our examples. You
may want to try the examples with
ISQL as you read the text. The
sample database can be found in
\IBLOCAL\EXAMPLES\EMPLOYEE.GDB .
To get started with ISQL check out
the Local Interbase Server User’s
Guide, or see the last issue’s
column for instructions.
SELECT MAX(salary),
MIN(salary), AVG(salary)
FROM employee;
MAX MIN AVG
================ ================ ================
99000000.00 22935.00 2750534.95
ä
Figure 1
SELECT MAX(salary), MIN(salary), AVG(salary) FROM employee
WHERE dept_no = 621;
MAX MIN AVG
================ ================ ================
975000.00 36000.00 69184.88
ä
Figure 2
Note that the result set returned
by the query consists of a single
row with a single column (the SUM
column). The aggregate function
operates on each row specified by
the query, but only the final (aggre-
gate) answer is returned by the
query. There are five aggregate
functions in SQL:
ã
but it can also be an expression.
For example, the following query
returns the average unit price for
all orders in the sales table:
SELECT AVG(total_value /
qty_ordered) FROM sales;
SUM() The total of the values in
the argument,
AVG
======================
4563.823426619132
ã
AVG() The average of the values
in the argument,
In addition, the COUNT() function
can include a special argument.
COUNT(columnname) returns the
number of values in the column
excluding nulls . However, you can
use COUNT(*) to return the total
number of rows in the result re-
gardless of any nulls in any column.
The SUM(), AVG() and COUNT()
aggregate functions can include
the DISTINCT keyword within the
parentheses to force the function
to ignore any duplicate values in
the column. If you wanted to know
the number of unique job codes in
the employee table, a regular
COUNT(job_code) would return the
total number of employees (each
one has a value in the job_code
column). But COUNT(DISTINCT
job_code) would return the correct
Aggregate Functions
Aggregate functions are among the
more powerful features of SQL.
When used appropriately, they can
shift a great deal of reporting work
from the client application onto the
database server. Aggregate
functions return a single value for
a given set of rows. For example,
the query
ã
COUNT() The number of non-null
values in the argument,
ã
MAX() The greatest value in the
argument,
MIN() The least value in the
argument.
To find the highest salary, lowest
salary, and average salary for the
company, use the query shown in
Figure 1.
Aggregate functions work on any
set you can query with a SELECT
statement. For example, if you
wanted to see the salary figures for
a single department instead of the
whole company, just add a WHERE
clause to the query, as shown in
Figure 2.
The argument to an aggregate
function is usually a table column,
SELECT SUM(salary) FROM
employee;
returns the total of the salaries for
all employees:
SUM
================
115522468.00
14
The Delphi Magazine
Issue 4
A s you may recall from the last
ã
287640204.002.png
number of different job codes
regardless of how many employees
were assigned to the same code:
SELECT customer FROM customer, sales
WHERE paid = ’n’ AND sales.cust_no = customer.cust_no
ORDER BY customer;
SELECT COUNT(job_code),
COUNT(DISTINCT job_code)
FROM employee;
CUSTOMER
=========================
3D-Pad Corp.
3D-Pad Corp.
Anini Vacation Rentals
Anini Vacation Rentals
Buttle, Griffith and Co.
DT Systems, LTD.
Dallas Technologies
Dallas Technologies
GeoTech Inc.
Lorenzi Export, Ltd.
Max
Signature Design
Signature Design
Signature Design
COUNT COUNT
=========== ===========
42 13
Eliminating Duplicate Rows
Consider the problem of creating a
report of all the customers who
have not yet paid for their orders.
To do this, we must look at the
sales table to see which orders
have not been paid, and join that
with the customer table to get the
name of the customer:
ä
Figure 3
SELECT DISTINCT customer FROM customer, sales
WHERE paid = ’n’ AND sales.cust_no = customer.cust_no
ORDER BY customer;
SELECT customer FROM customer,
sales
WHERE paid = ’n’ AND
sales.cust_no =
customer.cust_no
ORDER BY customer;
CUSTOMER
=========================
3D-Pad Corp.
Anini Vacation Rentals
Buttle, Griffith and Co.
DT Systems, LTD.
Dallas Technologies
GeoTech Inc.
Lorenzi Export, Ltd.
Max
Signature Design
This statement selects all the
unpaid orders from sales, deter-
mined by the value of the paid
column. The cust_no column is the
link we use to join the sales and
customers tables. Take a look at
the result set for this query shown
in Figure 3. Notice that several
customers are listed more than
once. This is because our query
returns a row for every unpaid
order, and these customers have
more than one order that has not
been paid. To make the query
return a single row for every
customer with an unpaid order,
regardless of how many unpaid
orders they have, we have to get
rid of the duplicate rows. To do this
we use the DISTINCT keyword
to return only distinct (non-
duplicating) rows (see Figure 4):
ä
Figure 4
automatically eliminate any dupli-
cation of rows. Duplication is
defined as identical values for all
columns in the select list, not just
the first one. For example, suppose
we wanted to add the amount of
each order that has not been paid
to our report:
total amount, then those entries
would be combined into one.
Be sure to note the difference in
DISTINCT when used in the select
list (as shown above) versus when
used within an aggregate function.
In a select list, DISTINCT eliminates
duplicate rows from the display. In
an aggregate function, DISTINCT
eliminates duplicate values (not
the entire row) from the aggregate
computation.
SELECT DISTINCT customer,
total_value FROM
customer, sales
WHERE paid = ’n’ AND
sales.cust_no =
customer.cust_no
ORDER BY customer;
Grouping Data
Since the query shown above does
not work, how do we obtain a non-
duplicating list of customers and
the total of their unpaid orders?
Well, “total of unpaid orders”
implies a calculation and for that
we need to use an aggregate func-
tion. But until now, we’ve only used
aggregate functions to compute a
result for the entire set returned by
the SELECT . In this case we need to
compute a new result for each set
of rows for a given customer. To
SELECT DISTINCT customer FROM
customer, sales
WHERE paid = ’n’ AND
sales.cust_no =
customer.cust_no
ORDER BY customer;
Take a look at the results shown in
Figure 5. We have duplication of
the customer name again, even
though we used the DISTINCT
keyword. That’s because DISTINCT
operates on all the columns of the
select list (in this case customer and
total_value ). In fact, if it happened
that the same customer had two or
more unpaid orders with the same
By including the DISTINCT keyword
in any SELECT query, we can
November 1995
The Delphi Magazine
15
287640204.003.png
accomplish this, we need to use a
GROUP BY clause.
GROUP BY is another mechanism
for removing duplicate rows. We
could have reworked our query in
Figure 4 to use GROUP BY to produce
the same result set:
SELECT DISTINCT customer, total_value FROM customer, sales
WHERE paid = ’n’ AND sales.cust_no = customer.cust_no
ORDER BY customer;
CUSTOMER TOTAL_VALUE
========================= ===========
3D-Pad Corp. 999.98
3D-Pad Corp. 10000.00
Anini Vacation Rentals 9000.00
Anini Vacation Rentals 16000.00
Buttle, Griffith and Co. 0.00
DT Systems, LTD. 9000.00
Dallas Technologies 14850.00
Dallas Technologies 20000.00
GeoTech Inc. 1500.00
Lorenzi Export, Ltd. 2693.00
Max 490.69
Signature Design 3399.15
Signature Design 60000.00
Signature Design 422210.97
SELECT customer FROM
customers, sales
WHERE paid = ’n’ AND
sales.cust_no =
customer.cust_no
GROUP BY customer;
Like DISTINCT, GROUP BY will return
only a single row for each group of
rows that match on the fields listed
after GROUP BY . But GROUP BY is far
more useful than this. We can use
this clause in combination with
aggregate functions to perform
basic calculations on the set of
rows represented by each row in
the result set. To get our list of
outstanding customers, we simply
use the query shown in Figure 6.
Notice that although only one
row is shown for each customer,
the aggregate function SUM() still
uses all the rows for each customer
to make the calculation (that is,
all the rows the WHERE clause
permitted).
Another example of how this
might be useful to us would be a
sales report showing number of
sales, total sales, and average sale
by salesman. The query in Figure 7
produces such a report.
Notice how the output of the
aggregate functions in this query
differs from what we had when we
used them in a regular SELECT state-
ment (without a GROUP BY clause).
The GROUP BY clause creates sets of
rows with matching values, but
only shows one row for each set in
the final output. When an aggregate
function is used in this type of
SELECT statement, it returns a value
for each group of rows, instead of
a value for the entire result set. In
fact, GROUP BY and aggregate
functions are most often used
together.
GROUP BY can begin to cause prob-
lems when you have more than one
table column in the select list.
Although there are four columns in
the output of the query shown
ä
Figure 5
SELECT customer, SUM(total_value)
FROM customer, sales
WHERE paid = ’n’ AND sales.cust_no = customer.cust_no
GROUP BY customer;
CUSTOMER SUM
========================= ===========
3D-Pad Corp. 10999.98
Anini Vacation Rentals 25000.00
Buttle, Griffith and Co. 0.00
DT Systems, LTD. 9000.00
Dallas Technologies 34850.00
GeoTech Inc. 1500.00
Lorenzi Export, Ltd. 2693.00
Max 490.69
Signature Design 485610.12
ä
Figure 6
SELECT full_name,
COUNT(*) AS num_sales,
SUM(total_value) AS total_sales,
AVG(total_value) AS avg_sale
FROM sales, employee
WHERE sales.sales_rep = employee.emp_no
GROUP BY full_name;
FULL_NAME NUM_SALES TOTAL_SALES AVG_SALE
=========================== =========== =========== ===========
Ferrari, Roberto 2 122693.00 61346.50
Glon, Jacques 5 462600.49 92520.10
Leung, Luke 5 37475.69 7495.14
Osborne, Pierre 1 1980.72 1980.72
Sutherland, Claudia 3 960008.00 320002.67
Weston, K. J. 8 139450.50 17431.31
Yamamoto, Takashi 3 24190.40 8063.47
Yanowski, Michael 6 502192.23 83698.71
ä
Figure 7
above, only one of them, full_name ,
is a table column, the other three
are aggregate function calcula-
tions. Take a look at this query:
This doesn’t work (Interbase
produces an “invalid column refer-
ence” error). The rows are
arranged by sales_rep and, like
DISTINCT , the rows must match on
all of the columns in the select list.
In this case, each salesman is likely
to have different order quantities
SELECT sales_rep, qty_ordered
FROM sales
GROUP BY sales_rep;
16
The Delphi Magazine
Issue 4
287640204.004.png
on each row, making them unique
rows that cannot be grouped
together. You must use GROUP BY
sales_rep, qty_ordered to make
this query work. However, this
returns a row for each unique
salesman/quantity combination,
not each unique salesman. That
may or may not be what you were
looking for.
Some SQL servers require that
all the table columns in the select
list (excluding aggregate func-
tions) appear in the GROUP BY clause
as well, even if there is a one-to-one
relationship in the values of all the
selected columns. For example, the
query:
SELECT full_name,
COUNT(*) AS num_sales,
SUM(total_value) AS total_sales,
AVG(total_value) AS avg_sale
FROM sales, employee
WHERE sales.sales_rep = employee.emp_no
GROUP BY full_name
HAVING SUM(total_value) > 250000;
FULL_NAME NUM_SALES TOTAL_SALES AVG_SALE
=========================== =========== =========== ===========
Glon, Jacques 5 462600.49 92520.10
Sutherland, Claudia 3 960008.00 320002.67
Yanowski, Michael 6 502192.23 83698.71
ä
Figure 8
SELECT full_name,
COUNT(*) AS num_sales,
SUM(total_value) AS total_sales,
AVG(total_value) AS avg_sale
FROM sales, employee
WHERE sales.sales_rep = employee.emp_no AND
paid = ’y’
GROUP BY full_name
HAVING SUM(total_value) > 250000;
SELECT sales_rep, full_name,
COUNT(*) as num_sales
FROM sales, employee
WHERE sales.sales_rep =
employee.emp_no;
GROUP BY sales_rep;
FULL_NAME NUM_SALES TOTAL_SALES AVG_SALE
=========================== =========== =========== ===========
Glon, Jacques 2 450100.51 225050.26
Sutherland, Claudia 3 960008.00 320002.67
is illegal in Interbase, even though
each instance of duplication in the
sales_rep column is matched
exactly by duplication in the
full_name column (thereby con-
ceptually allowing them to be
grouped together). Interbase
requires all non-aggregate function
items in the select list also appear
in the GROUP BY clause.
ä
Figure 9
which groups to include in the
output . The query in Figure 8 uses a
HAVING clause to produce the report
we want.
In this case the WHERE clause only
serves to make the join between
the sales and employee tables; it
does not filter any rows. The HAVING
clause limits the groups that are
returned by the query,
Let’s do the same sales report
again, but this time let’s only
include paid orders. The query and
results are shown in Figure 9.
Notice in the results that the
aggregate values are different.
That’s because the WHERE clause
threw out some rows before they
got grouped (and hence aggre-
gated) by GROUP BY . Also notice that
we now have one less row because
one salesman’s new aggregate total
falls below our threshold of
$250,000.
department (dept 621) will receive
a 20% increase and everyone in the
Marketing department (dept 180)
will receive a 5% increase [Now
that looks like a company with real
vision! Editor] . Recall from last
issue’s column that we can easily
include a calculation within a
SELECT statement to compute the
new salary for each employee. But
in this case, we need to apply one
of two different calculations to the
same column depending upon
which department the employee is
in. We could do this with two sepa-
rate queries, but we’d rather have
a nice, clean, single output result
set for employees from both
departments. To do this, we can
combine the results from our two
queries in a union by connecting
the SELECT statements with the
UNION keyword, as in Figure 10.
Take a look at the output for this
query. As you can see, the salary
calculations were performed as
desired depending on department.
A union simply combines the
output from multiple SELECT s into a
single result set. And unions are
not limited to just two SELECT s; you
Limiting Groups
Given our sales report shown in
Figure 7, suppose we wanted to
restrict the report to salesmen
having more than $250,000 in sales.
Our first instinct might be to add
the clause WHERE SUM(total_value)
> 250000 to our query, but this
produces an error because aggre-
gate functions are not allowed
within a WHERE clause. This is sensi-
ble because WHERE prevents rows
that don’t match the criteria from
being processed, yet aggregate
functions must process the rows to
get an aggregate answer. It’s a
chicken-or-egg paradox: you can’t
determine which rows to process
based on a function that requires
processing the rows.
The solution is to use the HAVING
clause. Whereas WHERE determines
which rows to include in the
processing , HAVING determines
Unions
Let’s say it’s annual performance
review time and you want a report
to project the planned salary in-
creases for two departments. Eve-
ryone in the Software Development
November 1995
The Delphi Magazine
17
287640204.005.png
can string several SELECT s together
with a UNION between each. How-
ever, be wary of using unions on
large tables. Each SELECT within the
union is performed independently
and the results combined, so you
could take a performance hit due to
repetitive table traversal, particu-
larly if the SELECT cannot take
advantage of an index.
The most significant thing to
notice about union output is that it
does not contain all the rows from
the first SELECT followed by all the
rows from the second SELECT (as
shown by the dept_no column in
Figure 10). In fact, the rows have
been ordered on the first column
even though we did not include an
ORDER BY clause. Interbase automat-
ically orders results from a union;
some other SQL servers do not. If
you want to force the ordering of a
union, you simply add an ORDER BY
clause to the last SELECT in the un-
ion. The ORDER BY will be applied to
the entire result set. In Interbase,
when the ORDER BY clause is used
within a union, you must specify
the columns to order on by
number, as in ORDER BY 2 . Other
SQL servers allow column names
to be used as well.
One more thing to notice in
Figure 10 is that we did not give a
column heading to the new salary
calculation. In a regular non-union
SELECT statement we would use a
column alias such as salary * 1.20
as new_salary . Unfortunately, for
some peculiar reason, Interbase
ignores column aliases in unions.
Also, I have seen cases of unions
containing more than two SELECT s
where all of the column names are
mysteriously dropped from the
output.
Not only can a union be used to
perform conditional calculations
on a single table, as demonstrated
in our previous example, but we
can also use a union to pull rows
from different tables into the same
result set. This is not the same as a
join because there is not necessar-
ily any relationship between the
separate tables. For example, let’s
suppose our company is throwing
a product launch party. We’re go-
ing to invite employees and cus-
tomers to the party and we need an
SELECT full_name, dept_no, salary, salary * 1.20
FROM employee
WHERE dept_no = 621
UNION
SELECT full_name, dept_no, salary, salary * 1.05
FROM employee
WHERE dept_no = 180;
FULL_NAME DEPT_NO SALARY
====================== ======= =============== ======================
Bishop, Dana 621 62550.00 75060
Green, T.J. 621 36000.00 43200
Johnson, Leslie 180 64635.00 67866.75
Nordstrom, Carol 180 42742.50 44879.625
Ramanathan, Ashok 621 80689.50 96827.39999999999
Young, Bruce 621 97500.00 117000
ä
Figure 10
SELECT first_name, last_name, ’empl’ FROM employee
UNION
SELECT contact_first, contact_last, ’cust’ FROM customer
FIRST_NAME LAST_NAME
=============== ==================== ======
Andreas Lorenzi cust
Ann Bennet empl
Ashok Ramanathan empl
Bill Parker empl
Bruce Young empl
Carol Nordstrom empl
Chris Papadopoulos empl
Claudia Sutherland empl
Dale J. Little cust
Dana Bishop empl
Elizabeth Brocket cust
Glen Brown cust
Greta Hessels cust
Jacques Glon empl
James Buttle cust
Janet Baldwin empl
Jennifer M. Burbank empl
John Montgomery empl
K. J. Weston empl
K.M. Neppelenbroek cust
ä
Figure 11 (partial listing of rows)
invitation list. This will entail a
SELECT from both the employee
table and the customer table (see
Figure 11):
data types. Notice that the actual
column names do not have to be
the same; the column names from
the first SELECT are used as the
column names for the entire output
of the union.
In this query we included a literal
in each SELECT to identify whether
the person on the invitation list is
an employee or a customer. Al-
though you can place literals in any
SELECT statement, when used in a
union, you must make sure that the
literals in each SELECT are the same
length. If not, Interbase issues a
“data type unknown” error; the
same error that occurs if you use
columns of different data types.
SELECT first_name, last_name,
’empl’ FROM employee
UNION
SELECT contact_first,
contact_last, ’cust’
FROM customer
This query returns all the rows
from employee and all the rows
from customer combined into a
single result set. There is no linkage
of the rows between tables as
you’d have in a join. The only
requirements are that each of the
SELECT s must have the same num-
ber of items in the select list and
those items must be of compatible
Ordering Clauses
Finally, I’d like to point out a minor
but very important detail. When
18
The Delphi Magazine
Issue 4
287640204.001.png
Zgłoś jeśli naruszono regulamin