SQL FAQ: JOINS
Cartesian Products
When you join tables, make sure that the number of
join predicates in the
search condition
is one less than the number of tables in the
from list.
Otherwise, you will get many more rows returned than you probably intended.
For example, table english and spanish look like this:
* select * from english; * select * from spanish;
----------------------------- -----------------------------
|tag |name | |tag |name |
----------------------------- -----------------------------
|1 |one | |2 |dos |
|2 |two | |3 |tres |
|3 |three | |4 |cuatro |
----------------------------- -----------------------------
3 rows selected 3 rows selected
If you select from both tables without joining them in the where
clause, you get a cartesian product,
every possible combination of both:
* select * from english, spanish;
---------------------------------------------------------
|tag |name |tag |name |
---------------------------------------------------------
|2 |dos |1 |one |
|3 |tres |1 |one |
|4 |cuatro |1 |one |
|2 |dos |2 |two |
|3 |tres |2 |two |
|4 |cuatro |2 |two |
|2 |dos |3 |three |
|3 |tres |3 |three |
|4 |cuatro |3 |three |
---------------------------------------------------------
9 rows selected
Most likely, this is not what you had in mind. Since there are two tables in
the from_list,
one join predicated is needed:
* select * from english, spanish
where english.tag = spanish.tag;
---------------------------------------------------------
|tag |name |tag |name |
---------------------------------------------------------
|2 |dos |2 |two |
|3 |tres |3 |three |
---------------------------------------------------------
2 rows selected
Inner and Outer Joins
A join between two tables does not include any rows from either table
that have no matching rows in the other.
This is called an inner join
and frequently causes confusion since fewer rows are returned than
the user expects.
For example, tables english and spanish look like this:
* select * from english; * select * from spanish;
----------------------------- -----------------------------
|tag |name | |tag |name |
----------------------------- -----------------------------
|1 |one | |2 |dos |
|2 |two | |3 |tres |
|3 |three | |4 |cuatro |
----------------------------- -----------------------------
3 rows selected 3 rows selected
When you join these two tables, you get
only the two rows that have the same tag:
* select e.name, e.tag, s.name
from english e, spanish s
where e.tag = s.tag;
-------------------------------------------
|name |tag |name |
-------------------------------------------
|two |2 |dos |
|three |3 |tres |
-------------------------------------------
2 rows selected
Row one in table english and row cuatro in table spanish
fall into the outer joins:
Joins
+--------------+
left outer ---> | one 1 |
| +--------------+
+--> | two | 2 : dos |
inner join | | | : |
+--> | three | 3 : tres |
+--------|- - -+ |
| 4 cuatro| <--- right outer
+--------------+
You can select outer join rows by using not exists.
This query fetches the row in english that is not in
spanish (the left outer join):
* select e.name as English, e.tag, '--no row --' as Spanish
from english e
where not exists
(select * from spanish s
where e.tag=s.tag);
-------------------------------------------
|English |tag |Spanish |
-------------------------------------------
|one |1 |--no row -- |
-------------------------------------------
one row selected
This query fetches the row in spanish that is not in
english (the right outer join):
* select '--no entry--' as English, s.tag, s.name as Spanish
from spanish s
where not exists
(select * from english e
where e.tag=s.tag);
-------------------------------------------
|English |tag |Spanish |
-------------------------------------------
|--no entry-- |4 |cuatro |
-------------------------------------------
one row selected
You can string all statements together with union:
* select e.name::text as English, e.tag, s.name::text as Spanish
from english e, spanish s
where e.tag = s.tag
union
select e.name::text, e.tag, '--no entry--'::text
from english e
where not exists
(select * from spanish s
where e.tag=s.tag)
union
select '--no entry--'::text, s.tag, s.name::text
from spanish s
where not exists
(select * from english e
where e.tag=s.tag)
order by 2;
-------------------------------------------
|English |tag |Spanish |
-------------------------------------------
|one |1 |--no entry-- |
|two |2 |dos |
|three |3 |tres |
|--no entry-- |4 |cuatro |
-------------------------------------------
4 rows selected
If you think this is a lot of trouble to retrieve outer join data,
there's another way to handle known joins in Illustra that will
factor in outer join data. Keep reading.
Solving Outer Joins in Illustra with ref()
Confusion with outer joins was described above.
This section looks at another way to resolve outer join confusions
in Illustra by using ref().
We start by creating the two tables like this and inserting data:
create table spanish of new type spanish_t
(name varchar(20),
tag integer);
create table english of new type english_t
(name varchar(20),
tag integer,
sname ref(spanish_t));
insert into english (name, tag) values ('one', 1);
insert into english (name, tag) values ('two', 2);
insert into english (name, tag) values ('three', 3);
insert into spanish (name, tag) values ('dos', 2);
insert into spanish (name, tag) values ('tres', 3);
insert into spanish (name, tag) values ('cuatro', 4);
Next we update the reference in english:
* update english
set sname = (select unique ref(s1) from spanish s1
where english.tag = s1.tag);
3 rows updated
* select * from english;
-------------------------------------------
|name |tag |sname |
-------------------------------------------
|one |1 |NULL |
|two |2 |202d.2001 |
|three |3 |202d.2002 |
-------------------------------------------
3 rows selected
Notice that the select from english returned the oid reference to
spanish. You can dereference that oid as follows:
* select name as english, tag, deref(sname).name as spanish from english;
-------------------------------------------
|english |tag |spanish |
-------------------------------------------
|one |1 |NULL |
|two |2 |dos |
|three |3 |tres |
-------------------------------------------
3 rows selected
We can also take it the opposite way by updating the spanish_t
type and spanish table as follows:
* alter type spanish_t
add column ename ref(english_t);
* update spanish
set ename = (select unique ref(e1) from english e1
where spanish.tag = e1.tag);
3 rows updated
* select name as spanish, tag, deref(ename).name as english from spanish;
-------------------------------------------
|spanish |tag |english |
-------------------------------------------
|dos |2 |two |
|tres |3 |three |
|cuatro |4 |NULL |
-------------------------------------------
3 rows selected
Finally, we can use union to select from both:
* select name as english, tag, deref(sname).name as spanish from english
union
select deref(ename).name as english, tag, name as spanish from spanish
order by 2;
-------------------------------------------
|english |tag |spanish |
-------------------------------------------
|one |1 |NULL |
|two |2 |dos |
|three |3 |tres |
|NULL |4 |cuatro |
-------------------------------------------
4 rows selected
Realize that if new rows are inserted into either table, the reference
must be set in the tables that references it.
Back to top level FAQ.
Last modified 30-June-94 (Illustra Rev 2.0.12)
Jean Anderson (jta@postgres.berkeley.edu)