MANY-TABLE JOINS IN MYSQL - BACKGROUND
Data held in SQL tables should be normalised - in other words, held in neat multiple tables with complete rows, only one piece of logical data per cell, and with no information repeated in multiple places. (The "why" is off topic for this article, but it basically helps data maintenance and integrity no end.)
Multiple normalised tables can be linked together within select commands, and this linking is known as joining. When you specify a join, you also specify a criterion to tell MySQL how to make the connection, and that's typically done using a key. Let's see a simple example.
Two tables - bdg containing buildings ....
+------+-----+
| name | bid |
+------+-----+
| 404  |   1 |
| 405  |   2 |
+------+-----+
... and res containing residents living there.
+--------+------+-----+
| person | bid  | rid |
+--------+------+-----+
| Graham |    1 | 101 |
| Lisa   |    1 | 102 |
+--------+------+-----+
When I connect (join) those tables together, I wish to do so by linking the "bid"s - and the syntax I use is:
select * from bdg, res where bdg.bid = res.bid ;
You'll notice that I DON'T use the word join (I could ... but that's another story). Here's my output:
+------+-----+--------+------+-----+
| name | bid | person | bid  | rid |
+------+-----+--------+------+-----+
| 404  |   1 | Graham |    1 | 101 |
| 404  |   1 | Lisa   |    1 | 102 |
+------+-----+--------+------+-----+
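For comparison, here is a sketch of the same query written with the join keyword, assuming the bdg and res tables shown above already exist. The comma-and-where form and the explicit inner join form should return the same two rows; which one you use is largely a matter of taste.

```sql
-- Explicit inner join: equivalent to "from bdg, res where bdg.bid = res.bid".
-- Assumes the bdg and res tables from this article have been created and loaded.
select *
  from bdg
  inner join res on bdg.bid = res.bid;
```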
Which is good - in other words, it's what I expected. BUT ... it might be that I want to see at least one row on my report for each of the incoming rows in (say) my building table - to alert me to buildings that don't match any resident records at all. That can be done using a LEFT JOIN in my select:
select * from bdg left join res on bdg.bid = res.bid ;
which gives:
+------+-----+--------+------+------+
| name | bid | person | bid  | rid  |
+------+-----+--------+------+------+
| 404  |   1 | Graham |    1 |  101 |
| 404  |   1 | Lisa   |    1 |  102 |
| 405  |   2 | NULL   | NULL | NULL |
+------+-----+--------+------+------+
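Because select * returns every column from both tables, bid appears twice in the output. One possible tidy-up, sketched here with the table aliases b and r (names chosen purely for illustration), is to list only the columns of interest. With the sample data this should show Graham and Lisa against building 404 and a NULL person against building 405.

```sql
-- Same left join as above, but naming the wanted columns instead of using "*".
-- The aliases b and r are illustrative; any alias names would do.
select b.name, r.person
  from bdg as b
  left join res as r on b.bid = r.bid
 order by b.name, r.person;
```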
THREE WAY JOINS
Regular joins and left joins can be extended to three or more tables - the principle is easy but the syntax less so. Let's say that we had a third table called dom containing the names of any internet domains registered to each individual:
+-----------------------+------+-----+
| domain                | rid  | did |
+-----------------------+------+-----+
| www.grahamellis.co.uk |  101 | 201 |
| www.sheepbingo.co.uk  |  101 | 202 |
+-----------------------+------+-----+
A regular join on the (now) three tables is straightforward:
select * from bdg, res, dom where bdg.bid = res.bid and res.rid = dom.rid;
and gives the following result:
+------+-----+--------+------+-----+-----------------------+------+-----+
| name | bid | person | bid  | rid | domain                | rid  | did |
+------+-----+--------+------+-----+-----------------------+------+-----+
| 404  |   1 | Graham |    1 | 101 | www.grahamellis.co.uk |  101 | 201 |
| 404  |   1 | Graham |    1 | 101 | www.sheepbingo.co.uk  |  101 | 202 |
+------+-----+--------+------+-----+-----------------------+------+-----+
The syntax for a three way LEFT JOIN is more complex (and thus the inspiration for this article):
select * from (bdg left join res on bdg.bid = res.bid) left join dom on res.rid = dom.rid;
and gives the following result:
+------+-----+--------+------+------+-----------------------+------+------+
| name | bid | person | bid  | rid  | domain                | rid  | did  |
+------+-----+--------+------+------+-----------------------+------+------+
| 404  |   1 | Graham |    1 |  101 | www.grahamellis.co.uk |  101 |  201 |
| 404  |   1 | Graham |    1 |  101 | www.sheepbingo.co.uk  |  101 |  202 |
| 404  |   1 | Lisa   |    1 |  102 | NULL                  | NULL | NULL |
| 405  |   2 | NULL   | NULL | NULL | NULL                  | NULL | NULL |
+------+-----+--------+------+------+-----------------------+------+------+
Notice that our report now includes orphan records at both join levels - entries in the bdg table that have no corresponding entry in the res table, and entries in the res table that have no corresponding entry in the dom table.
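The parentheses in the three-way left join make the order of evaluation explicit, but MySQL also processes a chain of joins from left to right, so the unparenthesised form below should give the same four rows. Treat it as a sketch to try against the sample tables rather than a replacement for the syntax above.

```sql
-- Three-way left join without parentheses; joins are evaluated left to right,
-- so bdg is joined to res first and that result is then joined to dom.
select *
  from bdg
  left join res on bdg.bid = res.bid
  left join dom on res.rid = dom.rid;
```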
THREE WAY JOINS - LOOKING FOR INCOMPLETE RECORDS
Should we wish to report on orphan records only, we can do so by testing for NULL in a column that could not otherwise be NULL in the joined result - the key column used in the join condition is the obvious choice.
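One detail worth checking for yourself: the test has to be written with is NULL (or is not NULL), because an ordinary comparison against NULL never yields true. A quick sketch that needs no tables at all; the column aliases are just illustrative labels:

```sql
-- "= NULL" yields NULL (unknown), so a where clause using it matches no rows;
-- "is NULL" yields 1 (true) or 0 (false).
select NULL = NULL  as equals_test,
       NULL is NULL as is_null_test;
```

In MySQL this should report NULL for equals_test and 1 for is_null_test.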
Example - looking for all incomplete records:
select * from (bdg left join res on bdg.bid = res.bid) left join dom on res.rid = dom.rid where dom.rid is NULL;
+------+-----+--------+------+------+--------+------+------+
| name | bid | person | bid  | rid  | domain | rid  | did  |
+------+-----+--------+------+------+--------+------+------+
| 404  |   1 | Lisa   |    1 |  102 | NULL   | NULL | NULL |
| 405  |   2 | NULL   | NULL | NULL | NULL   | NULL | NULL |
+------+-----+--------+------+------+--------+------+------+
Example - looking for all buildings with no residents:
select * from (bdg left join res on bdg.bid = res.bid) left join dom on res.rid = dom.rid where res.rid is NULL;
+------+-----+--------+------+------+--------+------+------+
| name | bid | person | bid  | rid  | domain | rid  | did  |
+------+-----+--------+------+------+--------+------+------+
| 405  |   2 | NULL   | NULL | NULL | NULL   | NULL | NULL |
+------+-----+--------+------+------+--------+------+------+
(Hey - you really don't need to join in the domain table for this)
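Following that remark, here is a two-table sketch of the same report, assuming the same sample data. It selects only the bdg columns, since every res column is NULL by definition for an unmatched building; with the sample data it should return just building 405.

```sql
-- Buildings with no residents, without involving the dom table.
select bdg.*
  from bdg
  left join res on bdg.bid = res.bid
 where res.rid is NULL;
```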
Example - looking for all residents with no domains:
select * from (bdg left join res on bdg.bid = res.bid) left join dom on res.rid = dom.rid where dom.rid is NULL and res.rid is not NULL;
+------+-----+--------+------+------+--------+------+------+
| name | bid | person | bid  | rid  | domain | rid  | did  |
+------+-----+--------+------+------+--------+------+------+
| 404  |   1 | Lisa   |    1 |  102 | NULL   | NULL | NULL |
+------+-----+--------+------+------+--------+------+------+
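An alternative sketch for the same report mixes the two kinds of join. An inner join between bdg and res discards buildings without residents up front, so the extra "res.rid is not NULL" test is no longer needed; the left join to dom then keeps residents who have no domain. This is one possible phrasing under the same sample data, not the only way to write it, and it should return the single row for Lisa in building 404.

```sql
-- Residents with no domains: inner join for bdg/res, left join for dom.
select bdg.name, res.person
  from bdg
  inner join res on bdg.bid = res.bid
  left join dom on res.rid = dom.rid
 where dom.rid is NULL;
```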
SUMMARY OF MYSQL COMMANDS USED
Here's a complete set of the commands I used to set up this example - you're welcome to cut and paste it for your own testing and experimentation:
use test;
drop table if exists bdg;
drop table if exists res;
drop table if exists dom;
create table bdg (name text, bid int primary key);
create table res (person text, bid int, rid int primary key);
create table dom (domain text, rid int, did int primary key);
insert into bdg values ('404',1);
insert into res values ('Graham',1,101);
insert into dom values ('www.grahamellis.co.uk',101,201);
insert into dom values ('www.sheepbingo.co.uk',101,202);
insert into res values ('Lisa',1,102);
insert into bdg values ('405',2);
select * from bdg;
select * from res;
select * from dom;
select * from bdg, res where bdg.bid = res.bid ;
select * from bdg, res, dom where bdg.bid = res.bid and
    res.rid = dom.rid;
select * from bdg left join res on bdg.bid = res.bid ;
select * from (bdg left join res on bdg.bid = res.bid)
    left join dom on res.rid = dom.rid;
select * from (bdg left join res on bdg.bid = res.bid)
    left join dom on res.rid = dom.rid where dom.rid is NULL;
select * from (bdg left join res on bdg.bid = res.bid)
    left join dom on res.rid = dom.rid where res.rid is NULL;
select * from (bdg left join res on bdg.bid = res.bid)
    left join dom on res.rid = dom.rid
    where dom.rid is NULL and res.rid is not NULL;