Joins in Oracle SQL


Joins allow the combination of selection of data from multiple tables and views. Note that a table can also be joined to itself, a self-join. The minimum number of joins required to relate a set of tables is the number of tables in the join less one. Thus a two table selection with no join returns a two table Cartesian Product of those two tables.

There are four different types of joins.

Equijoin

An equijoin uses any of the equality operators. An equijoin is usually used to display data from two or more tables which have common values existing in corresponding columns.

SELECT a.FIELD1, a.FIELD2, b.FIELD1, b.FIELD2
FROM TABLE1 a, TABLE2 b
WHERE a.FIELD1 = b.FIELD1;

Self-Join

A self-join joins a table to itself. Table aliases must be used.

SELECT a.FIELD1, a.FIELD2, b.FIELD1, b.FIELD2
FROM TABLE a, TABLE b
WHERE a.FIELD1 = b.FIELD1
AND a.FIELD2 = b.FIELD2;

Cartesian Product

Selecting from two tables without a WHERE clause and thus join condition results in a Cartesian Product. The number of rows in a Cartesian Product is the product of the rows in each table.

SELECT * FROM TABLE1, TABLE2;

Outer-Join

An outer join occurs when two tables are joined and one table does not have a matching row for a row in the other table. The outer join effectively is designed to resolve this problem, ie. how to retrieve the rows in one table not in the other as well as the interection between the two tables.

An outer join returns all rows satisfying the join condition in addition to all rows from one of the tables for which no rows from the other satisfy the join condition. (+) is the outer join operator and is placed on the side of the join statement deficient in information. The outer join operator can only be placed in the WHERE clause and applied to a column name. The outer join can not be used in an OR expression and can not be applied to a column using an IN condition or a subquery.

The SELECT statement shown below will retrieve the intersection of TABLE1 and TABLE2 in addition to all rows in TABLE2 which are not in TABLE1 based on the equality of FIELD1 and FIELD2 in both tables.

SELECT a.FIELD1, a.FIELD2, b.FIELD1, b.FIELD2
FROM TABLE1 a, TABLE2 b
WHERE a.FIELD1 = b.FIELD1(+)
AND a.FIELD2 = b.FIELD2(+);

Outer Join and UNIONs

Other techniques for handling the outer join problem is by use of the SQL UNION statement. The UNION statements allows the combination of two or more SELECT statements and their result row sets. Rows from each SELECT statement are stacked after each other and sorted and eliminate to remove duplicated rows.

In the following SQL statement the UNION statement is used to add a second SELECT statement where the results are combined into a single result row set.

SELECT D.DEPTNO, D.DNAME, COUNT(*)
FROM DEPT D, EMP E
WHERE D.DEPTNO = E.DEPTNO
GROUP BY D.DEPTNO, D.DNAME
UNION
SELECT DEPTNO, DNAME, 0
FROM DEPT
WHERE DEPTNO NOT IN
	(SELECT DISTINCT DEPTNO FROM EMP);

Note that an ORDER BY clause placed in a UNION of a set of SELECT statements must use the column relative integer format.

SELECT A, B, 1 FROM TABLE1 WHERE ...
UNION
SELECT X, Y, 2 FROM TABLE2 WHERE ...
ORDER BY 3,1;

Indexes and Joins

Indexes store information about the location of records in a database. Indexes can be used to speed up queries and certain aggregate functions. Joins can significanlty affect speed of database retrieval. Thus appropriate use of optimisation index matching is effective in increasing data retrieval speeds.

An index is a compact file containing information about the physical location of records in a database. Such indexes are actually pointers. In the absense of appropriate indexes the relational database will always scan a whole table or set of tabnles in a join. Indexes can assist SQL statements to locate database records more rapidly without requiring full table scans of data rows. Different indexes are chosen for use by the internal query optimiser. A database index is much like the indexing catalogue in a libary. It would take one much longer to find a book if one would have to search for a book by walking through the entire library and looking at every book.

Indexes will speed-up database retrieval but will slow database updating since more than one file or object is being updated for each database SQL alteration statement.

Indexes are particularly important in the case of complex queries, including those complex queries containing joins. A SELECT COUNT(*) FROM ... statement would be able to use any index in the selected from table. However, if a WHERE clause were used which joined two tables where one table had an applicaable index and the other table did not then the resulting join would lead to a full table-scan on the latter table. This join would be a slow database retrieval. An appropriate index on the latter table would speed up the join and thus the database retrieval speed.

Views vs Complex Joins

Views can simplify data access. A complex query can be defined as a view. Subsequently querying the view can save time and ensure accuracy in future.