Flattening a subquery into an EXISTS join

An EXISTS join is a join in which the right side of the join needs to be probed only once for each outer row. Using such a definition, an EXISTS join does not literally use the EXISTS keyword. Derby treats a statement as an EXISTS join when there will be at most one matching row from the right side of the join for a given row in the outer table.

A subquery that cannot be flattened into a normal join because of a uniqueness condition can be flattened into an EXISTS join if it meets all the requirements (see below). Recall the first example from the previous section (Flattening a subquery into a normal join):

SELECT c1 FROM t1 WHERE c1 IN (SELECT c1 FROM t2)

This query could not be flattened into a normal join because such a join would return the wrong results. However, this query can be flattened into a join recognized internally by the Derby system as an EXISTS join. When processing an EXISTS join, Derby knows to stop processing the right side of the join after a single row is returned. The transformed statement would look something like this:

SELECT c1 FROM t1, t2
WHERE t1.c1 = t2.c1
EXISTS JOIN INTERNAL SYNTAX

Requirements for flattening into an EXISTS join:

The subquery is not under an OR.
The subquery type is EXISTS, IN, or ANY.
The subquery is not in the SELECT list of the outer query block.
There are no aggregates in the SELECT list of the subquery.
The subquery does not have a GROUP BY clause.
The subquery has a single entry in its FROM list that is a base table.
None of the predicates in the subquery, including the additional one formed between the left side of the subquery operator and the column in the subquery's SELECT list (for IN or ANY subqueries), include any subqueries, method calls, or field accesses.

When a subquery is flattened into an EXISTS join, the table from the subquery is made join-order-dependent on all the tables with which it is correlated. This means that a table must appear inner to all the tables on which it is join-order-dependent. (In subsequent releases this restrictions can be relaxed.) For example:

SELECT t1.* FROM t1, t2
WHERE EXISTS (SELECT * FROM t3 WHERE t1.c1 = t3.c1)

gets flattened into

SELECT t1.* FROM t1, t2, t3 WHERE t1.c1 = t3.c1

where t3 is join order dependent on t1. This means that the possible join orders are (t1, t2, t3), (t1, t3, t2), and (t2, t1, t3).