Materialization

Materialization means that a subquery is evaluated only once. Several types of subqueries can be materialized.

Expression subqueries that are not correlated

A subquery can be materialized if it is a noncorrelated expression subquery. A correlated subquery is one that references columns in the outer query, and so has to be evaluated for each row in the outer query.

For example:

SELECT * FROM Staff
WHERE id = (SELECT MAX(manager) FROM Org)

In this statement, the subquery needs to be evaluated only once.

This type of subquery must return only one row. If evaluating the subquery causes a cardinality violation (if it returns more than one row), an exception is thrown when the subquery is run.

Subquery materialization is detected before optimization, which allows the Derby optimizer to see a materialized subquery as an unknown constant value. The comparison is therefore optimizable.

The original statement is transformed into the following two statements:

constant = SELECT MAX(manager) FROM Org
SELECT * FROM Staff
WHERE id = constant

The second statement is optimizable.

Subqueries that cannot be flattened

Materialization of a subquery can also occur when the subquery is nonflattenable and there is an equijoin between the subquery and another FROM table in the query.

For example:

SELECT i, a FROM t1, 
   (SELECT DISTINCT a FROM T2) x1  
WHERE t1.i = x1.a AND t1.i in (1, 3, 5, 7)

In this example, the subquery x1 is noncorrelated because it does not reference any of the columns from the outer query. The subquery is nonflattenable because of the DISTINCT keyword. Derby does not flatten DISTINCT subqueries. This subquery is eligible for materialization. Since there is an equijoin predicate between the subquery x1 and the table t1 (namely, t1.i = x1.a), the Derby optimizer will consider performing a hash join between t1 and x1 (with x1 as the inner operand). If that approach yields the best cost, Derby materializes the subquery x1 to perform the hash join. The subquery is evaluated only a single time, and the results are stored in an in-memory hash table. Derby then executes the join using the in-memory result set for x1.

Related concepts
Flattening a subquery into a normal join
Flattening a subquery into an EXISTS join
Flattening VALUES subqueries
DISTINCT elimination in IN, ANY, and EXISTS subqueries
IN/ANY subquery transformation