Optimizer support for Derby-style table functions

This topic explains how to fine-tune the Derby optimizer's decision about where to place a table function in the join order.

By default, the Derby optimizer makes the following assumptions about a table function:

Expensive - It is expensive to create and loop through the rows of the table function. This makes it likely that the optimizer will place the table function in an outer slot of the join order so that it will not be looped through often.
Repeatable - The table function can be instantiated multiple times with the same results. This is probably true for most table functions. However, some table functions may open read-once streams. If the optimizer knows that a table function is repeatable, then the optimizer can place the table function in an inner slot where the function can be invoked multiple times. If a table function is not repeatable, then the optimizer must either place it in the outermost slot or invoke the function once and store its contents in a temporary table.
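
The effect of repeatability on join-slot placement can be illustrated with a small, self-contained sketch. It does not use any Derby API: the class JoinOrderSketch and its methods are hypothetical names chosen for illustration, and the nested-loop join is a simplified stand-in for what happens when a row source sits in an inner slot and is instantiated once per outer row.

```java
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

// Illustration only: a toy nested-loop join, not Derby's implementation.
final class JoinOrderSketch {

    // Instantiates the inner source once per outer row and counts equal pairs.
    static int countMatches(List<Integer> outer, Supplier<Iterator<Integer>> inner) {
        int matches = 0;
        for (int outerValue : outer) {
            Iterator<Integer> rows = inner.get(); // one "instantiation" per outer row
            while (rows.hasNext()) {
                if (rows.next() == outerValue) {
                    matches++;
                }
            }
        }
        return matches;
    }

    // Repeatable source: every instantiation yields the same rows again.
    static Supplier<Iterator<Integer>> repeatable(List<Integer> rows) {
        return rows::iterator;
    }

    // Read-once source: every instantiation hands back the same,
    // possibly already exhausted, iterator.
    static Supplier<Iterator<Integer>> readOnce(List<Integer> rows) {
        Iterator<Integer> shared = rows.iterator();
        return () -> shared;
    }
}
```

Joining the rows 1, 2, 3 against themselves finds three matches with the repeatable source but only one with the read-once source, because the stream is exhausted after the first outer row. This is why a non-repeatable table function has to go in the outermost slot or be materialized in a temporary table.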

You can override these default assumptions by giving the optimizer more information. To do this, the table function must meet the following requirements:

No-arg constructor - The table function's class must have a public constructor whose signature has no arguments.
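VTICosting - The class must also implement org.apache.derby.vti.VTICosting. This involves implementing the following methods as described in Measuring the cost of Derby-style table functions and Example VTICosting implementation. Each method takes a single org.apache.derby.vti.VTIEnvironment argument, which is omitted from the method names below for brevity: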
- getEstimatedCostPerInstantiation() - This method estimates the cost of invoking the table function and looping through its rows. The returned value adds together two estimates:
  - Empty table - This is the cost of invoking the table function, even if it contains 0 rows. See the description of variable E in Measuring the cost of Derby-style table functions.
  - Scanning - This is the cost of looping through all of the rows returned by the table function. See the calculation of P*N in Measuring the cost of Derby-style table functions.
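- getEstimatedRowCount() - This method estimates the number of rows returned by one invocation of the table function.
- supportsMultipleInstantiations() - This method returns true if the table function can be invoked multiple times with the same results. It returns false if the table function returns different results when invoked more than once, for example because it reads a read-once stream.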
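
The three methods above can be sketched as follows. This is a minimal illustration rather than a working Derby class: CostingSketch is a hypothetical stand-in interface so that the example compiles without Derby on the classpath, and the values chosen for E, P, and N are invented. A real table function class would implement org.apache.derby.vti.VTICosting instead, whose methods each take a VTIEnvironment argument and may throw SQLException.

```java
// Hypothetical stand-in for org.apache.derby.vti.VTICosting. The real
// interface methods take a VTIEnvironment argument and may throw SQLException.
interface CostingSketch {
    double getEstimatedRowCount();
    double getEstimatedCostPerInstantiation();
    boolean supportsMultipleInstantiations();
}

// Costing for an imagined table function that reads a read-once feed.
final class ReadOnceFeedCosting implements CostingSketch {

    // Assumed measurements, invented for this example.
    static final double EMPTY_TABLE_COST = 2000.0; // E: cost of an invocation that returns 0 rows
    static final double COST_PER_ROW = 0.5;        // P: cost of looping through one row
    static final double ESTIMATED_ROWS = 5000.0;   // N: expected number of rows

    // Public no-arg constructor, as the optimizer requires.
    public ReadOnceFeedCosting() {
    }

    @Override
    public double getEstimatedRowCount() {
        return ESTIMATED_ROWS;
    }

    // Empty-table cost plus scanning cost: E + P*N.
    @Override
    public double getEstimatedCostPerInstantiation() {
        return EMPTY_TABLE_COST + COST_PER_ROW * ESTIMATED_ROWS;
    }

    // The feed can be read only once, so a second invocation would differ.
    @Override
    public boolean supportsMultipleInstantiations() {
        return false;
    }
}
```

With these assumed figures the estimated cost per instantiation is 2000 + 0.5 * 5000 = 4500. Returning false from supportsMultipleInstantiations() tells the optimizer not to place the function in an inner slot where it would be invoked more than once.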

For more information, see: