This topic shows how to measure the cost of a
Derby-style table
function.
The following formula describes how to estimate the value returned
by VTICosting.getEstimatedCostPerInstantiation():
C = I * A
where
- C = The estimated Cost for creating and running the
table function. That is, the value returned by
VTICosting.getEstimatedCostPerInstantiation().
In general, Cost is a measure of time in milliseconds.
- I = The optimizer's Imprecision. A measure of how skewed the optimizer's estimates tend
to be in your particular environment. See below for instructions on how to estimate this Imprecision.
- A = The Actual time in milliseconds which it takes
to create and run this table function.
Calculating the optimizer's imprecision
We treat optimizer Imprecision as a constant across the
runtime environment. The following formula describes it:
I = O / T
where
- O = The Optimizer's estimated cost for a plan.
- T = The Total runtime in milliseconds for the plan.
To estimate these values, turn on
Derby statistics collection
and run the following experiment several times, averaging the results:
- Select = Select all of the rows from a big table.
- Record = In the statistics output, look for the ResultSet
which represents the table scan. That scan has a field
labelled "optimizer estimated cost". That's O. Now
look for the fields in that ResultSet's statistics labelled
"constructor time", "open time", "next time", and "close time". Add up
all of those fields. That total is T.
For example:
MAXIMUMDISPLAYWIDTH 7000;
CALL SYSCS_UTIL.SYSCS_SET_RUNTIMESTATISTICS(1);
CALL SYSCS_UTIL.SYSCS_SET_STATISTICS_TIMING(1);
select * from T;
values SYSCS_UTIL.SYSCS_GET_RUNTIMESTATISTICS();
Calculating the actual runtime cost of a table function
The following formula explains how to compute the Actual
runtime cost for the table function:
A = ( P * N ) + E
where
- P = The runtime spent Per row (in milliseconds).
- N = The Number of rows in the table function.
- E = The time spent creating an Empty instance of the
table function which has no rows in it. Usually, P * N dwarfs
E. That is, the table function instantiation cost is very
small compared to the actual cost of looping through the
rows. However, for some table functions, E may be significant
and may dominate the table function's cost when N is small.
You may know that E is basically 0. If so, you can skip this step.
Otherwise, to estimate E, turn on
Derby statistics collection
and run the following experiment several times, averaging the results:
- Short-circuit = Short-circuit the next() method of the
ResultSet
returned by your Derby-style
table function so that it returns
false the first time it is called. This makes it
appear that the
ResultSet
has no rows.
- Select = Select all of the rows from the table function.
- Record = In the statistics output, look for the VTIResultSet
which represents the table function scan. Add up
the values of the fields in that VTIResultSet's statistics labelled
"constructor time", "open time", "next time", and "close time".
That total is E.
To estimate P, turn on
Derby statistics collection
and run the following experiment several times, averaging the results:
- Select = Select all of the rows from the table function.
- Record = In the statistics output, look for the VTIResultSet
which represents the table function scan. Add up
the values of the fields in that VTIResultSet's statistics labelled
"constructor time", "open time", "next time", and "close
time". Subtract E from the result. Now divide by the
value of the field "Rows seen".
The result is P.
Computing the value returned by getEstimatedCostPerInstantiation()
Putting all of this together, the following formula describes the
value returned by your
table function's
VTICosting.getEstimatedCostPerInstantiation()
method.
C = O/T * [ ( P * N ) + E ]