When Derby creates statistics for a table's index, it calculates and stores two values in the system tables: the number of rows in the table, and the number of unique values (the cardinality) for each leading set of columns in the index key.
For example, consider the primary key on the table FlightAvailability:
CONSTRAINT FLIGHTAVAILABILITY_PK PRIMARY KEY (FLIGHT_ID, SEGMENT_NUMBER, FLIGHT_DATE)
For this index, Derby keeps the following information:
- The number of rows in the FlightAvailability table
- The number of unique values for FLIGHT_ID
- The number of unique values for the combination of FLIGHT_ID and SEGMENT_NUMBER
- The number of unique values for the combination of FLIGHT_ID, SEGMENT_NUMBER, and FLIGHT_DATE
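These statistics are stored in the SYS.SYSSTATISTICS system table and can be refreshed with the SYSCS_UTIL.SYSCS_UPDATE_STATISTICS procedure. The JDBC snippet below is a minimal sketch of refreshing and inspecting them; the database name flightsDB, the APP schema, and the use of the embedded driver are assumptions for illustration, and the display format of the statistics object may vary by Derby version.

import java.sql.*;

public class ShowIndexStatistics {
    public static void main(String[] args) throws SQLException {
        // Placeholder connection URL; adjust to your own database location.
        try (Connection conn = DriverManager.getConnection("jdbc:derby:flightsDB")) {

            // Refresh the statistics for FLIGHTAVAILABILITY. Passing NULL as the
            // third argument updates the statistics for all indexes on the table.
            try (CallableStatement cs = conn.prepareCall(
                    "CALL SYSCS_UTIL.SYSCS_UPDATE_STATISTICS('APP', 'FLIGHTAVAILABILITY', NULL)")) {
                cs.execute();
            }

            // List the stored statistics rows; each row holds the cardinality
            // for one leading set of columns of an index key.
            try (Statement s = conn.createStatement();
                 ResultSet rs = s.executeQuery(
                         "SELECT COLCOUNT, \"STATISTICS\" FROM SYS.SYSSTATISTICS")) {
                while (rs.next()) {
                    System.out.println(rs.getInt(1) + " column(s): " + rs.getObject(2));
                }
            }
        }
    }
}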
How does Derby use these two numbers -- the number of rows in a table and the cardinality of a particular key -- to determine the selectivity of a query? Take this example:
SELECT * FROM Flights, FlightAvailability WHERE Flights.flight_id = FlightAvailability.flight_id
If the cardinality for flight_id in Flights is 250, the selectivity of the predicate is 1/250. The optimizer would estimate the number of rows read to be as follows:
((Rows in Flights) * (Rows in FlightAvailability)) / 250
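To make the arithmetic concrete, here is a small sketch of the same estimate with made-up row counts (the 1,000 and 10,000 figures are illustrative only, not taken from the sample database):

public class JoinEstimate {
    public static void main(String[] args) {
        // Illustrative row counts; the real values come from the system tables.
        long rowsInFlights = 1_000L;
        long rowsInFlightAvailability = 10_000L;
        long cardinalityOfFlightId = 250L;   // unique flight_id values in Flights

        // Selectivity of the equijoin predicate is 1 / cardinality.
        double selectivity = 1.0 / cardinalityOfFlightId;

        // Estimated rows = (rows in Flights * rows in FlightAvailability) * selectivity
        double estimatedRows = rowsInFlights * rowsInFlightAvailability * selectivity;

        // Prints 40000.0, i.e. (1,000 * 10,000) / 250
        System.out.println("Estimated rows read: " + estimatedRows);
    }
}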