apache > db
Apache DB Project
 
Font size:      

Estimations of Rows Scanned From Disk for an Index Scan

Estimations of Rows Scanned From Disk for an Index Scan

When an index is available, the optimizer has to estimate the number of rows that will be scanned from disk. The accuracy of this estimate depends on the type of query being optimized.

Queries with a Known Search Condition

When the exact start and stop conditions are known at compilation time, the optimizer uses the index itself to make a very precise estimate of the number of rows that will be scanned from disk. An example of a query with a known search condition:

SELECT *
FROM Flights
WHERE orig_airport = 'SFO'

The search value, 'SFO', is known. The optimizer will be able to make an accurate estimate of the cost of using the index orig_index.

In addition, if the index is unique, and the WHERE clause involves an = or IS NULL comparison to all the columns in the index, the optimizer knows that only a single row will be scanned from disk. For example:

-- there's a unique key on city_id
SELECT * FROM Cities WHERE city_id = 1

Queries with an Unknown Search Condition

Queries sometimes have an unknown search condition, such as in the case when the statement's WHERE clause involves dynamic parameters that are known only at execution time and not at compilation time, or when the statement involves a join. For example:

-- dynamic parameters 
 
SELECT *
FROM Flights
WHERE orig_airport = ?
 
rollback
 -- joins 
SELECT * FROM Countries, Cities
WHERE Countries.country_ISO_code = Cities.country_ISO_code
 -- complex search conditions 
SELECT * FROM Countries
WHERE region = (select region from Countries where country = 'Spain')

In the above SELECT statements, the optimizer cannot get enough useful information from the index about how many rows will be returned by a particular access path. However, it can often make a good guess by looking at a table's selectivity for a particular WHERE clause.

Selectivity refers to the fraction of rows that will be returned from the table for the particular WHERE clause. The optimizer multiplies the number of rows in the table by the selectivity for a particular operation. For example, if the selectivity for a particular search operation is .10, and the table contains 100 rows, the optimizer estimates that the operation will return 10 rows. (This is not exact; it is just a good guess.)


Previous Page
Next Page
Table of Contents
Index