What is an index?

An index is a database structure that provides quick lookup of data in a column or columns of a table.

For example, a Flights table in a travelDB database has three indexes:

This means there are three separate structures that provide shortcuts into the Flights table. Let's look at one of those structures, OrigIndex.

OrigIndex stores every value in the orig_airport column, plus information on how to retrieve the entire corresponding row for each value.

For every row in Flights, there is an entry in OrigIndex that includes the value of the orig_airport column and the address of the row itself. The entries are stored in ascending order by the orig_airport values.

When an index includes more than one column, the first column is the main one by which the entries are ordered. For example, the index on (flight_id, segment_number) is ordered first by flight_id. If there is more than one flight_id of the same value, those entries are then ordered by segment_number. An excerpt from the entries in the index might look like this:

'AA1111' 1
'AA1111' 2
'AA1112' 1
'AA1113' 1
'AA1113' 2

Indexes are only sometimes helpful. This particular index is useful when a statement's WHERE clause is looking for rows for which the value of orig_airport is some specific value or range of values. SELECTs, UPDATEs, and DELETEs can all have WHERE clauses.

For example, OrigIndex is helpful for statements such as the following:

SELECT *
FROM Flights
WHERE orig_airport = 'SFO'

SELECT *
FROM Flights
WHERE orig_airport < 'BBB'

SELECT *
FROM Flights
WHERE orig_airport >= 'MMM'

DestIndex is helpful for statements such as the following:

SELECT *
FROM Flights
WHERE dest_airport = 'SCL'

The primary key index (on flight_id and segment_number) is helpful for statements such as the following:

SELECT *
FROM Flights
WHERE flight_id = 'AA1111'

SELECT *
FROM Flights
WHERE flight_id BETWEEN 'AA1111' AND 'AA1115'

SELECT *
FROM FlightAvailability AS fa, Flights AS fts
WHERE flight_date > CURRENT_DATE
AND fts.flight_id = fa.flight_id
AND fts.segment_number = fa.segment_number

What's optimizable? discusses why the indexes are helpful for these statements but not for others.

Related concepts
What's optimizable?
Covering indexes
Useful indexes can use qualifiers
When a table scan is better
Indexes have a cost for inserts, updates, and deletes