apache > db
Apache DB Project
 
Font size:      

Analyzing the Information

Analyzing the Information

Statistics Timing

If you are using statistics timing, RUNTIMESTATISTICS provides information about how long each stage of the statement took. An SQL statement has two basic stages within Derby: compilation and execution. Compilation is the work done while the statement is prepared. Compilation is composed of the following stages: parsing, binding, optimization, and code generation. Execution is the actual evaluation of the statement.

Statement Execution Plan

RUNTIMESTATISTICS also provides information about the statement execution plan. The statement execution plan shows how long each node took to evaluate, how many rows were retrieved, whether an index was used, and so on. If an index was used, it shows the start and stop positions for the matching index scan. Looking at the plan can help you determine whether to add an index or to rewrite the query.

A statement execution plan is composed of a tree of result set nodes. A result set node represents the evaluation of one portion of the statement; it returns rows to a calling (or parent) node and can receive rows from a child node. A node can have one or more children. Starting from the top, if a node has children, it requests rows from the children. Usually only the execution plans of DML statements (queries, inserts, updates, and deletes, not dictionary object creation) are composed of more than one node.

For example, consider the following query:

SELECT * FROM Countries

This simple query involves one node only--reading all the data out of the Countries table. It involves a single node with no children. This result set node is called a Table Scan ResultSet. RUNTIMESTATISTICS text for this node looks something like this:

Statement Name:
        null
Statement Text:
        select * from countries
Parse Time: 20
Bind Time: 10
Optimize Time: 50
Generate Time: 20
Compile Time: 100
Execute Time: 10
Begin Compilation Timestamp : 2004-05-25 09:16:21.24
End Compilation Timestamp : 2004-05-25 09:16:21.34
Begin Execution Timestamp : 2004-05-25 09:16:21.35
End Execution Timestamp : 2004-05-25 09:16:21.4
Statement Execution Plan Text:
Table Scan ResultSet for COUNTRIES at read committed isolation
level using instntaneous share row 
locking chosen by the optimizer
Number of opens = 1
Rows seen = 114
Rows filtered = 0
Fetch Size = 16
        constructor time (milliseconds) = 0
        open time (milliseconds) = 0
        next time (milliseconds) = 10
        close time (milliseconds) = 0
        next time in milliseconds/row = 0
 
scan information:
        Bit set of columns fetched=All
        Number of columns fetched=3
        Number of pages visited=3
        Number of rows qualified=114
        Number of rows visited=114
        Scan type=heap
        start position:
null    stop position:
null    qualifiers:
None
        optimizer estimated row count:          119.00
        optimizer estimated cost:           69.35
 

Consider this second, more complex query:

 SELECT Country
FROM Countries
WHERE Region = 'Central America'

When executed, this query involves two nodes--one to retrieve qualifying rows (the restriction is done at this node) and one to project the requested columns. So, at bottom, there is a TableScanResultSet for scanning the table. The qualifier (Region = 'Central America') is evaluated in this node. These data are passed up to the parent node, called a Project-Restrict ResultSet, in which the rows are projected--only the country column is needed (the first column in the table). RUNTIMESTATISTICS text for these two nodes looks something like this:

Statement Name:
        null
Statement Text:
        SELECT Country FROM Countries WHERE Region = 'Central America'
Parse Time: 10
Bind Time: 0
Optimize Time: 370
Generate Time: 10
Compile Time: 390
Execute Time: 0
Begin Compilation Timestamp : 2004-05-25 09:20:41.274
End Compilation Timestamp : 2004-05-25 09:20:41.664
Begin Execution Timestamp : 2004-05-25 09:20:41.674
End Execution Timestamp : 2004-05-25 09:20:41.674
Statement Execution Plan Text:
Project-Restrict ResultSet (2):
Number of opens = 1
Rows seen = 6
Rows filtered = 0
restriction = false
projection = true
        constructor time (milliseconds) = 0
        open time (milliseconds) = 0
        next time (milliseconds) = 0
        close time (milliseconds) = 0
        restriction time (milliseconds) = 0
        projection time (milliseconds) = 0
        optimizer estimated row count:           11.90
        optimizer estimated cost:           69.35
 
Source result set:
        Table Scan ResultSet for COUNTRIES at read committed isolation level
using instantaneous share row 
locking chosen by the optimizer
        Number of opens = 1
        Rows seen = 6
        Rows filtered = 0
        Fetch Size = 16
                constructor time (milliseconds) = 0
                open time (milliseconds) = 10
                next time (milliseconds) = 0
                close time (milliseconds) = 0
                next time in milliseconds/row = 0
 
        scan information:
                Bit set of columns fetched={0, 2}
                Number of columns fetched=2
                Number of pages visited=3
                Number of rows qualified=6
                Number of rows visited=114
                Scan type=heap
                start position:
null            stop position:
null            qualifiers:
Column[0][0] Id: 2
Operator: =
Ordered nulls: false
Unknown return value: false
Negate comparison result: false
 
                optimizer estimated row count:           11.90
                optimizer estimated cost:           69.35
 

Other, more complex queries such as joins and unions have other types of result set nodes.

For inserts, updates, and deletes, rows flow out of the top, where they are inserted, updated, or deleted. For selects (queries), rows flow out of the top into a result set that is returned to the user.

The Derby Reference Manual shows the many possible ResultSet nodes that might appear in an execution plan.

In addition, read "DML statements and performance", for more information about some of the ways in which Derby executes statements.

Optimizer estimates

RUNTIMESTATISTICS show the optimizer estimates for a particular node. They show the optimizer's estimated row count and the optimizer's "estimated cost."

The estimated row count is the query optimizer's estimate of the number of qualifying rows for the table or index for the entire life of the query. If the table is the inner table of a join, the estimated row count will be for all the scans of the table, not just for a single scan of the table.

The estimated cost consists of a number, which is a relative number; it does not correspond directly to any time estimate. It is not, for example, the number of milliseconds or rows. Instead, the optimizer constructs this number for each possible access path. It compares the numbers and chooses the access path with the smallest number.


Previous Page
Next Page
Table of Contents
Index