SQLBase Advanced Topics Guide : Overview of the SQLBase Query Optimizer

Overview of the SQLBase Query Optimizer
In the next three chapters we examine the query optimizer contained within SQLBase. The SQLBase query optimizer (often simply called the optimizer) is a specific implementation of a cost-based optimizer, meaning its decisions on access path selections are based on the estimates of potential costs associated with executing a particular access plan.
These cost estimates are based on the statistics contained within the SQLBase catalog and control areas of the database. The advantage of this approach is that SQLBase makes its access plan decisions using very recent information about the actual database contents. Just how recent and accurate this information is depends on the procedures you follow in administering your databases.
This chapter reviews the basic relational operators implemented by SQL followed by an overview of the techniques that SQLBase may use when actually executing these operations. This set of techniques, called physical operators, is the set of actions the optimizer considers when building access plans.
While we summarize the actions taken to perform each physical operator, we will also briefly summarize the major factors comprising the cost of the operator (which is usually I/O). Finally, the structure of an access plan is explained, with an emphasis on the heuristics used by the optimizer when building a set of access plans on which to perform cost analysis.
Throughout the next three chapters, examples that illustrate optimizer operations generally use the SQL SELECT statement. You should keep in mind, however, that all SQL Data Manipulation Statements (DML) are processed by the optimizer. This includes not just the SELECT statement, but the INSERT, UPDATE, and DELETE statements as well.
The steps involved in optimizing these other statements are virtually identical to those of the SELECT, the only difference being the operations used to perform locking and logging during updates. These exceptions are covered in the Concurrency Control and Logging and Recovery chapters. The benefit of using the SELECT statement for examples is that it offers many more relational operations, such as the join, which exercise a much wider variety of optimizer processes than the other DML statements.
A secondary advantage of using SELECT statements for examples is that the execution of a SELECT statement creates a result set, which visually shows the final output of SQLBase’s processing. This contrasts with the other DML statements, which change the state of the database but whose changes are not visible externally except through the return of error codes or other environment variables. Only a subsequent SELECT statement will reveal the changes made to the contents of the database by one of these DML commands.
Relational operations
The basic operators of the relational algebra were defined by E. F. Codd when he published his landmark paper delineating the relational model access language in 1972. The operations defined in this relational algebra form the logical operators that must be processed by the optimizer. These logical operators are implicit in the syntax of SQL DML statements submitted to SQLBase for processing.
The following relational algebra operations have one or more relations as inputs, and one relation as output. A relation is simply a table of some degree n, where n is the number of attributes (or columns) in the table. These operations are broken down into two classes: traditional set operations and special relational operators. While the traditional set operations are sufficient to process most multiple relation queries, the special relational operators specify the desired results more efficiently. This last class of operators is the basis of the SQL language, and is examined in more detail throughout the remainder of this chapter. The traditional set operations are defined here to provide background information.
An important distinction of the traditional set operators is that (with the exception of the Cartesian product) the relations used as input to these operations must have the same number of columns, with each matching pair of columns defined on the same domain. This means that each table must have the same number of columns, and that the columns in any given position (for example, column 1 of each table) must be defined with exactly the same data type and scale.
The union of two relations consists of all rows that belong to either or both relations. It can be formed by appending one table to the other and eliminating duplicate rows.
The intersection of two relations consists of the rows which appear in both relations. Any row which appears in only one table, but not the other, is not a member of the intersection set.
The difference between two relations is all rows which belong to the first table but not to the second. Note that this operation is not commutative like the other traditional set operators, meaning that A − B ≠ B − A.
The Cartesian product of two relations is the table resulting from concatenating each row of one table with every row of the other table. The table resulting from this operation contains as many rows as the product of the rows in the input tables. This means that the Cartesian product of a 15 row table and a 50 row table contains 750 rows. As mentioned previously, this is the only traditional set operation where the format of the two tables can differ.
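The row-count arithmetic can be checked with a short sketch (the table contents are hypothetical):

```python
import itertools

# Two hypothetical input tables, represented as lists of rows.
table_a = [("row_a", i) for i in range(15)]   # 15 rows
table_b = [("row_b", j) for j in range(50)]   # 50 rows

# The Cartesian product concatenates each row of one table
# with every row of the other.
product = [a + b for a, b in itertools.product(table_a, table_b)]

print(len(product))  # 15 * 50 = 750
```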
SQL relies on the following special relational operators more heavily than on the traditional set operations:
The projection operation limits the columns of a table that are referenced in a SQL statement.
A select limits the result set to only those rows that meet the specified qualifications. The selection criteria compare one or more columns of the table to constants, sets of constants, or expressions.
The join operation creates a result set by concatenating rows together from multiple tables. These concatenated rows must also meet the specified join criteria, which is a comparison between one or more columns of the tables involved. These comparisons differ from the select operation in that they compare columns of different tables.
Because of the central role played by these special relational operators in SQL statements, we will now examine them in more detail.
When a SELECT references specific columns of one or more tables, only those columns are returned:
The alternative to performing a projection is to use the asterisk (*) wild card notation which causes all columns to be included:
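Both forms can be sketched with SQLite (the EMPLOYEE table and its contents are hypothetical):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE EMPLOYEE (EMP_NO INTEGER, NAME TEXT, SALARY REAL)")
con.executemany("INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
                [(1, "DAVE", 50000.0), (2, "SUE", 60000.0)])

# Projection: only the named column appears in the result set.
names = con.execute("SELECT NAME FROM EMPLOYEE").fetchall()
print(sorted(names))  # [('DAVE',), ('SUE',)]

# Wildcard: every column of the table is returned.
all_cols = con.execute("SELECT * FROM EMPLOYEE").fetchall()
print(len(all_cols[0]))  # 3 columns per row
```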
Note that many sites have standards prohibiting this notation in code, since changing the number of columns in the table can have an adverse effect on the program containing the statement.
The select is the row equivalent of the column projection operation. In SQL, the WHERE clause specifies a selection by naming columns in comparison expressions with either constants (a single value or a set of values) or expressions that evaluate to a single value or a set of values. The following are examples of the select operation:
(TOTAL_AMT > 1000);
Each of the comparisons contained within the WHERE clause is called a predicate. Some SQL statements, such as the last example shown here, contain more than one predicate. When the operation specified in a predicate is performed on a row, it is called applying the predicate. If the predicate evaluates to TRUE, it is said to be satisfied, otherwise it is not satisfied. When a statement has more than one predicate, they must be connected by one of the Boolean operators AND or OR. When all of the predicates of a statement are connected by AND, they are said to be AND-connected, or conjunctive. When they are all connected by OR, they are OR-connected, or disjunctive. This terminology of predicates plays an important role in how they are used by the query optimizer, as we shall see later.
Note that the form of the WHERE clause that compares columns of two different tables is not a select operation, but rather a specification for a join operation, which is covered next.
A join creates its result set from two or more tables in a similar manner as the Cartesian product mentioned earlier; it concatenates each row of one table with every row of the other table. The difference between the Cartesian product and the join is that only rows meeting some specified join criteria are eligible for inclusion in the result set. This is logically equivalent to building the Cartesian product, then performing a select operation against it. However, the join allows for a more efficient operation with most query optimizers because they can eliminate rows from the result set without first forming the entire Cartesian product as an intermediate result.
To perform a join, use the SQL FROM and WHERE clauses. The FROM clause must name each input table to the join. The WHERE clause specifies the join criteria, by comparing one or more columns of one table with columns of another table. The predicates used as join criteria determine how many rows of the Cartesian product will be bypassed in processing the join. The more restrictive the join criteria are, the more efficient the join will be when compared to the Cartesian product (note that the Cartesian product is formed by not specifying any join criteria at all). Here are some examples of joins:
In the second example, the ‘C’, ‘O’, and ‘P’ letters allow you to qualify column names with their correct tables without having to spell out the entire table name. In SQL, these are called correlation variables. They are especially common in joins because the FROM clause names multiple tables, which means most column references must be qualified.
Two important characteristics of the join operation are that it is both commutative and associative in nature:
A JOIN B = B JOIN A (commutative)
A JOIN (B JOIN C) = (A JOIN B) JOIN C (associative)
(A JOIN B) JOIN C = B JOIN (A JOIN C) (both)
The effect of these properties is that when processing a join of more than two tables, the optimizer may select any sequence of two-table joins to complete the operation. This allows the optimizer to treat a join of any number of tables as a series of two-table joins, avoiding the need for special physical operators that join any specific number of tables.
The type of comparison performed is often used to further refine the name of the join. These sub-classifications of the join are briefly covered below.
The most common example of the join is the equijoin, which uses the ‘=’ operator. This version of the join operation is used whenever business relationships are used to combine information from two tables. Such joins typically equate the primary key of the parent table with the foreign key of the child table.
Natural join
This is a particular form of equijoin where all columns of the two tables are included, with the exception of one set of the key columns, which are duplicated across the tables. This is a full equijoin between two tables without any projection except for the elimination of the duplication of the key columns.
The semijoin operation is equivalent to an equijoin with a subsequent projection that only includes the columns of one of the join tables. This is useful when rows of a table are needed that meet a selection criteria that is defined in terms of membership within another table’s foreign key column. An example would be the set of all products which have been ordered during January of 1993:
(O.ORD_DATE BETWEEN JAN-1-1993 AND JAN-31-1993);
The semijoin is also useful in a distributed environment when the two tables to be joined are at different nodes on the network.
The outerjoin allows the non-participating rows of one of the joined tables to be included in the result set. The reason these rows are termed non-participating is because they contain key values which are not referenced by any rows of the other table. For example, a row in the product set which contains a product number that has never been ordered by any customer would be a non-participating row in a join of the product table with the order table. In SQLBase, these rows could be included in the result set anyway through the specification of the outerjoin operator, ‘+’, on the order table key as in the following example:
A selfjoin is an equijoin of a table with itself. This is also called a ‘recursive join’. Note that in this case, you must assign correlation variables in order to avoid ambiguous column references, as in the following example which lists the names of all employees and their assigned managers:
While aggregation was not originally specified as a relational operation, its inclusion as a standardized capability in SQL served to make it a commonplace operation. The purpose of aggregation is to represent a table of information through some derived statistic, such as the sum or mean (average) of a set of numbers.
SQL supports two types of aggregation. The simpler is scalar aggregation, which derives a single output value from a column of a relation. Scalar aggregation in SQL is performed by including an aggregate function in a query that has no GROUP BY clause. Examples of scalar aggregation include:
The first query produces the total salary paid to all employees, while the second query tells how many employees are named ‘DAVE’. The distinguishing characteristic of these queries is the presence of some aggregate function in the SELECT list which is allowed to range over the entire result set of the table access (after performing the project and select operations).
The second type of aggregation supported by SQL is function aggregation. This type of aggregation uses the same aggregate functions as scalar aggregation. The difference between the two is that function aggregation always creates another table as output. This contrasts with scalar aggregation, which simply returns a single value as output.
The SQL syntax of a statement performing function aggregation always includes the GROUP BY clause. The columns named in the GROUP BY clause become, in effect, the key to the new table that is output from the statement. Examples of function aggregation are:
Here the first query lists the department and average salary for each department, while the second lists the birth month and number of employee birthdays in each month throughout the year. In each of these statements, the aggregation function only ranges over the rows of the input table that have equal values in the columns specified in the GROUP BY clause.
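Both kinds of aggregation can be sketched with SQLite (the EMP table and its contents are hypothetical):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE EMP (DEPT TEXT, SALARY REAL)")
con.executemany("INSERT INTO EMP VALUES (?, ?)",
                [("SALES", 100.0), ("SALES", 200.0), ("ENG", 300.0)])

# Scalar aggregation: no GROUP BY, a single value is returned.
total = con.execute("SELECT SUM(SALARY) FROM EMP").fetchone()[0]
print(total)  # 600.0

# Function aggregation: GROUP BY produces a table keyed by DEPT.
by_dept = sorted(con.execute(
    "SELECT DEPT, AVG(SALARY) FROM EMP GROUP BY DEPT").fetchall())
print(by_dept)  # [('ENG', 300.0), ('SALES', 150.0)]
```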
SQLBase physical operators
When SQLBase executes a query plan, it reads the plan and performs each specified step in turn. Each step tells SQLBase what operation to perform, and what inputs and outputs are required. When executing query plans, SQLBase uses physical operators. These are different from logical operators, such as SQL statements, which specify relational operations that must be performed. For every possible logical operator, there is at least one, and possibly many, physical operators that allow SQLBase to execute the operation in an efficient manner.
Every physical operator has either one or two inputs, depending on the nature of the operation. It also has one output table (which of course may consist of only a single row, or even no rows). This output table may be a result set which is the final output of the query, or it may be an intermediate table which will be used as an input table to an operation later in the query plan. One query plan can contain many of these intermediate tables. They can be stored within SQLBase’s cache memory, or on disk in a temporary file. When a step in the query plan is finished executing, the temporary input tables it uses are either freed from memory or deleted from disk.
The sort operation is performed whenever a table must be placed into a specific sequence according to one or more columns, which are collectively called the sort key. This physical operator has one table as input, and one table as output. The basic technique SQLBase uses for sorting is the postman sort, which is a radix sort.
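A minimal sketch of a least-significant-digit radix (distribution) sort, illustrative only and not SQLBase's actual implementation:

```python
def radix_sort(keys, base=10):
    """LSD radix (distribution) sort for non-negative integer keys."""
    if not keys:
        return keys
    max_key = max(keys)
    digit = 1
    while digit <= max_key:
        # Distribute keys into buckets by the current digit,
        # preserving the order within each bucket (stability).
        buckets = [[] for _ in range(base)]
        for k in keys:
            buckets[(k // digit) % base].append(k)
        keys = [k for bucket in buckets for k in bucket]
        digit *= base
    return keys

print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
# [2, 24, 45, 66, 75, 90, 170, 802]
```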
The aggregation operator also has one table as input, and one table as output. The SQLBase physical aggregation operator performs a serial scan of the qualifying data rows, with the aggregate function calculation being performed for each row. When SQLBase performs simple scalar aggregation, the input rows can be in any sequence. However, when aggregate functions are being evaluated through SQL statements containing GROUP BY clauses, the input table to the aggregation operation must be in sequence by the columns specified in the GROUP BY. This prerequisite is satisfied by the query optimizer through the placement of either a sort operator prior to the aggregation, or through the use of a table access technique which returns table rows in a specific sequence (discussed later in this chapter).
Disk access operation
This group of physical operators is responsible for returning rows from a single table. These rows can then either be processed by other steps in the query plan, or constitute the final result set, depending on the SQL statement being processed. Disk access operators are entirely responsible for the access of data from disk, however. Therefore, one of these operators always appears as the first step of a query plan, in order to perform the access of data to whatever table the optimizer decided should be processed first.
Table scan
This is the simplest technique for accessing a base table. Each data page associated with the table is read in succession. For each page, the rows on that page are extracted for processing. Note that this extraction may require accessing additional pages in the form of either extent pages or overflow pages (in the case of hashed tables). In the absence of any extent or overflow pages, the number of disk accesses required for a table scan is equal to the number of data pages assigned to the table, including pages required for any LONG VARCHAR columns that may be specified.
Index leaf scan
This access technique is used whenever a specific sequence is desired for the rows of a table, and an index matches this sequence, but there are no predicates specified for the table that are present in the index. In other words, this access technique can substitute for a sort if the index sequence matches the desired table sequence.
The basis for this technique is the B+Tree modification to the basic B-Tree, which links together the leaf pages of an index. This list of linked leaf pages is called the sequence set of the index. It is traversed in key sequence, with each index node’s rows accessed as the node is processed. This means that the worst case I/O count is the number of leaf pages in the index plus the number of rows in the table. This is sometimes improved on, however, by the table being either fully or partially clustered in the index sequence. We will see how the optimizer accounts for this possibility in the next chapter.
Matching index scan
The matching index scan uses the full capabilities of SQLBase’s B+Tree index structures. This technique is useful whenever a SQL statement only requires a portion of a table to be processed, based on predicates that match the columns present in the index. Since the generic capabilities of B+Tree indexes are exploited by SQLBase, this access technique can use any index which has predicate columns in the leftmost key positions. The optimizer generally uses this technique if the predicates are restrictive enough to make the extra I/Os for the index worthwhile. Also, if an inequality predicate is used for some index column, this technique uses the index tree to find the starting leaf, and then scans the sequence set forward or backward from that point.
The number of I/Os incurred by this access technique is one for each level in the index, plus one for each index node that is accessed, plus one for every row that must be retrieved. The optimizer estimates costs associated with this access technique using an index statistic called the selectivity factor, which is explained in detail in the next chapter. Briefly, the selectivity factor describes the expected number of rows in the table that would satisfy a predicate.
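As a rough illustration (the function and figures below are hypothetical, not SQLBase's actual cost model), the I/O estimate described above can be computed as:

```python
def matching_index_scan_cost(index_levels, leaf_pages_touched,
                             table_rows, selectivity):
    """Illustrative I/O estimate: one I/O per index level, per leaf
    page accessed, and per qualifying row fetched."""
    qualifying_rows = int(table_rows * selectivity)
    return index_levels + leaf_pages_touched + qualifying_rows

# A hypothetical 3-level index over a 10,000-row table, where the
# predicate is expected to qualify 1% of the rows.
print(matching_index_scan_cost(3, 5, 10_000, 0.01))  # 3 + 5 + 100 = 108
```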
Hash access
A table which has a clustered hashed index placed on it may be accessed via this index whenever all of the columns contained within the index are used in equality predicates in an SQL statement. As explained in Chapter 10, Hashed Indexes, the hashing technique lacks any generic or scanning capabilities. This is why the hashed index key must be fully specified with the equality predicate in the SQL statement.
Whenever the restrictive conditions placed on the use of the hash access technique are met, this is probably the quickest possible alternative for table access and will likely be selected by the optimizer. The I/O cost of this technique is one I/O for each row, or set of rows, which has the key specified in the predicates. Extra I/O costs may be incurred for searching a chain of overflow pages if the hashed table has experienced a large number of collisions, and this condition may cause the optimizer to reject the hash access technique.
Join operations
The relational join is one of the most frequently executed operations of any relational database. It can also be extremely time consuming due to its data intensive nature, because the most simplistic implementation of a join requires every row of one table to be compared to every row of another. In order to execute joins as quickly and efficiently as possible, SQLBase selects from among a number of alternative techniques. All of these operators accept two tables as input, and produce a single table as output. These input and output tables can be any combination of database tables, temporary tables, or final result sets.
Terminology that is common to all join algorithms is the naming of the two tables involved. Since one row of a table must be compared to all rows of another table, the table for which a row is held is called the outer table. The table which is scanned while a row is held from the outer table is called the inner table. These terms first originated with the loop join algorithm, where they are especially appropriate, as we shall see next.
Simple loop join
This is the most straightforward method for joining two tables, and can be used for any join criteria. It can also be the worst performing join operation, especially if both input tables are large. One advantage of the simple loop join is that it can be used in any case, regardless of the type of join criteria specified or the absence of index structures. Also, there are cases where the simple loop join can be the best performing technique, particularly if one of the input tables can fit entirely into cache memory.
The loop join algorithm obtains a row from the outer table, then scans the entire inner table to find the rows which meet the join criteria. Whenever the criteria are met, the rows are appended and output. When the end of the inner table is reached, a new row is obtained from the outer table, and the scan of the inner table is started again.
Due to the nested loop structure of this technique, the worst case for the number of I/O operations is the number of data pages in the inner table multiplied by the number of data pages in the outer table. This can be improved on dramatically if the inner table is small enough to fit into cache memory, in which case the number of I/Os drops to the sum of the data pages in the two tables. For this reason, when cache is sufficient to hold a table, SQLBase always chooses the smaller of the two tables as the inner table, at which point this join technique may become less costly than the alternatives.
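The algorithm can be sketched as follows (the table contents and match function are hypothetical):

```python
def simple_loop_join(outer, inner, match):
    """Nested-loop join: for each outer row, scan the entire inner table."""
    result = []
    for o in outer:          # one pass over the outer table
        for i in inner:      # full scan of the inner table per outer row
            if match(o, i):
                result.append(o + i)   # concatenate matching rows
    return result

orders = [(1, "C10"), (2, "C20")]             # (order_no, cust_no)
customers = [("C10", "ACME"), ("C20", "ZED")]  # (cust_no, name)
rows = simple_loop_join(orders, customers, lambda o, c: o[1] == c[0])
print(rows)  # [(1, 'C10', 'C10', 'ACME'), (2, 'C20', 'C20', 'ZED')]
```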
The following two alternative access methods are enhancements to the basic loop join. They could be used whenever a suitable index is present, in which case they can speed up the join process greatly over the simple loop join.
Loop join with index
This variation on the nested loop join can be used when there is an index on one of the tables that has the join columns in the high order positions of the key. When this is the case, the optimizer uses the table with the index as the inner table of the loop algorithm, and uses the index to limit the search for rows matching the current row being held from the outer table. This greatly decreases the number of I/Os needed to process the inner table. The best case is when the inner table is clustered and an equijoin is being performed, in which case the number of I/Os is equal to the sum of the number of data pages in the two tables, plus the height of the index tree, plus the number of rows in the outer table. The last two cost components account for the overhead of accessing the index structure when joining row occurrences.
Loop join with hash index
This enhancement of the loop join technique can be applied when one of the tables to be joined has a hash index placed on it. The other two prerequisites for this technique are that the hash index must be placed on all of the columns of the join criteria (and no others), and the join must be an equijoin. When these conditions are met, this join technique can be very efficient. The worst case is when the number of I/Os is equal to the number of data pages in the outer table plus the number of rows in the inner table. This only occurs if every row in the inner table is referenced by the outer table. When not all rows of the inner table are referenced in the outer table, there is a reduction in the number of I/Os.
Index merge join
This technique can be used when there are indexes present on the join criteria for both tables. The basic algorithm is to scan the sequence sets of the two indexes, and merge rows from the tables as their join criteria are satisfied. For instance, when processing an equijoin, first an entry is read from the sequence set of the outer table. Then, an entry is read from the inner table’s sequence set. If the inner table’s join column is less than the outer table’s, the scan of the inner table’s sequence set continues. If the inner table’s join column is greater than the outer table’s, the outer table’s sequence set is scanned until either equal or greater values are found. When the join columns in the two indexes satisfy the join criteria, the rows of the tables are read from the data pages, appended to each other, and written to the output table.
The I/O cost of this technique depends on how many rows of the tables meet the join criteria. The best case is an equijoin of primary keys, where the cost is equal to the number of leaf nodes in the two sequence sets, plus the number of rows in the two tables. If the tables are also clustered by the sequence sets, the number of I/Os to fetch rows is reduced to the number of data pages allocated to the two tables plus the sum of leaf pages in the two indexes.
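A minimal sketch of the merge step, assuming both inputs are already sorted on the join key and keys are unique (a simplification; real sequence sets also handle duplicate keys):

```python
def merge_join(outer, inner):
    """Merge equijoin of two inputs sorted on their join key
    (the first field); assumes unique keys for simplicity."""
    result = []
    i = j = 0
    while i < len(outer) and j < len(inner):
        if outer[i][0] < inner[j][0]:
            i += 1                      # advance the outer scan
        elif outer[i][0] > inner[j][0]:
            j += 1                      # advance the inner scan
        else:
            result.append(outer[i] + inner[j])
            i += 1
            j += 1
    return result

a = [(1, "a1"), (2, "a2"), (4, "a4")]
b = [(2, "b2"), (3, "b3"), (4, "b4")]
print(merge_join(a, b))  # [(2, 'a2', 2, 'b2'), (4, 'a4', 4, 'b4')]
```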
Hash join
This technique can be used for equijoins only, and does not require any indexes on the join criteria for the two tables. The algorithm first performs the setup phase, when it scans the outer table and places each row into a hash table at a location determined by applying a hash function to the join criteria. The smaller table is always chosen as the outer table, in order to minimize the memory demands of the hash table.
In the second phase, called the probe, the inner table is scanned and the same hash function used in the setup is applied to the join columns in order to locate a possible match. Any rows found in the hash bucket from the first table are then examined. The rows that satisfy the join criteria are appended to the inner table row and written to the output table.
This can be a very rapid join technique, especially when the smaller table can be hashed into a table fitting in memory. When this is the case, the I/O required is the sum of the data pages of the two tables. Otherwise, additional I/O is required to partition the hash table into segments, and manage those segments in a temporary file.
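The setup and probe phases can be sketched as follows (the table contents are hypothetical):

```python
from collections import defaultdict

def hash_join(outer, inner, outer_key, inner_key):
    """Hash equijoin: build a hash table on the (smaller) outer table,
    then probe it with each inner row."""
    # Setup phase: hash every outer row on its join column.
    buckets = defaultdict(list)
    for o in outer:
        buckets[outer_key(o)].append(o)
    # Probe phase: look up each inner row's join column.
    result = []
    for i in inner:
        for o in buckets.get(inner_key(i), []):
            result.append(o + i)
    return result

parts = [("P1", "bolt"), ("P2", "nut")]               # (part_no, name)
orders = [(100, "P1"), (101, "P1"), (102, "P2")]      # (order_no, part_no)
rows = hash_join(parts, orders, lambda p: p[0], lambda o: o[1])
print(len(rows))  # 3 matching row pairs
```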
Imbedded operators
The two most common relational operators, select and project, are implemented within the other physical operators. This allows the optimizer to perform these operations at the earliest possible time in an access plan, which immediately reduces the amount of data that the remainder of the plan must handle. The implementation of the project operator is further enhanced by performing index-only access when appropriate, as explained in the following paragraphs.
The select operation filters out rows that do not satisfy the predicate logic of the query. These predicates are evaluated as early as possible when executing the query, which is usually the first time the row containing the columns named in the predicates is accessed. Each file access physical operator in SQLBase accepts predicates to perform this row filtering when accessing a table. If the predicates in the query are conjunctive, or AND-connected, a form of logic called short-circuit testing may be performed. This technique allows a row to be discarded as soon as the first unsatisfied predicate is found.
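Short-circuit testing of AND-connected predicates can be sketched as follows (the row and predicates are hypothetical):

```python
def satisfies(row, predicates):
    """Apply AND-connected predicates with short-circuit testing:
    the row is discarded at the first predicate that fails."""
    return all(pred(row) for pred in predicates)  # all() stops early

row = {"STATE": "CA", "TOTAL_AMT": 250}
predicates = [
    lambda r: r["STATE"] == "CA",      # satisfied
    lambda r: r["TOTAL_AMT"] > 1000,   # fails: row discarded here
]
print(satisfies(row, predicates))  # False
```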
When the SQLBase optimizer estimates the cost of a query, the effect of the predicates used for the select operation is taken into account through a statistic called the selectivity factor. This figure represents the proportion of rows in a table that would be expected to satisfy a particular predicate. The optimizer makes its estimate of selectivity factor based on statistics associated with the table’s indexes and certain heuristics concerning the predicate’s comparison operator. We will examine this process in more detail in the next chapter.
The project operation is implemented within other physical operators by passing the list of columns to be retrieved to any file access operator. That operator then only includes those columns in its output table. Performing the project at the earliest possible time in the query plan reduces the width of intermediate tables and result sets, thereby saving space both in memory and temporary disk files. Note that one side effect of performing the project at page access time is a possible reduction in I/Os due to avoiding access to extent or LONG VARCHAR pages which do not contain needed columns.
Index-only table access
Another benefit of building the project operation into the other physical operators is the ability to perform index-only access. This occurs whenever the columns specified for projection in the query are all present in one of the table’s indexes. That index must also perform whatever predicate logic is required for the select operation. When this happens, SQLBase’s optimizer realizes that the query’s data requirements can be satisfied without access to the table’s data pages. This can result in significant I/O savings, depending on the number of rows which satisfy the predicates of the query.
Query plan structure
When the optimizer groups a set of physical operators together to execute a SQL statement, the resulting specification is called a query plan.
An analogy to the programming task makes the role of the query plan easier to understand. A programmer codes instructions in some high level computer language, such as C, which is then compiled to produce an executable form of the programmer’s code. This executable form uses machine instructions to talk to the CPU that runs the program. When the CPU executes the program, it processes machine instructions rather than the programmer’s original code.
Similarly, a programmer codes SQL statements, which are then processed by the optimizer to produce the executable form of the programmer’s SQL, which is the query plan. When SQLBase executes the SQL statement, it actually processes the query plan, not the original SQL. Just as the C code has been translated into a more rapidly executable form by the C compiler, the SQL statements are converted into a form which the SQLBase engine can process quickly and efficiently.
Query trees
One way to view a query that provides for easier understanding is called the query tree. This is a tree structure, where the final result of the query is at the top of the tree (the root), and the database tables involved in the query are at the leaves. The intermediate nodes of the tree, between the leaves and the root, represent physical operations that are performed during the query execution. When the query plan is executed, the sequence of node traversal is from bottom to top, and from left to right. This is the sequence in which the operations are printed when the ‘SET PLANONLY’ option of SQLTalk is executed to print out a query plan for a given SQL statement.
The following example shows a query tree for a SQL statement that gets the names of authors who published books in 1993:
Building query trees
When the optimizer builds query trees for cost estimation and possible final selection, it uses some simple rules that determine the placement of various nodes in the tree. These rules allow the optimizer to limit the number of query trees that are built for further consideration by the cost estimation task. By limiting the possible access plan alternatives in this way, the optimization process can execute more efficiently.
One of the important heuristics is to apply restrictions such as the select and project operators as early as possible. This is because select and project operations can reduce the size of the result table. In figure B, the optimizer improves the query tree of Figure A by applying a select first to reduce the number of rows that are input to the equijoin operation.
Furthermore, when multiple select operations are performed against a table in a query, the ones with the lowest selectivity factor are performed lower in the tree, in order to reduce the size of the intermediate results as rapidly as possible. Of course, when a predicate can be evaluated depends on when all of the data it references is available. SQLBase converts some predicates in order to evaluate them as rapidly as possible, as we shall now see.
Predicate logic conversion
Whenever multiple predicates are specified using the relational select operation in a query, an important heuristic is to convert them into an equivalent condition in conjunctive form, which is a regrouping created by ANDing a set of predicates where each predicate contains only OR operators. For instance, the clause:
WHERE COL1>5 OR (COL2<500 AND COL3>150)
can be rewritten as:
WHERE (COL1>5 OR COL2<500) AND (COL1>5 OR COL3>150)
The conjunctive form is preferable since it can be evaluated with short-circuit techniques. A set of predicates in conjunctive form evaluates to TRUE only if every OR grouping evaluates to TRUE. It evaluates to FALSE if any grouping evaluates to FALSE. In the short-circuit technique, if the result of evaluating any single OR grouping is FALSE, there is no need to evaluate other conditions at all. Since the number of comparison operations that are performed in a query have a direct impact on the amount of CPU time needed, the short-circuit evaluation of predicates in conjunctive form can save CPU cycles.
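The equivalence of the two forms can be checked over a small grid of sample values:

```python
import itertools

def disjunctive(c1, c2, c3):
    # Original clause: COL1>5 OR (COL2<500 AND COL3>150)
    return c1 > 5 or (c2 < 500 and c3 > 150)

def conjunctive(c1, c2, c3):
    # Rewritten clause: (COL1>5 OR COL2<500) AND (COL1>5 OR COL3>150)
    return (c1 > 5 or c2 < 500) and (c1 > 5 or c3 > 150)

# The two forms agree for every combination of sample values that
# exercises each predicate both ways.
for c1, c2, c3 in itertools.product([0, 6], [400, 600], [100, 200]):
    assert disjunctive(c1, c2, c3) == conjunctive(c1, c2, c3)
print("equivalent")
```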

Unify Inc.