Tuning Oracle SQL


Tune SQL based on the configuration of your system, ie. OLTP or read-only Data Warehouse. OLTP applications have high volumes of concurrent transactions and are better served with sequential SQL. Read-only data warehouses require rapid access to large amounts of information at once and thus parallelism is more appropriate in data warehouse applications. In general serial SQL will minimise resource usage and parallel SQL allows rapid access of large amounts of data at once.

With sequential SQL the EXPLAIN PLAN command can be used to compare different versions of SQL statements. SQL statement costs appear in the POSITION column in the first row from EXPLAIN PLAN. Parallel execution requires processing of large numbers of rows at once. Use parallelism when required to process many rows simultaneously. When tuning OLTP applications utilise sharing of SQL code in PL/SQL procedures and do not use triggers unless absolutely neccessary. Note that triggers are not rules but only messages. Also triggers cannot be executed as a result of SQL statements causing more than one execution of each trigger.

The best approach to tuning of SQL statements is to seek out those statements consuming the greatest amount of resources (CPU, memory and I/O). Obviously the more utilised an SQL statement the more finely it should be tuned. Note that tuning is not neccessarily a never ending process but can be iterative. It is always best to take small steps and then assess improvements. Small changes are always more manageable and more easily implementable. Use tools such as TKPROF, tracing, Oracle Enterprise Manager tuning tools and packages and Oracle performance and tuning views to detect problem queries and tune them appropriately.

How to Tune SQL

Indexing

When building and restructuring of indexing never be afraid of removing unused indexes. The DBA should always be aware of where indexes are used and how. Taking an already constructued application makes this more complex. Pay most attention to indexes which are mostly utilised. Some small static tables may not require indexes at all. Small static lookup type tables which can be cached into MRU memory may be adversely affected by the addition of indexes. Sometimes table-scans are faster than anything else. Consider the use of clustering, hashing, bitmaps and even index organised tables. Oracle recommends the profligate use of function-based indexes, assuming of course there will not be too many of them. Do not allow too many programmers to create their indexes. Applications developers tend to be unaware of what other developers are doing and create indexes specific to a particular requirement which may be used in only one place. Some DBA control and approval process must be maintained on the creation of new indexes. Remember, every table change requires a simultaneous update to all indexes created based on that table.

SQL Statement Reorganisation

SQL statement reorganisation such as WHERE clause order matching indexes, hints, simplistic SQL statements, etc. Nested SQL statements can be effective under certain circumstances. However, nesting of SQL statements increases the level of complexity and if possible looping cursors can be utilised in PL/SQL procedures assuming the required SQL is no completely ad-hoc (Avoid ad-hoc SQL if possible - any functionality is always better provided at the application level). Nested SQL statements can simply become over complicated and impossible for even the most brilliant coder to tune to peak efficiency. If nested SQL statements must be used there are many factors to be considered. In the examples below note the difference between queries which utilise indexes and queries which do not. The objective is to access only the rows which are required in the result, not those rows which are not required. This is a very common mistake.

Use [NOT] EXISTS Instead of [NOT] IN

Note in the example below where the second SQL statement utilises the index in the subquery. Also note the use of EXISTS in the second query as opposed to IN. IN will build a set first and EXISTS will not. In other words IN will not utilise indexes whereas EXISTS will.

SELECT course_code, name
FROM student
WHERE course_code NOT IN
	(SELECT course_code FROM maths_dept);
SELECT course_code, name
FROM student
WHERE NOT EXISTS
	(SELECT course_code FROM maths_dept WHERE maths_dept.course_code = student.course_code);

In the example below the nesting of the two queries could be reversed depending on which table has more rows. Also if the index is not used or not available requires reversal of the subquery if tableB has significantly more rows than tableA.

DELETE FROM tableA WHERE NOT EXISTS 
	(SELECT columnB FROM tableB WHERE tableB.columnB = tableA.columnA);

Use of value lists with the IN clause could indicate a missing entity. Also that missing entity is probably static in nature and can potentially cached.

SELECT country FROM countries WHERE continent IN ('africa','europe','north america');

Equijoins and Column Value Transformations

AND and = predicates are the most efficient. Avoid transforming of column values in any form anywhere in a SQL statement, for instance ...

SELECT * FROM <table name> WHERE TO_NUMBER(BOX_NUMBER) = 94066;

And the example below is really bad !!! Typically indexes should not be placed on descriptive fields such as names. A function-based index would be perfect in this case.

SELECT * FROM table1, table2 WHERE UPPER(SUBSTR(table1.name,1,1)) = UPPER(SUBSTR(table2.name,1,1));

Transforming literal values is not such a problem but application of a function to a column within a table in a WHERE clause will use a table-scan regardless of the presence of indexes. However, function-based indexes can solve this problem. In fact, use function-based indexes whereever possible, the index is the result of the function which the optimiser will recognise and utilise for subsequent SQL statements.

Types

Try not to use mixed types by setting columns to appropriate types in the first place. If mixing of types must be utilised do not assume implicit type conversion because it will not always work. Function-based indexes can be used to get around type conversion problems but this is not the most appropriate use of function-based indexes. If types must be be mixed try to place type conversion onto explicit values and not columns. For intance,

WHERE zip = TO_NUMBER('94066')

as opposed to

WHERE TO_CHAR(zip) = '94066'

The DECODE Function

Note that the DECODE function will generally ignore indexes completely. DECODE is very useful in certain circumstances where nested looping cursors can become extremely complex. Use DECODE only when absolutely required and do not overuse it. DECODE is intended for specific requirements and is not intended to be used prolifically, especially not with respect to type conversions. Most SQL statements containing DECODE function usage can be altered to use explicit literal selection criteria perhaps using seperate SELECT statements combined with UNION clauses.

Join Orders

Always use indexes where possible, this applies to all tables accessed in a join, both those in the driving and nested queries. Also use indexes between parent and child nested subqueries in order to utilise indexes across a join. A common error is that of accessing a single row from the driving table using an index and then to access all rows from a nested query table where an index can be used in the nested query table based on the row retrieved by the driving table. Try to use indexes which fetch the minimum number of rows. The order in which tables are accessed in a query is very important. Generally a SQL statement is parsed from top-to-bottom and from left-to-right. Thus the further into the join or SQL statement the fewer rows should be accessed. Even consider constructing a SQL statement based on the largest table being the driving table even if that largest table is not the logical driver of the SQL statement. Note that when a join is executed that each join will overlay the result of the previous part of the join, effectively each section (based on each table) is executed seqentially. In the example below table1 has the most rows and table3 has the fewest rows.

SELECT * FROM table1, table2, table3 WHERE table1.index = table2.index AND table1.index = table3.index;

Hints

Use them !!!

Use INSERT, UPDATE and DELETE ... RETURNING

INSERT INTO table1 VALUES (test_id.nextval, 'Jim Smith', '100.12') RETURNING value, value * 10 INTO val1, val2;
UPDATE table1 SET name = 'Joe Soap' WHERE value < 200 RETURNING value * 10 INTO val1;
DELETE FROM table1 RETURN value INTO :array;

Triggers

Simplification or complete disabling and removal of triggers. Remember triggers wrre originally intended for messaging and are not intended for use as rules triggered as a result of a particular event. Other databases have full-fledged rule systems aiding in the construction of Rule-Based Expert systems such as Ingres, Oracle triggers are more like database events than event triggered rules causing other actions, potentially recursive, to occur. Never use triggers to validate referential integrity.

Data Restructuring

Data restructuring involves partitioning, normalisation and even denormalisation. Oracle recommends avoiding the use of primary and foreign keys for validation of referential integrity. This is not neccessarily the case. Referential integrity is centrally controlled and altered in a single place in the database. Placing referential integrity in application code is less efficient due to increased network traffic plus more code to be maintained in potentially many applications. All foreign keys must have indexes explicitly created and these indexes will often be used in general application SQL calls other than just for validation of referential integrity. Oracle does not create internal indexes when creating foreign reference keys. Foreign keys which are not indexed using the CREATE INDEX statement will cause table locks on the table containing the foreign key. There may be an advantage to creating foreign keys as bitmap indexes since prefix and suffix composite index contents are not required to be in specific orders. However, since referential integrity is generally useful for validation of normalisation constraints then referential integrity is only really applicable to OLTP or user-interactive applications; bitmap indexes are not consistent with constantly changing data. Also it is quite likely that foreign keys will often be used by the opitmizer in SQL statement filtering WHERE clauses. This is typically the case where database logical table structure is consistent with application requirements.

Do not use views as a basis for SQL statements which take a portion of the rows defined by that view. Views were originally intended for security. No matter what WHERE clause is applied to a view the entire view will always be executed first. On the same basis also avoid things such as SELECT * FROM ..., GROUP BY clauses and aggregations such as DISTINCT. DISTINCT will always select all rows first. Do not create new entities using joined views, it is better to create those intersection view joins as entities themselves; this applies particularly in the case of many-to-many relationships. Oracle data entity partitioning can be very useful when trying to reduce load.

Maintenance of Current Statistics

Maintain current statistics as often as possible, this can be automated, cost-based optimisation (uses statistics) is much more efficient than rule-based optimisation. Examine the DBMS_STATS package.