Optimising Relational Databases

There are a number of ways to optimise database access, through data selection or changes.

  1. Properly indexing tables is important. SQL statement where, order-by, group-by and having clauses should always match indexes or original physical insertion record order.

  2. The database engine is much faster than application code. Thus as much functionality as possible should reside in the database as compiled code, ie. stored procedures and database object methods. However, the trade-off in this case is complexity in the database. Object databases are much more effective at simplifying the burial of complexity into the database engine.

  3. Always use unbound front-end GUI controls. Never use front-end GUI bound data controls such as the on-screen data controls in Visual Basic. These items slow access to the database.

  4. Never put aggregation functions such as TRIM and AVERAGE into SQL statements. Speed is drastically affected. Aggregation functions within SQL statements are effectively placing an operation onto each record retrieved in a select statement; the intention of SQL is the retrieval of sets of records and not single records. Try to seperate these aggregation functions from SQL statements.

  5. Always try to make the primary key the unique index unless substituting the unique key with an automated counting or sequencing integer.

  6. Create indexes to match SQL select statements. However, note that all insert, update and delete statements require changes to all indexes aswell as the data table itself. This depends on how the table is used. Is the data static ? Is the data changed frequently ? Is the data mostly selected from ?

  7. Minimise on the number of joins between tables in a single SQL statement. It is sometimes better to create implicit looping cursors which select the many side of a one-to-many relationship as an iteration of the selection on the one side of the relationship.

  8. Avoid use of indexes on tables which are changed frequently. As already stated, every change to the database table requires a change to the data table plus all the index files attached to that table.

  9. Limit the amount of data returned by a query. Retrieve only the columns required; never use SELECT * FROM <table> if possible.

  10. It is in some ways better to have profligate amounts of high-speed hard-wired SQL code rather than more generic forms of SQL such as dynamically constructed SQL statements which are compiled as well as executed at run-time. Pre-compiled code will always execute much faster.

  11. Do not use pick-lists in applications unless they are very small. It is not easy for the user to have to scroll through over a hundred values, it usually irritates and frustrates.

  12. Avoid using sorts of data, ie. order-by clauses. If absolutely necessary try to sort using indexes. Also remember that the order-by clause is a subset of the where clause, ie. the order-by clause is executed on the subset of the data selected by the where clause. Therefore the order-by clause does not have to contain the elements in the where clause, only the items not in the where clause which are required to be sorted within the results of the where clause are required to be sorted by the order-by clause. There is no point in sorting what is all the same unless of course the where clause deliberately selects a range of records.

  13. Above all reduce the amount of data being worked with. The less data one manipulates or retrieves at once the more efficiency will be gained.