60.1. Query Handling as a Complex Optimization Problem | ||||
---|---|---|---|---|
Prev | Up | Chapter 60. Genetic Query Optimizer | Home | Next |
60.1. Query Handling as a Complex Optimization Problem#
Among all relational operators the most difficult one to process and optimize is thejoin. The number of possible query plans grows exponentially with the number of joins in the query. Further optimization effort is caused by the support of a variety ofjoin methods (e.g., nested loop, hash join, merge join inPostgreSQL) to process individual joins and a diversity ofindexes (e.g., B-tree, hash, GiST and GIN inPostgreSQL) as access paths for relations.
The normalPostgreSQL query optimizer performs anear-exhaustive search over the space of alternative strategies. This algorithm, first introduced in IBM's System R database, produces a near-optimal join order, but can take an enormous amount of time and memory space when the number of joins in the query grows large. This makes the ordinaryPostgreSQL query optimizer inappropriate for queries that join a large number of tables.
The Institute of Automatic Control at the University of Mining and Technology, in Freiberg, Germany, encountered some problems when it wanted to usePostgreSQL as the backend for a decision support knowledge based system for the maintenance of an electrical power grid. The DBMS needed to handle large join queries for the inference machine of the knowledge based system. The number of joins in these queries made using the normal query optimizer infeasible.
In the following we describe the implementation of agenetic algorithm to solve the join ordering problem in a manner that is efficient for queries involving large numbers of joins.