Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Commitc9ead90

Browse files
committed
Add the GEQO README file to the docs distribution
1 parent29138ee commitc9ead90

File tree

1 file changed

+160
-0
lines changed

1 file changed

+160
-0
lines changed

‎doc/README.GEQO

Lines changed: 160 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,160 @@
1+
2+
=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*
3+
Genetic Query Optimization in Database Systems
4+
=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*
5+
6+
Martin Utesch
7+
8+
<utesch@aut.tu-freiberg.de>
9+
10+
Institute of Automatic Control
11+
University of Mining and Technology
12+
Freiberg, Germany
13+
14+
02/10/1997
15+
16+
17+
1.) Query Handling as a Complex Optimization Problem
18+
====================================================
19+
20+
Among all relational operators the most difficult one to process and
21+
optimize is the JOIN. The number of alternative plans to answer a query
22+
grows exponentially with the number of JOINs included in it. Further
23+
optimization effort is caused by the support of a variety of *JOIN
24+
methods* (e.g., nested loop, index scan, merge join in Postgres) to
25+
process individual JOINs and a diversity of *indices* (e.g., r-tree,
26+
b-tree, hash in Postgres) as access paths for relations.
27+
28+
The current Postgres optimizer implementation performs a *near-
29+
exhaustive search* over the space of alternative strategies. This query
30+
optimization technique is inadequate to support database application
31+
domains that evolve the need for extensive queries, such as artifcial
32+
intelligence.
33+
34+
The Institute of Automatic Control at the University of Mining and
35+
Technology Freiberg, Germany encountered the described problems as its
36+
folks wanted to take the Postgres DBMS as the backend for a decision
37+
support knowledge based system for the maintenance of an electrical
38+
power grid. The DBMS needed to handle large JOIN queries for the
39+
inference machine of the knowledge based system.
40+
41+
Performance difficulties within exploring the space of possible query
42+
plans arose the demand for a new optimization technique being developed.
43+
44+
In the following we propose the implementation of a *Genetic
45+
Algorithm* as an option for the database query optimization problem.
46+
47+
48+
2.) Genetic Algorithms (GA)
49+
===========================
50+
51+
The GA is a heuristic optimization method which operates through
52+
determined, randomized search. The set of possible solutions for the
53+
optimization problem is considered as a *population* of *individuals*.
54+
The degree of adaption of an individual to its environment is specified
55+
by its *fitness*.
56+
57+
The coordinates of an individual in the search space are represented
58+
by *chromosomes*, in essence a set of character strings. A *gene* is a
59+
subsection of a chromosome which encodes the value of a single parameter
60+
being optimized. Typical encodings for a gene could be *binary* or
61+
*integer*.
62+
63+
Through simulation of the evolutionary operations *recombination*,
64+
*mutation*, and *selection* new generations of search points are found
65+
that show a higher average fitness than their ancestors.
66+
67+
According to the "comp.ai.genetic" FAQ it cannot be stressed too
68+
strongly that a GA is not a pure random search for a solution to a
69+
problem. A GA uses stochastic processes, but the result is distinctly
70+
non-random (better than random).
71+
72+
Structured Diagram of a GA:
73+
---------------------------
74+
75+
P(t) generation of ancestors at a time t
76+
P''(t) generation of descendants at a time t
77+
78+
+=========================================+
79+
|>>>>>>>>>>> Algorithm GA <<<<<<<<<<<<<<|
80+
+=========================================+
81+
| INITIALIZE t := 0 |
82+
+=========================================+
83+
| INITIALIZE P(t) |
84+
+=========================================+
85+
| evalute FITNESS of P(t) |
86+
+=========================================+
87+
| while not STOPPING CRITERION do |
88+
| +-------------------------------------+
89+
| | P'(t) := RECOMBINATION{P(t)} |
90+
| +-------------------------------------+
91+
| | P''(t) := MUTATION{P'(t)} |
92+
| +-------------------------------------+
93+
| | P(t+1) := SELECTION{P''(t) + P(t)} |
94+
| +-------------------------------------+
95+
| | evalute FITNESS of P''(t) |
96+
| +-------------------------------------+
97+
| | t := t + 1 |
98+
+===+=====================================+
99+
100+
101+
3.) Genetic Query Optimization (GEQO) in PostgreSQL
102+
===================================================
103+
104+
The GEQO module is intended for the solution of the query
105+
optimization problem similar to a traveling salesman problem (TSP).
106+
Possible query plans are encoded as integer strings. Each string
107+
represents the JOIN order from one relation of the query to the next.
108+
E. g., the query tree /\
109+
/\ 2
110+
/\ 3
111+
4 1 is encoded by the integer string '4-1-3-2',
112+
which means, first join relation '4' and '1', then '3', and
113+
then '2', where 1, 2, 3, 4 are relids in PostgreSQL.
114+
115+
Parts of the GEQO module are adapted from D. Whitley's Genitor
116+
algorithm.
117+
118+
Specific characteristics of the GEQO implementation in PostgreSQL
119+
are:
120+
121+
o usage of a *steady state* GA (replacement of the least fit
122+
individuals in a population, not whole-generational replacement)
123+
allows fast convergence towards improved query plans. This is
124+
essential for query handling with reasonable time;
125+
126+
o usage of *edge recombination crossover* which is especially suited
127+
to keep edge losses low for the solution of the TSP by means of a GA;
128+
129+
o mutation as genetic operator is deprecated so that no repair
130+
mechanisms are needed to generate legal TSP tours.
131+
132+
The GEQO module gives the following benefits to the PostgreSQL DBMS
133+
compared to the Postgres query optimizer implementation:
134+
135+
o handling of large JOIN queries through non-exhaustive search;
136+
137+
o improved cost size approximation of query plans since no longer
138+
plan merging is needed (the GEQO module evaluates the cost for a
139+
query plan as an individual).
140+
141+
142+
References
143+
==========
144+
145+
J. Heitk"otter, D. Beasley:
146+
---------------------------
147+
"The Hitch-Hicker's Guide to Evolutionary Computation",
148+
FAQ in 'comp.ai.genetic',
149+
'ftp://ftp.Germany.EU.net/pub/research/softcomp/EC/Welcome.html'
150+
151+
Z. Fong:
152+
--------
153+
"The Design and Implementation of the Postgres Query Optimizer",
154+
file 'planner/Report.ps' in the 'postgres-papers' distribution
155+
156+
R. Elmasri, S. Navathe:
157+
-----------------------
158+
"Fundamentals of Database Systems",
159+
The Benjamin/Cummings Pub., Inc.
160+

0 commit comments

Comments
 (0)

[8]ページ先頭

©2009-2025 Movatter.jp