NotificationsYou must be signed in to change notification settings
Fork5
Star27

Commit7868590

committed

While making the seq_page_cost changes, I was struck by the fact that

cost_nonsequential_access() is really totally inappropriate for its onlyremaining use, namely estimating I/O costs in cost_sort(). The routinewas designed on the assumption that disk caching might eliminate the needfor some re-reads on a random basis, but there's nothing very random inthat sense about sort's access pattern --- it'll always be picking up theoldest outputs. If we had a good fix on the effective cache size wemight consider charging zero for I/O unless the sort temp file sizeexceeds it, but that's probably putting much too much faith in theparameter. Instead just drop the logic in favor of a fixed compromisebetween seq_page_cost and random_page_cost per page of sort I/O.

1 parentb7af62e commit7868590Copy full SHA for 7868590

File tree

1 file changed

-57

lines changed

src/backend/optimizer/path
- costsize.c

1 file changed

-57

lines changed

`‎src/backend/optimizer/path/costsize.c‎`

Lines changed: 5 additions & 57 deletions

Original file line number	Diff line number	Diff line change
`@@ -54,7 +54,7 @@`
`54`	`54`	`* Portions Copyright (c) 1994, Regents of the University of California`
`55`	`55`	`*`
`56`	`56`	`* IDENTIFICATION`
`57`		`- * $PostgreSQL: pgsql/src/backend/optimizer/path/costsize.c,v 1.156 2006/06/0502:49:58 tgl Exp $`
	`57`	`+ * $PostgreSQL: pgsql/src/backend/optimizer/path/costsize.c,v 1.157 2006/06/0520:56:33 tgl Exp $`
`58`	`58`	`*`
`59`	`59`	`*-------------------------------------------------------------------------`
`60`	`60`	`*/`
`@@ -175,55 +175,6 @@ cost_seqscan(Path path, PlannerInfo root,`
`175`	`175`	`path->total_cost=startup_cost+run_cost;`
`176`	`176`	`}`
`177`	`177`
`178`		`-/*`
`179`		`- * cost_nonsequential_access`
`180`		`- * Estimate the cost of accessing one page at random from a relation`
`181`		`- * (or sort temp file) of the given size in pages.`
`182`		`- *`
`183`		`- * The simplistic model that the cost is random_page_cost is what we want`
`184`		`- * to use for large relations; but for small ones that is a serious`
`185`		`- * overestimate because of the effects of caching.This routine tries to`
`186`		`- * account for that.`
`187`		`- *`
`188`		`- * Unfortunately we don't have any good way of estimating the effective cache`
`189`		`- * size we are working with --- we know that Postgres itself has NBuffers`
`190`		`- * internal buffers, but the size of the kernel's disk cache is uncertain,`
`191`		`- * and how much of it we get to use is even less certain. We punt the problem`
`192`		`- * for now by assuming we are given an effective_cache_size parameter.`
`193`		`- *`
`194`		`- * Given a guesstimated cache size, we estimate the actual I/O cost per page`
`195`		`- * with the entirely ad-hoc equations (writing relsize for`
`196`		`- * relpages/effective_cache_size):`
`197`		`- *if relsize >= 1:`
`198`		`- random_page_cost - (random_page_cost-seq_page_cost)/2 (1/relsize)`
`199`		`- *if relsize < 1:`
`200`		`- seq_page_cost + ((random_page_cost-seq_page_cost)/2) relsize ** 2`
`201`		`- * These give the right asymptotic behavior (=> seq_page_cost as relpages`
`202`		`- * becomes small, => random_page_cost as it becomes large) and meet in the`
`203`		`- * middle with the estimate that the cache is about 50% effective for a`
`204`		`- * relation of the same size as effective_cache_size. (XXX this is probably`
`205`		`- * all wrong, but I haven't been able to find any theory about how effective`
`206`		`- * a disk cache should be presumed to be.)`
`207`		`- */`
`208`		`-staticCost`
`209`		`-cost_nonsequential_access(doublerelpages)`
`210`		`-{`
`211`		`-doublerelsize;`
`212`		`-doublerandom_delta;`
`213`		`-`
`214`		`-/* don't crash on bad input data */`
`215`		`-if (relpages <=0.0\|\|effective_cache_size <=0.0)`
`216`		`-returnrandom_page_cost;`
`217`		`-`
`218`		`-relsize=relpages /effective_cache_size;`
`219`		`-`
`220`		`-random_delta= (random_page_cost-seq_page_cost)*0.5;`
`221`		`-if (relsize >=1.0)`
`222`		`-returnrandom_page_cost-random_delta /relsize;`
`223`		`-else`
`224`		`-returnseq_page_cost+random_deltarelsizerelsize;`
`225`		`-}`
`226`		`-`
`227`	`178`	`/*`
`228`	`179`	`* cost_index`
`229`	`180`	`* Determines and returns the cost of scanning a relation using an index.`
`@@ -371,10 +322,7 @@ cost_index(IndexPath path, PlannerInfo root,`
`371`	`322`
`372`	`323`	`/*`
`373`	`324`	`* min_IO_cost corresponds to the perfectly correlated case (csquared=1),`
`374`		`- * max_IO_cost to the perfectly uncorrelated case (csquared=0). Note that`
`375`		`- * we just charge random_page_cost per page in the uncorrelated case,`
`376`		`- * rather than using cost_nonsequential_access, since we've already`
`377`		`- * accounted for caching effects by using the Mackert model.`
	`325`	`+ * max_IO_cost to the perfectly uncorrelated case (csquared=0).`
`378`	`326`	`*/`
`379`	`327`	`min_IO_cost=ceil(indexSelectivityT)seq_page_cost;`
`380`	`328`	`max_IO_cost=pages_fetched*random_page_cost;`
`@@ -778,7 +726,7 @@ cost_functionscan(Path path, PlannerInfo root, RelOptInfo *baserel)`
`778`	`726`	`disk traffic = 2 relsize * ceil(logM(p / (2*work_mem)))`
`779`	`727`	`cpu = comparison_cost t * log2(t)`
`780`	`728`	`*`
`781`		`- * The disk traffic is assumed to behalf sequential andhalf random`
	`729`	`+ * The disk traffic is assumed to be3/4ths sequential and1/4th random`
`782`	`730`	`* accesses (XXX can't we refine that guess?)`
`783`	`731`	`*`
`784`	`732`	`* We charge two operator evals per tuple comparison, which should be in`
`@@ -838,9 +786,9 @@ cost_sort(Path path, PlannerInfo root,`
`838`	`786`	`else`
`839`	`787`	`log_runs=1.0;`
`840`	`788`	`npageaccesses=2.0npageslog_runs;`
`841`		`-/* Assumehalfare sequential,half are not */`
	`789`	`+/* Assume3/4ths of accessesare sequential,1/4th are not */`
`842`	`790`	`startup_cost+=npageaccesses*`
`843`		`-(seq_page_cost+cost_nonsequential_access(npages))*0.5;`
	`791`	`+(seq_page_cost0.75+random_page_cost0.25);`
`844`	`792`	`}`
`845`	`793`
`846`	`794`	`/*`

0 commit comments

Comments

(0)

Movatterモバイル変換

Navigation Menu

Search code, repositories, users, issues, pull requests...

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Commit7868590

File tree

1 file changed

1 file changed

`‎src/backend/optimizer/path/costsize.c‎`

0 commit comments