Commit da11977
Reduce memory usage of tsvector type analyze function.
compute_tsvector_stats() detoasted and kept in memory every tsvector value in the sample, but that can be a lot of memory. The original bug report described a case using over 10 gigabytes, with statistics target of 10000 (the maximum).

To fix, allocate a separate copy of just the lexemes that we keep around, and free the detoasted tsvector values as we go. This adds some palloc/pfree overhead, when you have a lot of distinct lexemes in the sample, but it's better than running out of memory.

Fixes bug #14654 reported by James C. Reviewed by Tom Lane. Backport to all supported versions.

Discussion: https://www.postgresql.org/message-id/20170514200602.1451.46797@wrigleys.postgresql.org

1 parent ca793c5, commit da11977
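The ownership change described in the commit message can be illustrated outside PostgreSQL. The following is a minimal sketch in plain C, not the code from this commit: `LexemeTable`, `table_find`, `table_add`, and `table_free` are hypothetical names, a linear array stands in for the dynahash table, and `malloc`/`free` stand in for `palloc`/`pfree`. Lookups use a key that points into a temporary buffer, and a private copy is made only when a new entry is created, so the buffer can be released as soon as its lexemes have been processed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a tracked lexeme; not the PostgreSQL struct. */
typedef struct
{
    char   *lexeme;     /* owned copy of the bytes, not NUL-terminated */
    size_t  length;
    int     frequency;
} TrackedLexeme;

/* Hypothetical table; a linear array replaces the real hash table. */
typedef struct
{
    TrackedLexeme *items;
    size_t         count;
    size_t         capacity;
} LexemeTable;

/* Look up a lexeme. The key may point into a short-lived buffer. */
TrackedLexeme *
table_find(LexemeTable *t, const char *key, size_t len)
{
    assert(key != NULL);
    for (size_t i = 0; i < t->count; i++)
    {
        if (t->items[i].length == len &&
            memcmp(t->items[i].lexeme, key, len) == 0)
            return &t->items[i];
    }
    return NULL;
}

/* Count one occurrence. Returns 0 on success, -1 if allocation fails. */
int
table_add(LexemeTable *t, const char *key, size_t len)
{
    TrackedLexeme *item = table_find(t, key, len);

    if (item != NULL)
    {
        item->frequency++;      /* existing entry: no copy needed */
        return 0;
    }

    if (t->count == t->capacity)
    {
        size_t         ncap = t->capacity ? t->capacity * 2 : 8;
        TrackedLexeme *grown = realloc(t->items, ncap * sizeof *grown);

        if (grown == NULL)
            return -1;
        t->items = grown;
        t->capacity = ncap;
    }

    /* New entry: copy the key so it no longer depends on the caller's buffer. */
    char *copy = malloc(len ? len : 1);

    if (copy == NULL)
        return -1;
    memcpy(copy, key, len);
    t->items[t->count++] = (TrackedLexeme) {copy, len, 1};
    return 0;
}

/* Release every owned key and the array itself. */
void
table_free(LexemeTable *t)
{
    for (size_t i = 0; i < t->count; i++)
        free(t->items[i].lexeme);
    free(t->items);
    t->items = NULL;
    t->count = t->capacity = 0;
}
```

The trade-off matches the one the commit message names: each distinct lexeme costs one extra allocation, but peak memory is bounded by the tracked lexemes rather than by every sampled value.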

1 file changed, 17 additions and 4 deletions
‎src/backend/tsearch/ts_typanalyze.c

Lines changed: 17 additions & 4 deletions
@@ -232,17 +232,20 @@ compute_tsvector_stats(VacAttrStats *stats,

         /*
          * We loop through the lexemes in the tsvector and add them to our
-         * tracking hashtable.  Note: the hashtable entries will point into
-         * the (detoasted) tsvector value, therefore we cannot free that
-         * storage until we're done.
+         * tracking hashtable.
          */
         lexemesptr = STRPTR(vector);
         curentryptr = ARRPTR(vector);
         for (j = 0; j < vector->size; j++)
         {
             bool        found;

-            /* Construct a hash key */
+            /*
+             * Construct a hash key.  The key points into the (detoasted)
+             * tsvector value at this point, but if a new entry is created, we
+             * make a copy of it.  This way we can free the tsvector value
+             * once we've processed all its lexemes.
+             */
             hash_key.lexeme = lexemesptr + curentryptr->pos;
             hash_key.length = curentryptr->len;

@@ -261,6 +264,9 @@ compute_tsvector_stats(VacAttrStats *stats,
                 /* Initialize new tracking list element */
                 item->frequency = 1;
                 item->delta = b_current - 1;
+
+                item->key.lexeme = palloc(hash_key.length);
+                memcpy(item->key.lexeme, hash_key.lexeme, hash_key.length);
             }

             /* lexeme_no is the number of elements processed (ie N) */
@@ -276,6 +282,10 @@ compute_tsvector_stats(VacAttrStats *stats,
             /* Advance to the next WordEntry in the tsvector */
             curentryptr++;
         }
+
+        /* If the vector was toasted, free the detoasted copy. */
+        if (TSVectorGetDatum(vector) != value)
+            pfree(vector);
     }

     /* We can only compute real stats if we found some non-null values. */
@@ -447,9 +457,12 @@ prune_lexemes_hashtable(HTAB *lexemes_tab, int b_current)
     {
         if (item->frequency + item->delta <= b_current)
         {
+            char       *lexeme = item->key.lexeme;
+
             if (hash_search(lexemes_tab, (const void *) &item->key,
                             HASH_REMOVE, NULL) == NULL)
                 elog(ERROR, "hash table corrupted");
+            pfree(lexeme);
         }
     }
 }
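The last hunk saves `item->key.lexeme` in a local variable before calling `hash_search(..., HASH_REMOVE, ...)` and frees it afterwards, presumably because the entry should not be dereferenced once it has been removed. The sketch below shows the same ordering in plain C under stated assumptions: `Entry`, `Table`, `table_remove_at`, and `prune` are hypothetical names, removal is done by moving the last array element into the vacated slot, and `free` stands in for `pfree`. With that removal scheme, reading the key from the slot after removal would return a different entry's pointer, which is why the pointer is captured first.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical tracked entry; the key is an owned heap copy. */
typedef struct
{
    char   *lexeme;
    size_t  length;
    int     frequency;
    int     delta;
} Entry;

/* Hypothetical table backed by a plain array. */
typedef struct
{
    Entry  *items;
    size_t  count;
} Table;

/*
 * Remove slot i by moving the last entry into it. After this call,
 * slot i no longer describes the entry that was removed.
 */
void
table_remove_at(Table *t, size_t i)
{
    assert(i < t->count);
    t->items[i] = t->items[t->count - 1];
    t->count--;
}

/*
 * Drop every entry with frequency + delta <= b_current and free its key.
 * Returns the number of entries pruned.
 */
size_t
prune(Table *t, int b_current)
{
    size_t pruned = 0;
    size_t i = 0;

    while (i < t->count)
    {
        Entry *item = &t->items[i];

        if (item->frequency + item->delta <= b_current)
        {
            /* Save the key pointer before the slot is overwritten. */
            char *lexeme = item->lexeme;

            table_remove_at(t, i);
            free(lexeme);
            pruned++;
            /* Do not advance: slot i now holds a different entry. */
        }
        else
            i++;
    }
    return pruned;
}
```

Freeing through `item->lexeme` after `table_remove_at` would release the key of whichever entry was moved into the slot, leaving the removed key leaked and the surviving entry with a dangling pointer.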