No sampling over 281TB #19978

Open

Tmonster wants to merge 2 commits into duckdb:v1.4-andium from Tmonster:fix_table_sample_reservoir
Changes from 1 commit
small update
@Tmonster committed Nov 28, 2025
commit 55eb3955bfc65baec4eb875897de1ff60b50799f
src/parser/transform/helpers/transform_sample.cpp — 7 changes: 2 additions & 5 deletions

@@ -44,11 +44,8 @@ unique_ptr<SampleOptions> Transformer::TransformSampleOptions(optional_ptr<duckd
 	} else {
 		// sample size is given in rows: use reservoir sampling
 		auto rows = sample_value.GetValue<int64_t>();
-		if (rows < 0) {
-			throw ParserException("Sample rows %lld out of range, must be bigger than or equal to 0", rows);
-		}
-		if (rows >= Allocator::MAXIMUM_ALLOC_SIZE) {
-			throw ParserException("Cannot sample over %d rows. Please use percentage instead",
+		if (rows < 0 || sample_value.GetValue<uint64_t>() >= Allocator::MAXIMUM_ALLOC_SIZE) {
+			throw ParserException("Sample rows %lld out of range, must be between 0 and %lld", rows,
			                      Allocator::MAXIMUM_ALLOC_SIZE);
 		}
 		result->sample_size = Value::BIGINT(rows);

Collaborator @Mytherin commented Nov 29, 2025 (edited):

The sample size is in rows but the allocator max size is in bytes - I suspect that sampling for 281474976710656-1 rows will still throw the allocation error. Should we reduce the limit further / add a test for this?

Alternatively, we might want to not allocate all the space for the sample up-front if the sample size is very large - but that's obviously a larger change.
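The unit mismatch Mytherin raises can be checked with simple arithmetic. A minimal sketch, assuming the 281 TB limit from the PR title is 2^48 bytes and assuming a hypothetical per-row width of 8 bytes (not DuckDB's actual reservoir layout):

```python
# MAXIMUM_ALLOC_SIZE caps a single allocation in *bytes*, while the
# sample size checked by the parser is a *row* count.
MAXIMUM_ALLOC_SIZE = 1 << 48       # 281474976710656 bytes (~281 TB)
ASSUMED_BYTES_PER_ROW = 8          # hypothetical row width for illustration

rows = MAXIMUM_ALLOC_SIZE - 1      # just under the new parser limit: accepted
bytes_needed = rows * ASSUMED_BYTES_PER_ROW

# Any row wider than one byte pushes the required allocation past the cap,
# so the allocation would still fail at runtime despite passing the check.
print(bytes_needed > MAXIMUM_ALLOC_SIZE)  # True
```

This supports the suggestion in the comment that the row limit may need to be reduced further, or the up-front allocation avoided.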
test/sql/sample/test_sample_too_big.test — 4 changes: 2 additions & 2 deletions

@@ -11,9 +11,9 @@ INSERT INTO t1 VALUES(1), (2), (3), (3), (5);
 statement error
 SELECT * FROM t1 TABLESAMPLE RESERVOIR(1222222220022220);
 ----
-<REGEX>:.*Cannot sample over.*
+<REGEX>:.*Sample rows.*out of range.*
 
 statement error
 SELECT * FROM t1 WHERE a IN (SELECT * FROM t1 TABLESAMPLE RESERVOIR(1222222220022220));
 ----
-<REGEX>:.*Cannot sample over.*
+<REGEX>:.*Sample rows.*out of range.*
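Why a huge TABLESAMPLE RESERVOIR(k) implies a huge allocation in the first place: reservoir sampling keeps a buffer of k rows in memory for the whole scan. A minimal sketch of the classic Algorithm R, assuming nothing about DuckDB's internal implementation:

```python
import random

def reservoir_sample(stream, k):
    # The reservoir holds up to k rows for the entire pass over the input,
    # so memory use is proportional to the requested sample size k.
    reservoir = []
    for i, row in enumerate(stream):
        if i < k:
            reservoir.append(row)        # fill phase: the first k rows
        else:
            j = random.randint(0, i)     # replace with probability k/(i+1)
            if j < k:
                reservoir[j] = row
    return reservoir

sample = reservoir_sample(range(1000), 5)
print(len(sample))  # 5
```

This is why the parser rejects oversized row counts up front, and why Mytherin's alternative (not allocating the full reservoir eagerly) would be a larger change.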