Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Addstruct_values function to return field values of STRUCT#19968

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Draft
EtgarDev wants to merge2 commits intoduckdb:main
base:main
Choose a base branch
Loading
fromEtgarDev:feature/etgarsh/struct-values

Conversation

@EtgarDev
Copy link
Contributor

This PR adds a new scalar functionstruct_values that complements the existingstruct_keys function. Whilestruct_keys returns the names of a struct's fields,struct_values returns the corresponding values as an unnamed struct.

Example

SELECT struct_values({'a':1,'b':2,'c':3});-- Result: (1, 2, 3)-- Type: ROW(INTEGER, INTEGER, INTEGER)

@EtgarDevEtgarDev changed the titleAddstruct_values function to return field values of STRUCT as Unna…Addstruct_values function to return field values of STRUCTNov 27, 2025
Copy link
Contributor

@TishjTishj left a comment
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Thanks! It looks great already, I don't see any glaring mistakes, just had a few comments

} else {
result.SetVectorType(VectorType::FLAT_VECTOR);

// Make result validity to mirror input's nulls
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

We can use AllValid on the input to determine if the validity is entirely clean, in which case we can avoid this step of setting the result validity (and we also don't need the EnsureWritable() call in that case)


// Ensure input is a STRUCT, set return type to an unnamed STRUCT with same child types
static unique_ptr<FunctionData> StructValuesBind(ClientContext &context, ScalarFunction &bound_function,
vector<unique_ptr<Expression>> &arguments) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Is there a test with a prepared parameter? If there isn't, can we add one? I suspect there's logic missing here to handle that (checking for UNKNOWN type in the input - see other bind methods)


ScalarFunction StructValuesFun::GetFunction() {
ScalarFunction func({LogicalType::ANY}, LogicalTypeId::STRUCT, StructValuesFunction, StructValuesBind);
func.SetNullHandling(FunctionNullHandling::SPECIAL_HANDLING);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Can you make the case for using SPECIAL_HANDLING?
Perhaps have a look at the comments on this enum, because I think this resembles the "default" handling


# Constant vector as input (same result for every row)
statement ok
CREATE TABLE t_struct_values_constant(x INTEGER);
Copy link
Contributor

@TishjTishjNov 28, 2025
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I'm not really following the creation of this table? The best way to test constant vectors is to not even use a table.

Easiest isselect <constant expression> from range(3)


statement ok
INSERT INTO t_struct_values_flat VALUES
(ROW(1, 'x')::STRUCT(a INT, b VARCHAR), 0),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

You can remove the cast, should be added by the insert already, makes the test a bit more readable :)

I'm even pretty sure the use of ROW can be removed as well, just(1, 'x') should be enough to create the ROW value and let it cast to the STRUCT type of the column


static void StructValuesFunction(DataChunk &args, ExpressionState &state, Vector &result) {
D_ASSERT(args.ColumnCount() == 1);
auto &input = args.data[0];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Since we accept unnamed structs, we could have an early out for that case, by detecting theStructType::IsUnnamed(input.GetType()) and just directly referencing the input

input.ToUnifiedFormat(count, input_data);
auto &validity = FlatVector::Validity(result);

validity.EnsureWritable();
Copy link
Contributor

@TishjTishjNov 28, 2025
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I think we don't need this,SetInvalid will ensure this already

In the scenario where the input rows turn out to all be valid, if we use this,AllValid() will not be true for the result, because it checks whether EnsureWritable was ever called (explicitly or through methods likeSetInvalid)

So we would be missing out on fast paths in downstream function calls

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Happy to elaborate if this is too vague of an explanation btw 👍

Copy link
Contributor

@maiadegraafmaiadegraaf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Thanks, looking good already! Just a few comments from my side.

Comment on lines +48 to +49
if (arguments[0]->return_type.id() != LogicalTypeId::STRUCT) {
throw InvalidInputException("struct_values() expects a STRUCT argument");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Out of curriosity what happens if you don't check this? I think it should be picked up by the binder.

Eg,struct_contains doesn't explicitly check for structs but still throws an error if a non-struct is passed as a parameter.

Dselect struct_contains(1);Binder Error:No function matches the given nameand argument types'struct_contains(INTEGER_LITERAL)'. You might need to add explicit type casts.                                                                                                                                                                                          Candidate functions:        struct_contains(STRUCT, ANY)->BOOLEANLINE1:select struct_contains(1);               ^

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Could we add some more tests?

  • struct with a null value ({a: NULL, b: 2})
  • struct with a nested struct
  • nestedstruct_value calls

Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment

Reviewers

2 more reviewers

@TishjTishjTishj requested changes

@maiadegraafmaiadegraafmaiadegraaf left review comments

Reviewers whose approvals may not affect merge requirements

Assignees

No one assigned

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

4 participants

@EtgarDev@Tishj@maiadegraaf@taniabogatsch

[8]ページ先頭

©2009-2025 Movatter.jp