Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Add PostgresViewEntry#183

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Draft
Tishj wants to merge15 commits intoduckdb:main
base:main
Choose a base branch
Loading
fromTishj:add_view_entry
Draft

Conversation

Tishj
Copy link
Contributor

@TishjTishj commentedFeb 20, 2024
edited
Loading

In the table query we get both tables, views, materialized views, partitioned tables and foreign tables.
The views are then stored in the catalog as a TableCatalogEntry, which confusesduckdb_tables() andduckdb_views() which is used insideinformation_schema.tables

So the views showed up as tables, this PR fixes that and adds a test for it

Notable change:

In postgres, when views are created on a dummy binding, with no aliases, their resulting alias becomes"?column?":

(lldb) p view_catalog_entry.names(duckdb::vector<std::string>) {  std::__1::vector<std::__1::string, std::__1::allocator<std::__1::string> > = size=1 {    [0] = "?column?"  }}

As a result, our binding will fail because the names don't match.
To get around this, we don't save the entry as "?column?" but instead use the name that DuckDB would attribute to the expression

public:
PostgresViewInfo() {
create_info = make_uniq<CreateViewInfo>();
// create_info->columns.SetAllowDuplicates(true);
Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I'm not sure about this, the Table has a ColumnsList which provides this option, but no such option exists for views because we don't have a ColumnsList, we instead just store the types+names separately

@Tishj
Copy link
ContributorAuthor

This is lacking testing still, and there are probably virtual methods that need to be overridden

@TishjTishj marked this pull request as draftFebruary 20, 2024 10:25
@Tishj
Copy link
ContributorAuthor

Tishj commentedMar 4, 2024
edited
Loading

The failure is probably related to:

statement okCREATEVIEWs1.v1ASSELECT42as j

Our error check is probably not a fan of this, because there is no existing table to reference

Or actually..
Bothtest_view.test andattach_views.test create or replacev1.
Maybe these are somehow happening concurrently, messing with the persistent postgres database file?

Copy link
Contributor

@MytherinMytherin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Can you merge with main? Also one minor comment:

: ViewCatalogEntry(catalog, schema, *info.create_info), postgres_types(std::move(info.postgres_types)),
postgres_names(std::move(info.postgres_names)) {
D_ASSERT(postgres_types.size() == types.size());
approx_num_pages = info.approx_num_pages;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

approx_num_pages should be 0 for a view always (or rather it probably shouldn't be a property at all?) - we use this to partition base tables but that is not possible for views

@Tishj
Copy link
ContributorAuthor

@Mytherin This seems like a problem, I can't really find the reason for this behavior

# Not aliased (no expression alias)statement okCREATE VIEW s1.view_aliases.x AS SELECT 'x'query Iselect x from s1.view_aliases.x----{'CAST('x' AS VARCHAR)': x}

I've had to replace the CreateViewInfo names with the expression.ToString() if the name was "?column?", this seems to have this effect

@Mytherin
Copy link
Contributor

The fact that we are getting two different names (one from Postgres', one from DuckDB') indicates to me we are binding the same query both in Postgres and in DuckDB. That seems like a more fundamental problem to me. After all - if a view contains anything that DuckDB does not support, the binding in DuckDB will fail, whereas we should still be able to query the view.

Maybe we could try e.g. not pulling the SQL definition of the view into DuckDB so the binding is avoided?

@Mytherin
Copy link
Contributor

An example of a view that works in Postgres but wouldn't work in DuckDB might be:

createtablemy_table(iint);createviewmy_viewasselect ctidfrom my_table;

@Tishj
Copy link
ContributorAuthor

Tishj commentedMar 27, 2024
edited
Loading

An example of a view that works in Postgres but wouldn't work in DuckDB might be:

createtablemy_table(iint);createviewmy_viewasselect ctidfrom my_table;

You're right, that doesn't work
But that also doesn't work forselect ctid from s1.my_table, because we've never registeredctid as being part ofmy_table in our custom postgres catalog

That's another issue we can solve by turning the columns into separate catalog entries, we could add default entries for things likectid.
And come to think of it,rowid as well for our internal use

@Tishj
Copy link
ContributorAuthor

The problem with this lies inBinder::Bind(BaseTableRef &ref)
Even if I remove the query from the created ViewCatalogEntry, this code looks up the catalog entry, binds it and compares the new results versus the old results, to throw the"Contents of view were altered: types don't match!" error.

(also setting the query to nullptr causes an InternalException here currently because this path assumes the query is always present)
Perhaps we want to just not do this verification step if the saved query is nullptr?

@Tishj
Copy link
ContributorAuthor

Tishj commentedMar 27, 2024
edited
Loading

The problem here is that we first bind the view's query before we hand it off to postgres, so we have to be able to bind it ourselves (possibly delegating to the postgres catalog extension)

Then we create the entry in the postgres catalog, running the same query, then reloading to get the entry as it was created in postgres.
This saves it as "?column?" because that's what postgres assigned to it.

Then in our verification step inside Bind(BaseTableRef &ref) we bind the query again on duckdb's side, which produces a different name for the column.


The "skip view content verification ifquery is nullptr" solution could work in that case, because we won't bind the query on our side again.
That still leaves the problem of the first bind though, if system columns are referenced that we haven't explicitly made a part of the TableCatalogEntry the bind will fail (as evidenced by thecreate view my_view as select ctid from my_table example)

One solution there might be decompose the TableCatalogEntry into a catalog set containing columns, and using a DefaultGenerator to create the system columns
Another could be to always add the system columns manually in our (postgres)TableCatalogEntry creation code.

A more generic approach that will likely take the least effort is to add a virtual method on the table catalog entry for column retrieval, so we can hook in and return a column definition for "ctid" and friends, basically a poor mans "DefaultColumnGenerator"


The last idea to make this work is take a similar approach asBindCreateIndex does and let BindCreateView be a virtual method on Catalog so we can override the behavior in the PostgresCatalog
Skipping the initial bind, and also the rebind that is currently done to verify that "View contents were altered" did not occur

@TishjTishj self-assigned thisApr 18, 2024
@Tishj
Copy link
ContributorAuthor

Tishj commentedApr 18, 2024
edited
Loading

We're likely going to have to separate out the SelectStatement inside the CreateViewInfo, then construct a postgres query from that, get the names + types from the PGresult object, and convert the type into a LogicalType

Oid PQftype(const PGresult *res,            int column_number);

That gets us the Oid, then we can run another query to retrieve the information we need to transform PG -> LogicalType

One downside of this is that we have to actually execute the query to get the names+types
I have seen other solutions which involve creating a prepared statement and querying metadata from it, that might be interesting?

Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@MytherinMytherinMytherin left review comments

Assignees

@TishjTishj

Labels
None yet
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

2 participants
@Tishj@Mytherin

[8]ページ先頭

©2009-2025 Movatter.jp