feat(ibis): support listing tables from multiple datasets or projects for BigQuery#1391


Open

goldmedal wants to merge 2 commits into Canner:main from goldmedal:feat/support-multiple-dataset-bigquery

Conversation

@goldmedal (Contributor) commented Dec 17, 2025 (edited by the coderabbitai bot)

Metadata v3 API

Listing Table

  • POST /v3/connector/bigquery/metadata/tables
  • Request body:
      {"connectionInfo": {"bigquery_type":"project","billing_project_id":"wrenai","region":"asia-east1","credentials":"..."      },"filterInfo": {"projects": [              {"projectId":"wrenai","datasetIds": ["tpch_tiny","tpch_sf1"                  ]              },              {"projectId":"bigquery-public-data","datasetIds": ["austin_311"                  ]              }          ]      }  }
  • Sample response
[  {"name":"bigquery-public-data.austin_311.311_service_requests","columns": [      {"name":"city","type":"STRING","notNull":false,"description":null,"properties": {"column_order":12        },"nestedColumns":null      },...    ],"description":null,"properties": {    },"primaryKey":""  },  {"name":"wrenai.tpch_tiny_us.orders","columns": [...    ],"description":null,"properties": {"schema":"tpch_tiny_us","catalog":"wrenai","table":"orders","path":null    },"primaryKey":""  }]

List Schema

  • POST /v3/connector/bigquery/metadata/schemas
  • Request body:
      {"connectionInfo": {"bigquery_type":"project","billing_project_id":"wrenai","region":"asia-east1","credentials":"..."      },"filterInfo": {"projects": [              {"projectId":"wrenai"              },              {"projectId":"bigquery-public-data"              }          ]      }  }
  • Sample response:
[  {"name":"wrenai","schemas": ["rudderstack_setup_test","alex_tide_issue"    ]  },  {"name":"bigquery-public-data","schemas": ["america_health_rankings","austin_311","austin_bikeshare","austin_crime",...     ]  }

BigQuery Filter Info for listing

{"projects": [        {"projectId":"wrenai","datasetIds": ["tpch_tiny"]        },        {"projectId":"bigquery-public-data","datasetIds": ["..."]        }    ]}

The filter can include multiple projects. Each object has projectId and datasetIds fields (see the model sketch below).

  • projectId is required for all APIs.
  • datasetIds is required only when listing tables.
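A minimal Pydantic sketch of how this filter payload could be modeled. The aliases and optionality are assumptions inferred from the JSON above, not the PR's actual BigQueryFilterInfo/ProjectDatasets definitions.

from pydantic import BaseModel, Field


class ProjectDatasets(BaseModel):
    # camelCase aliases mirror the JSON payload shown above
    project_id: str = Field(alias="projectId")
    # datasetIds is only needed when listing tables, so it is optional here
    dataset_ids: list[str] | None = Field(default=None, alias="datasetIds")


class BigQueryFilterInfo(BaseModel):
    projects: list[ProjectDatasets]


# Validation accepts the aliased keys directly.
filter_info = BigQueryFilterInfo.model_validate(
    {
        "projects": [
            {"projectId": "wrenai", "datasetIds": ["tpch_tiny"]},
            {"projectId": "bigquery-public-data"},
        ]
    }
)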

New Connection Info for BigQuery

Introduce BigQueryProjectConnectionInfo for querying multiple datasets:

{"bigquery_type":"project","billing_project_id":"wrenai","region":"asia-east1","credentials":"..."}

For BigQuery, the most important consideration is the region. BigQuery only allows joining tables that are located in the same region, even if they belong to different projects (the service account must have the corresponding permissions).

BigQueryProjectConnectionInfo fields

  • billing_project_id: the billing project id of your BigQuery connection
  • region: the region of your BigQuery connection
  • credentials: Base64-encoded credentials.json (see the model sketch after this list)
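A sketch of what this connection-info model might look like as a Pydantic class. The SecretStr wrapping and exact field types are assumptions, not the PR's actual definition.

from typing import Literal

from pydantic import BaseModel, SecretStr


class BigQueryProjectConnectionInfo(BaseModel):
    # discriminator value for the project-level variant
    bigquery_type: Literal["project"] = "project"
    billing_project_id: SecretStr  # billing project of the connection
    region: SecretStr              # region used for cross-project joins
    credentials: SecretStr         # Base64-encoded credentials.json

    def get_billing_project_id(self) -> str:
        return self.billing_project_id.get_secret_value()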

For listing public datasets

  • We don't have permission to access the information schema of the bigquery-public-data project, so we use the BigQuery Python SDK to list its datasets. However, the SDK's listing API can't filter by region, and its results don't include region info, so we can't determine a public dataset's region at listing time.
  • Therefore, when listing the tables of the public data, we check whether each dataset's region matches the region of the connection info (sketched below).
  • This may cause performance issues when the number of public datasets is large, because we need to call the BigQuery SDK's dataset API to check the region of each dataset.
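The region check could look roughly like the following sketch, assuming the google-cloud-bigquery package and valid credentials; the helper name and structure are illustrative, not the PR's implementation.

from google.cloud import bigquery

BIGQUERY_PUBLIC_DATASET_PROJECT_ID = "bigquery-public-data"


def list_public_datasets_in_region(client: bigquery.Client, region: str) -> list[str]:
    matching = []
    for item in client.list_datasets(project=BIGQUERY_PUBLIC_DATASET_PROJECT_ID):
        # list_datasets() does not return the dataset location, so each dataset
        # must be fetched individually; this is the performance cost noted above.
        dataset = client.get_dataset(item.reference)
        if dataset.location and dataset.location.lower() == region.lower():
            matching.append(dataset.dataset_id)
    return matching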

Summary by CodeRabbit

  • New Features

    • Multi-project and multi-dataset BigQuery support, including cross-project queries.
    • New metadata endpoints to list schemas and tables with filtering and limits (v3).
    • Option to use project-level BigQuery connection for appropriate endpoints.
  • Bug Fixes

    • Stricter validation and clearer errors for BigQuery metadata (projects, datasets, regions).
    • Preserved prior behavior on unsupported project-level metadata requests (v2).
  • Tests

    • Added tests covering cross-project queries and metadata listing scenarios.


The github-actions bot added the bigquery, ibis, and python (Pull requests that update Python code) labels on Dec 17, 2025.
coderabbitai bot (Contributor) commented Dec 17, 2025 (edited)

Walkthrough

This PR splits BigQuery connection info into dataset/project variants, propagates the discriminator union across DTOs and routers, adds multi-project/multi-dataset metadata APIs and v3 metadata endpoints, and adapts connectors, utils, and tests to handle BigQuery-specific filtering, limits, and region/permission validation.

Changes

Cohort / File(s): Summary

BigQuery Connection Model
ibis-server/app/model/__init__.py
Added BigQueryDatasetConnectionInfo and BigQueryProjectConnectionInfo with discriminator bigquery_type, get_billing_project_id() signatures, __hash__ implementations, and BigQueryConnectionUnion. Updated QueryBigQueryDTO.connection_info and the ConnectionInfo union.

Connector Implementation
ibis-server/app/model/connector.py
BigQueryConnector now constructs the BigQuery client with project=connection_info.get_billing_project_id() and sets client.default_query_job_config, including job_timeout_ms.

Data Source Validation
ibis-server/app/model/data_source.py
Replaced the BigQueryConnectionInfo import with the two variants and added conditional validation in _build_connection_info using the bigquery_type discriminator. The get_bigquery_connection signature was updated to accept dataset connection info.

BigQuery Metadata Implementation
ibis-server/app/model/metadata/bigquery.py
Refactored to support multi-project/multi-dataset flows: BigQueryMetadata now accepts dataset connection info and implements get_table_list, get_schema_list, get_constraints, column/table helpers, per-project SQL generation, region validation, public dataset handling, and error propagation via WrenError.

Metadata DTOs
ibis-server/app/model/metadata/dto.py
Added V2MetadataDTO, FilterInfo, BigQueryFilterInfo, ProjectDatasets, and Catalog; extended MetadataDTO with table_limit and filter_info; added a get_filter_info() helper.

Base Metadata API
ibis-server/app/model/metadata/metadata.py
Removed ABC inheritance; added concrete methods raising WrenError for get_table_list, get_schema_list, get_constraints, and get_version. Imports updated for ErrorCode, WrenError, Catalog, FilterInfo.

Utility Timeout Executors
ibis-server/app/util.py
execute_get_table_list_with_timeout now accepts optional filter_info and limit; a new execute_get_schema_list_with_timeout was added. Both detect BigQueryMetadata and call BigQuery-specific methods when applicable.

v2 Router Changes
ibis-server/app/routers/v2/connector.py
Switched endpoints to accept V2MetadataDTO, imported BigQueryProjectConnectionInfo, and added a runtime guard to reject project-level BigQuery connection info (returns WrenError/422).

v3 Router Additions
ibis-server/app/routers/v3/connector.py
Added POST /metadata/tables and /metadata/schemas endpoints, integrating MetadataFactory, get_filter_info(), and timeout executors to fetch tables/schemas with filter and limit support.

Local Run Tool
ibis-server/tools/query_local_run.py
Replaced BigQueryConnectionInfo with BigQueryDatasetConnectionInfo validation and switched to BigQueryConnector for connection creation.

Tests — v2 & v3
ibis-server/tests/routers/v2/..., ibis-server/tests/routers/v3/...
Added a v2 test to assert that project-level BigQuery connection info is rejected; added v3 fixtures and multiple metadata/query tests for multi-project, multi-dataset, public dataset, and validation error cases.

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client
    participant Router as v3 Router
    participant Factory as MetadataFactory
    participant Metadata as BigQueryMetadata
    participant Util as Timeout Executor
    participant GCP as BigQuery API
    Client->>Router: POST /{ds}/metadata/tables (MetadataDTO + filterInfo, limit)
    Router->>Router: build connection_info from headers
    Router->>Factory: MetadataFactory.get_metadata(connection_info)
    Factory->>Metadata: instantiate BigQueryMetadata(connection_info)
    Router->>Router: compute filter_info via get_filter_info(dto, data_source)
    Router->>Util: execute_get_table_list_with_timeout(metadata, filter_info, limit)
    Util->>Util: detect BigQueryMetadata type
    Util->>Metadata: call get_table_list(filter_info, limit)
    Metadata->>GCP: issue per-project/dataset metadata queries (dynamic SQL)
    GCP-->>Metadata: return table/column metadata
    Metadata->>Metadata: transform rows -> Table/Column models
    Metadata-->>Util: return list[Table]
    Util-->>Router: return results (with timeout handling)
    Router-->>Client: respond 200 list[Table]

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

  • Focus review on:
    • ibis-server/app/model/metadata/bigquery.py — dynamic SQL generation, region validation, nested column handling.
    • ibis-server/app/model/__init__.py — discriminator union, hashing, and DTO type uses.
    • Router changes in v2/v3 — ensure error guards and DTO compatibility.
    • ibis-server/app/util.py — timeout wrappers and BigQuery-detection logic.
    • New/updated tests — cross-project scenarios and error expectations.

Possibly related PRs

Suggested reviewers

  • douenergy
  • wwwy3y3

Poem

🐰 I hopped through projects far and wide,

Dataset and project now side by side,
Tables, schemas bloom in multi-project light,
Routers and metadata sing through the night,
A carrot cheer for queries taking flight! 🎋

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Docstring Coverage: ⚠️ Warning. Docstring coverage is 7.41%, which is below the required threshold of 80.00%. You can run @coderabbitai generate docstrings to improve coverage.
✅ Passed checks (2 passed)
Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
Title check: ✅ Passed. The title clearly and concisely summarizes the main feature: adding support for listing tables from multiple datasets or projects in BigQuery, which is the primary change across all modified files.

📜 Recent review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 79db9e4 and 9a1743f.

📒 Files selected for processing (3)
  • ibis-server/app/model/__init__.py (4 hunks)
  • ibis-server/app/model/metadata/bigquery.py (2 hunks)
  • ibis-server/app/routers/v2/connector.py (6 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-02-04T12:08:44.141Z
Learnt from: goldmedal
Repo: Canner/wren-engine, PR: 1053
File: ibis-server/app/model/__init__.py:187-196
Timestamp: 2025-02-04T12:08:44.141Z
Learning: In GcsFileConnectionInfo, both secret-based (key_id/secret_key) and credentials-based (service account) authentication methods are required fields.

Applied to files:

  • ibis-server/app/model/__init__.py
🧬 Code graph analysis (2)
ibis-server/app/model/__init__.py (2)
ibis-server/app/model/error.py (2)
  • ErrorCode (19-38)
  • WrenError (65-113)
ibis-server/tests/routers/v3/connector/bigquery/conftest.py (1)
  • connection_info (25-30)
ibis-server/app/model/metadata/bigquery.py (3)
ibis-server/app/model/__init__.py (5)
  • BigQueryDatasetConnectionInfo (120-131)
  • BigQueryProjectConnectionInfo (134-154)
  • get_billing_project_id (113-117)
  • get_billing_project_id (127-128)
  • get_billing_project_id (144-145)
ibis-server/app/model/error.py (2)
  • ErrorCode (19-38)
  • WrenError (65-113)
ibis-server/app/model/metadata/dto.py (9)
  • BigQueryFilterInfo (24-25)
  • Catalog (114-116)
  • Column (87-93)
  • Constraint (125-131)
  • ConstraintType (119-122)
  • FilterInfo (14-15)
  • ProjectDatasets (28-30)
  • Table (106-111)
  • TableProperties (96-103)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: ci
  • GitHub Check: ci
🔇 Additional comments (19)
ibis-server/app/model/__init__.py (6)

8-9: LGTM: Error handling imports added.

These imports are necessary for the new error handling patterns introduced in the BigQuery connection variants.


113-118: LGTM: Abstract method pattern using WrenError.

Using WrenError with NOT_IMPLEMENTED is an appropriate pattern for enforcing implementation in subclasses while providing clear error messages at runtime.


120-132: LGTM: BigQueryDatasetConnectionInfo is well-defined.

The class correctly implements:

  • Discriminator field for union type
  • get_billing_project_id() returning the project_id
  • __hash__() using all relevant secret fields

134-154: Past issue resolved: __hash__ now references only defined attributes.

The previous review flagged that __hash__ referenced the undefined data_project_ids and dataset_ids. The current implementation correctly uses only region, billing_project_id, and credentials, all of which are defined in this class.


157-160: LGTM: Discriminated union properly configured.

The BigQueryConnectionUnion correctly uses Pydantic's discriminator pattern with bigquery_type and sets a sensible default of "dataset" for backward compatibility.
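For reference, the discriminated-union pattern this comment describes looks roughly like the sketch below; the field sets are trimmed and the exact defaults are assumptions rather than the PR's actual definitions.

from typing import Annotated, Literal, Union

from pydantic import BaseModel, Field, TypeAdapter


class BigQueryDatasetConnectionInfo(BaseModel):
    bigquery_type: Literal["dataset"] = "dataset"
    project_id: str
    dataset_id: str
    credentials: str


class BigQueryProjectConnectionInfo(BaseModel):
    bigquery_type: Literal["project"] = "project"
    billing_project_id: str
    region: str
    credentials: str


# Pydantic selects the concrete model based on the bigquery_type tag.
BigQueryConnectionUnion = Annotated[
    Union[BigQueryDatasetConnectionInfo, BigQueryProjectConnectionInfo],
    Field(discriminator="bigquery_type"),
]

info = TypeAdapter(BigQueryConnectionUnion).validate_python(
    {
        "bigquery_type": "project",
        "billing_project_id": "wrenai",
        "region": "asia-east1",
        "credentials": "...",
    }
)
assert isinstance(info, BigQueryProjectConnectionInfo)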


33-33: LGTM: Connection type unions updated consistently.

Both QueryBigQueryDTO.connection_info and the global ConnectionInfo union have been updated to use the new BigQuery connection variants, maintaining consistency across the codebase.

Also applies to: 620-621

ibis-server/app/routers/v2/connector.py (4)

21-21: LGTM: Required imports added for BigQuery project-level guards.

The imports enable proper type checking and error handling for the new BigQuery connection variants.

Also applies to: 29-34


267-267: LGTM: DTO type updated to V2MetadataDTO.

The parameter type change from MetadataDTO to V2MetadataDTO aligns with the v2/v3 API split for metadata operations.

Also applies to: 296-296, 323-323


278-285: LGTM: Appropriate guard for project-level connections.

The check correctly rejects BigQueryProjectConnectionInfo in the v2 API with a clear error message directing users to the v3 API, while maintaining backward compatibility for dataset-level connections via MetadataFactory.


307-313: Past issue resolved: Error message now correctly references the constraints endpoint.

The error message now says "metadata constraints retrieval," which is appropriate for the get_constraints endpoint. The previous review concern about a copy-paste error from get_table_list has been addressed.

ibis-server/app/model/metadata/bigquery.py (9)

1-1: LGTM: Import updates support multi-project metadata.

The new imports enable proper exception handling (Forbidden, NotFound), connection variant support, error propagation, and BigQuery-specific filtering.

Also applies to: 4-22


43-43: LGTM: Public dataset constant defined.

The BIGQUERY_PUBLIC_DATASET_PROJECT_ID constant enables special handling for public datasets that require different access patterns.


47-49: Signature change restricts to dataset-level connections.

The __init__ signature now requires BigQueryDatasetConnectionInfo instead of the base BigQueryConnectionInfo. This aligns with the PR's goal of distinguishing dataset- vs. project-level metadata handling.


127-187: LGTM: Result processing handles multi-project scenarios correctly.

The logic properly:

  • Unions multiple project queries with optional limit
  • Determines when to include catalog/schema in table names based on project/dataset counts
  • Processes nested columns and builds the table hierarchy
  • Provides clear error messaging when no tables are found

189-197: LGTM: Region validation for public datasets.

The _validate_dataset_region method correctly validates that public dataset regions match the connection region, which is essential for cross-project join compatibility as noted in the PR objectives.


199-271: LGTM: Schema listing supports multiple projects.

The get_schema_list implementation:

  • Handles both dataset and project-level connections appropriately
  • Uses region qualifiers for project-level queries
  • Includes special handling for public datasets via the BigQuery SDK
  • Groups results by catalog with proper error handling

273-338: LGTM: Helper methods for table/column transformation.

The helper methods (get_column, get_table, is_root_column, has_nested_columns, _format_compact_table_name, find_parent_column) provide clean abstractions for processing BigQuery's nested column structures and generating consistent table representations.


340-417: LGTM: Constraints retrieval handles access errors gracefully.

The get_constraints method:

  • Iterates through project datasets with appropriate qualifiers
  • Catches Forbidden/NotFound exceptions and continues (logging warnings)
  • Properly formats constraint names and references based on multi-project/dataset context
  • Returns a comprehensive list of foreign key constraints

419-476: LGTM: Helper methods provide robust validation.

The helper methods properly:

  • _get_project_datasets: Returns appropriate project/dataset structures for both connection types
  • _get_schema_qualifier: Constructs correct qualifiers based on connection type
  • _validate_project_datasets_for_table_list: Performs defensive validation with clear error messages


coderabbitai bot (Contributor) left a comment


Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
ibis-server/app/routers/v2/connector.py (1)

321-330: Missing guard for BigQueryProjectConnectionInfo.

The get_db_version endpoint accepts V2MetadataDTO but lacks the guard that get_table_list and get_constraints have. If BigQueryProjectConnectionInfo is not supported by v2, this endpoint should also reject it for consistency.

 async def get_db_version(
     data_source: DataSource,
     dto: V2MetadataDTO,
     headers: Annotated[Headers, Depends(get_wren_headers)] = None,
 ) -> str:
     connection_info = data_source.get_connection_info(
         dto.connection_info, dict(headers)
     )
+    if isinstance(connection_info, BigQueryProjectConnectionInfo):
+        raise WrenError(
+            ErrorCode.INVALID_CONNECTION_INFO,
+            "BigQuery project-level connection info is only supported by v3 API for metadata version retrieval.",
+        )
     metadata = MetadataFactory.get_metadata(data_source, connection_info)
     return await execute_get_version_with_timeout(metadata)
🧹 Nitpick comments (4)
ibis-server/app/model/metadata/dto.py (1)

33-36: Consider adding validation for empty or malformed filter info.

The function directly passes info to BigQueryFilterInfo(**info) without validation. If info is None or missing required fields, this will raise a Pydantic validation error, which may be acceptable, but consider whether you want to handle edge cases more gracefully.

ibis-server/tests/routers/v3/connector/bigquery/test_query.py (1)

2-4: Minor import cleanup.

The split function from re is used in only two places (lines 757 and 777). Consider using re.split directly inline instead of importing just split.

ibis-server/app/model/__init__.py (1)

113-117: Consider using the abstract base class pattern.

The base class method raises an exception instead of being abstract. While functional, using ABC with @abstractmethod would provide clearer intent and IDE support.

ibis-server/app/model/metadata/bigquery.py (1)

259-270: Potential performance concern with public datasets.

Listing all datasets from bigquery-public-data via list_datasets could be slow, as noted in the PR objectives. Consider adding a warning log or documentation about this behavior.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 969dd82 and 79db9e4.

📒 Files selected for processing (13)
  • ibis-server/app/model/__init__.py (4 hunks)
  • ibis-server/app/model/connector.py (1 hunks)
  • ibis-server/app/model/data_source.py (3 hunks)
  • ibis-server/app/model/metadata/bigquery.py (2 hunks)
  • ibis-server/app/model/metadata/dto.py (2 hunks)
  • ibis-server/app/model/metadata/metadata.py (1 hunks)
  • ibis-server/app/routers/v2/connector.py (6 hunks)
  • ibis-server/app/routers/v3/connector.py (3 hunks)
  • ibis-server/app/util.py (2 hunks)
  • ibis-server/tests/routers/v2/connector/test_bigquery.py (1 hunks)
  • ibis-server/tests/routers/v3/connector/bigquery/conftest.py (1 hunks)
  • ibis-server/tests/routers/v3/connector/bigquery/test_query.py (3 hunks)
  • ibis-server/tools/query_local_run.py (2 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-02-06T07:58:58.830Z
Learnt from: goldmedal
Repo: Canner/wren-engine, PR: 1055
File: ibis-server/app/model/connector.py:152-157
Timestamp: 2025-02-06T07:58:58.830Z
Learning: In BigQueryConnector, credentials with additional scopes (drive and cloud-platform) should only be created in the retry path when handling empty results with special types, not during initialization, to maintain lazy initialization.

Applied to files:

  • ibis-server/app/model/connector.py
📚 Learning: 2025-02-04T12:08:44.141Z
Learnt from: goldmedal
Repo: Canner/wren-engine, PR: 1053
File: ibis-server/app/model/__init__.py:187-196
Timestamp: 2025-02-04T12:08:44.141Z
Learning: In GcsFileConnectionInfo, both secret-based (key_id/secret_key) and credentials-based (service account) authentication methods are required fields.

Applied to files:

  • ibis-server/app/model/__init__.py
🧬 Code graph analysis (9)
ibis-server/tests/routers/v2/connector/test_bigquery.py (1)
ibis-server/tests/conftest.py (1)
  • client (18-23)
ibis-server/app/util.py (3)
ibis-server/app/model/metadata/bigquery.py (3)
  • BigQueryMetadata (46-522)
  • get_table_list (51-187)
  • get_schema_list (199-271)
ibis-server/app/model/metadata/dto.py (1)
  • FilterInfo (14-15)
ibis-server/app/model/metadata/metadata.py (3)
  • get_table_list (10-13)
  • Metadata (6-24)
  • get_schema_list (21-24)
ibis-server/app/model/data_source.py (1)
ibis-server/app/model/__init__.py (2)
  • BigQueryDatasetConnectionInfo (120-131)
  • BigQueryProjectConnectionInfo (134-156)
ibis-server/app/routers/v3/connector.py (4)
ibis-server/app/model/metadata/dto.py (3)
  • Catalog (114-116)
  • MetadataDTO (18-21)
  • Table (106-111)
ibis-server/app/model/metadata/factory.py (2)
  • MetadataFactory (43-62)
  • get_metadata (45-62)
ibis-server/app/util.py (2)
  • execute_get_schema_list_with_timeout (377-398)
  • execute_get_table_list_with_timeout (353-374)
ibis-server/app/model/metadata/metadata.py (2)
  • get_table_list (10-13)
  • get_schema_list (21-24)
ibis-server/app/model/connector.py (2)
ibis-server/tests/routers/v3/connector/bigquery/conftest.py (1)
  • connection_info (25-30)
ibis-server/app/model/__init__.py (3)
  • get_billing_project_id (113-117)
  • get_billing_project_id (127-128)
  • get_billing_project_id (144-145)
ibis-server/tools/query_local_run.py (2)
ibis-server/app/model/__init__.py (1)
  • BigQueryDatasetConnectionInfo (120-131)
ibis-server/app/model/connector.py (1)
  • BigQueryConnector (506-549)
ibis-server/app/model/__init__.py (1)
ibis-server/app/model/error.py (2)
  • ErrorCode (19-38)
  • WrenError (65-113)
ibis-server/app/model/metadata/dto.py (1)
ibis-server/app/model/data_source.py (1)
  • DataSource (63-224)
ibis-server/app/model/metadata/metadata.py (2)
ibis-server/app/model/error.py (2)
  • ErrorCode (19-38)
  • WrenError (65-113)
ibis-server/app/model/metadata/dto.py (4)
  • Catalog (114-116)
  • Constraint (125-131)
  • FilterInfo (14-15)
  • Table (106-111)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: ci
  • GitHub Check: ci
🔇 Additional comments (30)
ibis-server/tests/routers/v2/connector/test_bigquery.py (1)

395-411: LGTM! Well-structured test for v2 project connection guard.

The test correctly validates that v2 API rejects BigQuery project-level connection info and returns the expected error message. The connection info structure properly matches BigQueryProjectConnectionInfo with all required fields.

ibis-server/app/model/connector.py (1)

524-530: LGTM! Project-scoped BigQuery client initialization.

The explicit project parameter from connection_info.get_billing_project_id() and the default QueryJobConfig with a timeout are correctly configured. This aligns with the new BigQuery connection info variants that distinguish dataset vs. project contexts.
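A sketch of such a project-scoped client setup, assuming a recent google-cloud-bigquery release where QueryJobConfig exposes job_timeout_ms; the timeout value and credentials path are placeholders, not the PR's actual values.

from google.cloud import bigquery
from google.oauth2 import service_account


def make_bigquery_client(billing_project_id: str, credentials_path: str) -> bigquery.Client:
    credentials = service_account.Credentials.from_service_account_file(credentials_path)
    client = bigquery.Client(project=billing_project_id, credentials=credentials)
    # Default job config applied to every query issued through this client.
    job_config = bigquery.QueryJobConfig()
    job_config.job_timeout_ms = 60_000  # placeholder: 60-second job timeout
    client.default_query_job_config = job_config
    return client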

ibis-server/tests/routers/v3/connector/bigquery/conftest.py (1)

33-40: LGTM! Project-level connection fixture properly configured.

The new fixture provides project-level BigQuery connection info with appropriate defaults and environment variable sourcing, complementing the existing dataset-level fixture.

ibis-server/tools/query_local_run.py (2)

18-19: Updated imports for BigQuery dataset connection handling.

The imports correctly reflect the new BigQuery connection type system with BigQueryDatasetConnectionInfo and BigQueryConnector.


94-95: LGTM! BigQuery connection creation updated to use new types.

The code correctly uses BigQueryDatasetConnectionInfo.model_validate_json and BigQueryConnector instead of the previous generic types, aligning with the new dataset/project connection distinction.

ibis-server/app/model/data_source.py (3)

19-20: Updated imports for BigQuery connection variants.

The imports correctly bring in the new BigQueryDatasetConnectionInfo and BigQueryProjectConnectionInfo types that enable dataset vs project-level connection handling.


156-158: LGTM! Conditional validation for BigQuery connection types.

The logic correctly distinguishes between project-level and dataset-level connections based on the bigquery_type field, routing to the appropriate connection info class.


318-318: Function signature updated for dataset-specific connections.

The get_bigquery_connection signature correctly specifies BigQueryDatasetConnectionInfo, as this function is specifically for dataset-level Ibis backend connections (project-level connections are handled separately via BigQueryConnector).

ibis-server/app/routers/v3/connector.py (3)

35-36: New imports for metadata operations.

The imports correctly bring in the necessary DTOs (Catalog, MetadataDTO, Table), helper functions (get_filter_info, MetadataFactory), and timeout executors for the new metadata endpoints.

Also applies to: 45-46


556-579: LGTM! Well-structured table metadata endpoint.

The endpoint follows the established pattern:

  • Proper tracing with span
  • Builds connection info with headers
  • Uses MetadataFactory for metadata retrieval
  • Parses filter info from DTO
  • Executes with timeout control

The implementation is clean and consistent with other v3 endpoints.


582-605: LGTM! Schema metadata endpoint mirrors the table endpoint pattern.

The schema list endpoint follows the same clean pattern as the table endpoint, ensuring consistency in the v3 metadata API design.

ibis-server/app/util.py (3)

45-46: New imports for BigQuery-aware metadata handling.

The imports bring in BigQueryMetadata and FilterInfo to enable conditional execution paths for BigQuery-specific metadata operations.


353-374: LGTM! BigQuery-aware table list execution with backward compatibility.

The function correctly:

  • Accepts optional filter_info and limit parameters
  • Conditionally invokes BigQuery-specific get_table_list when metadata is BigQueryMetadata
  • Falls back to generic path for other metadata types
  • Maintains backward compatibility for non-BigQuery data sources

377-398: LGTM! Schema list execution mirrors the table list pattern.

The new function follows the same clean pattern as execute_get_table_list_with_timeout, ensuring consistency in BigQuery-aware metadata retrieval.
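The BigQuery-aware wrapper could be sketched roughly as below. The timeout value and thread offloading are assumptions about the implementation; BigQueryMetadata, FilterInfo, and Metadata are the project-internal types named in these comments, with import paths taken from the file list above.

import asyncio

from app.model.metadata.bigquery import BigQueryMetadata  # project-internal import
from app.model.metadata.dto import FilterInfo
from app.model.metadata.metadata import Metadata

METADATA_TIMEOUT_SECONDS = 90  # placeholder timeout


async def execute_get_table_list_with_timeout(
    metadata: Metadata, filter_info: FilterInfo | None = None, limit: int | None = None
):
    def _run():
        # Only BigQueryMetadata understands filter_info/limit; other
        # implementations keep their original no-argument call.
        if isinstance(metadata, BigQueryMetadata):
            return metadata.get_table_list(filter_info, limit)
        return metadata.get_table_list()

    return await asyncio.wait_for(asyncio.to_thread(_run), timeout=METADATA_TIMEOUT_SECONDS)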

ibis-server/app/model/metadata/metadata.py (4)

2-3: Updated imports for error handling and new DTOs.

The imports correctly bring in ErrorCode, WrenError for proper error raising, and new DTOs (Catalog, FilterInfo) for the expanded metadata interface.


6-6: Class changed from ABC to concrete with NotImplemented methods.

The removal of ABC inheritance is appropriate since all methods now have concrete implementations that raise WrenError(ErrorCode.NOT_IMPLEMENTED). This provides better error messages than abstract method errors.


10-13: New get_table_list method with filter and limit support.

The method signature correctly accepts optional filter_info and limit parameters, enabling BigQuery's multi-project/dataset filtering while maintaining a clean base interface for subclasses to override.


21-24: New get_schema_list method completes the metadata interface.

The schema list method follows the same pattern as get_table_list, providing consistent filtering capabilities across metadata operations.
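Putting these comments together, the concrete base class could look roughly like the sketch below; the WrenError constructor arguments and messages are assumptions, and the imports use the project-internal paths listed above.

from app.model.error import ErrorCode, WrenError  # project-internal imports
from app.model.metadata.dto import Catalog, FilterInfo, Table


class Metadata:
    def __init__(self, connection_info):
        self.connection_info = connection_info

    def get_table_list(
        self, filter_info: FilterInfo | None = None, limit: int | None = None
    ) -> list[Table]:
        # Concrete method instead of @abstractmethod: subclasses override it,
        # and callers get a clear runtime error otherwise.
        raise WrenError(ErrorCode.NOT_IMPLEMENTED, "get_table_list is not implemented")

    def get_schema_list(
        self, filter_info: FilterInfo | None = None, limit: int | None = None
    ) -> list[Catalog]:
        raise WrenError(ErrorCode.NOT_IMPLEMENTED, "get_schema_list is not implemented")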

ibis-server/app/routers/v2/connector.py (1)

20-34: LGTM!

The imports are well-organized and properly structured for the new BigQuery connection type handling and error management.

ibis-server/app/model/metadata/dto.py (3)

10-11: LGTM!

Clean DTO definition for v2 metadata endpoints that correctly uses the field alias.


24-30: LGTM!

The BigQueryFilterInfo and ProjectDatasets models are well-structured for the multi-project/dataset filtering use case.


114-116: LGTM!

Simple and clean model for representing catalog/schema metadata.

ibis-server/tests/routers/v3/connector/bigquery/test_query.py (4)

168-186: LGTM!

Good addition to verify multi-dataset connection info works with the existing manifest.


189-254: LGTM!

Excellent test for cross-project dataset queries, including reference to a public dataset. The test validates both the response structure and data integrity.


694-735: LGTM!

Good coverage for schema listing with assertions for expected schemas and handling of public datasets.


862-877: Duplicate test case.

This block (lines 862-877) is identical to lines 845-860, testing the same "empty projectId" scenario twice. Remove the duplicate.

     assert response.status_code == 422
     assert "project id should not be empty" in response.text

-    response = await client.post(
-        url=f"{base_url}/metadata/tables",
-        json={
-            "connectionInfo": project_connection_info,
-            "filterInfo": {
-                "projects": [
-                    {
-                        "projectId": "",
-                    }
-                ],
-            },
-        },
-    )
-
-    assert response.status_code == 422
-    assert "project id should not be empty" in response.text
-
     response = await client.post(

Likely an incorrect or invalid review comment.

ibis-server/app/model/__init__.py (2)

120-131: LGTM!

Well-structured dataset connection info with proper field definitions and hash implementation.


159-162: LGTM!

Good use of Pydantic's discriminated union pattern for type-safe connection info handling.

ibis-server/app/model/metadata/bigquery.py (2)

189-197: LGTM!

Good validation to ensure dataset region matches the connection region for cross-project queries.


468-493: LGTM!

Comprehensive validation for project and dataset IDs with clear error messages.

@goldmedal (Contributor, Author)

BigQuery has been tested locally:

poetry run pytest -m 'bigquery'
=========================== test session starts ===========================
platform darwin -- Python 3.11.11, pytest-8.4.2, pluggy-1.6.0
rootdir: /Users/jax/git/wren-engine/ibis-server
configfile: pyproject.toml
plugins: anyio-4.10.0
collected 409 items / 356 deselected / 53 selected

tests/routers/v2/connector/test_bigquery.py ...................                [ 35%]
tests/routers/v3/connector/bigquery/test_functions.py ............             [ 58%]
tests/routers/v3/connector/bigquery/test_query.py ......................


