Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

feat: addjob_id,location,project, andquery_id properties onRowIterator#1733

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged
Linchin merged 3 commits intomainfromb305260214-rowiterator-job-metadata
Nov 18, 2023

Conversation

@tswast
Copy link
Contributor

@tswasttswast commentedNov 17, 2023
edited
Loading

These can be used to recover the original job metadata whenRowIterator is the result of aQueryJob. This will be especially important after#1722 wherequery_and_wait() won't return an actualQueryJob object.

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

  • Make sure to open an issue as abug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

Towards internal issue 305260214 and related to#589
🦕

…on `RowIterator`These can be used to recover the original job metadata when `RowIterator` isthe result of a `QueryJob`.
@tswasttswast requested review froma team ascode ownersNovember 17, 2023 20:59
@product-auto-labelproduct-auto-labelbot added size: mPull request size is medium. api: bigqueryIssues related to the googleapis/python-bigquery API. labelsNov 17, 2023
@tswasttswast requested review fromLinchin andchalmerlowe and removed request forfarhan0102November 17, 2023 21:00
bqstorage_download=functools.partial(
_pandas_helpers.download_arrow_bqstorage,
self._project,
self._bqstorage_project,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Looks like we now have two possible project values, does this mean aRowIterator can correspond to different projects for storage APIs compared to other APIs?

@property
def_bqstorage_project(self)->Optional[str]:
"""GCP Project ID where BQ Storage API will bill to (if applicable)."""
client=self.client
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I wonder what's the difference betweenself.client.project andself._project. When the client isn't of storage type, doesself.client.project still reflect the storage project id?

Copy link
ContributorAuthor

@tswasttswastNov 17, 2023
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

The difference in other contexts is usually described as "billing project" versus "data project". I've renamed this private property to reflect that. This is easiest to understand in the context of public datasets.

I have permission to query tables in thebigquery-public-data project and even download the data in bulk with the BigQuery Storage Read API, but Idon't have permission to run queries in that project or start a BigQuery Storage Read API session from that project. Instead, I have start the query or bq storage read session in my own project and reference the tables in the other project.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Wow I see! soself._project is the project for the data/table, andself.client.project a.k.a. billing project is the user's own project. So I guess this billing project doesn't just apply to storage?

tswast reacted with thumbs up emoji
Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

That's true, though for queries we're usingjobs.getQueryResults, so the project will always be the billing project since that's where temp results for queries are stored.

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

tabledata.list is free, but technically we should be overriding thex-goog-user-project header so that the quota is charged to the billing project. (I don't think we're actually doing that, though)

Co-authored-by: Lingqing Gan <lingqing.gan@gmail.com>
@LinchinLinchin merged commit494f275 intomainNov 18, 2023
@LinchinLinchin deleted the b305260214-rowiterator-job-metadata branchNovember 18, 2023 00:44

deftest_job_id_missing(self):
rows=self._make_one()
self.assertIsNone(rows.job_id)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

@tswast

Just curious about testing philosophy.
For this collection of tests that are pretty much the same (i.e. all the tests that .assertIsNone()) is there a benefit in your mind to having each test broken out separately versus providing a parameterized list of inputs and expected outputs?

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I matched the existing pattern in this test. I do think we could probably combine a lot of these into a single parametrized test.

I worry slightly about the requiredgetattr(property_name) being too complex for new contributors to understand. In general tests should avoid anything even as complex as an if statement.https://testing.googleblog.com/2014/07/testing-on-toilet-dont-put-logic-in.html In this case,getattr is a grey area. I wouldn't necessarily count it as "logic"

kiraksi pushed a commit to kiraksi/python-bigquery that referenced this pull requestNov 20, 2023
…on `RowIterator` (googleapis#1733)* feat: add `job_id`, `location`, `project`, and `query_id` properties on `RowIterator`These can be used to recover the original job metadata when `RowIterator` isthe result of a `QueryJob`.* rename bqstorage_project to billing project* Update google/cloud/bigquery/table.pyCo-authored-by: Lingqing Gan <lingqing.gan@gmail.com>---------Co-authored-by: Lingqing Gan <lingqing.gan@gmail.com>
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment

Reviewers

@chalmerlowechalmerlowechalmerlowe left review comments

@LinchinLinchinLinchin approved these changes

Assignees

No one assigned

Labels

api: bigqueryIssues related to the googleapis/python-bigquery API.size: mPull request size is medium.

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

3 participants

@tswast@chalmerlowe@Linchin

[8]ページ先頭

©2009-2025 Movatter.jp