googleapis/google-cloud-python (Public)


BigQuery: Add list rows and --max_results option to %%bigquery magic #9147


Closed

shubha-rajan wants to merge 13 commits into googleapis:master from shubha-rajan:bq-add-list-rows-to-magic


shubha-rajan (Contributor) commented Aug 29, 2019 (edited)

See #9105. Creating draft for visibility.

Adds an option to pass a table ID instead of a SQL query to the `%%bigquery` cell magic as a cost-saving alternative to `SELECT *` queries. The `--max_results` option limits the number of rows read. The returned `pandas.DataFrame` can be saved to a variable by passing a `destination_var` argument.
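A minimal sketch of how a cell body might be classified as a table ID or a SQL query. It assumes, based only on the "regex whitespace check" mentioned in this PR's commit messages, that a cell with no internal whitespace is treated as a table ID. The helper name `looks_like_table_id` is hypothetical and is not part of the library.

```python
import re


def looks_like_table_id(cell_text: str) -> bool:
    """Return True if the cell body looks like a table ID rather than SQL.

    Assumption for illustration: a table ID is a single token with no
    internal whitespace, while a SQL query always has whitespace between
    its tokens.
    """
    # Notebook cells usually end with a newline, so strip before checking.
    stripped = cell_text.strip()
    return bool(stripped) and re.search(r"\s", stripped) is None
```

Stripping first is what keeps the trailing newline a notebook adds from making a bare table ID look like a query.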

TODO:

- Fix coverage failures
- Test that running cell magic with table ID instead of query works with `bqstorage_client` set
- Add tests for failure cases - handles failure cases when table IDs are passed instead of queries:
- ~~max_results is currently not working with regular SQL queries~~ - fixed! See screenshot below

To get this working, I ended up adding `max_results` as a property of `QueryJobConfig`, but if that wasn't the right call, I can refactor to pass `max_results` as a separate parameter.

shubha-rajan added 3 commits

August 27, 2019 23:21

added max_results flag and property to QueryJobConfig

9ca87b2

tests for setting max_results and using table_id instead of query passing

718bb2e

adjusted regex whitespace check to account for trailing newline added by notebook cell

5eb56ce

googlebot added the "cla: yes" label (This human has signed the Contributor License Agreement.)

Aug 29, 2019

shubha-rajan added 4 commits

August 29, 2019 23:44

preserve value of max_results after QueryJob._set_properties called

904c31b

added tests for using --max_results with destination_var and bq_storage_api

53d0b0b

blacken and lint

3706a0a

removed unused max_results parameter from client._get_query_results

0e9e032

plamut reviewed

Aug 30, 2019

bigquery/google/cloud/bigquery/job.py (Outdated, resolved)

shubha-rajan added 2 commits

August 30, 2019 13:43

added error messaging and tests for failure case

4756ec3

reformatted docstrings

304c761

plamut reviewed

Aug 30, 2019


bigquery/google/cloud/bigquery/job.py Outdated

		:type api_response: dict
		:param api_response: response returned from an API call
		Args:
		api_response (dict): response returned from an API call.

plamut (Contributor) commented Aug 30, 2019 (edited)

By the way, the types are specified with names from the `typing` module, thus a `dict` should be named `Dict`, for example. Or `Dict[key_type, value_type]` if you also want to specify the dict content's type(s). Probably best to check some of the existing "modern" docstrings in the codebase to get a feel.

shubha-rajan (Contributor, Author) commented Aug 30, 2019

Got it. The `dict` in question would be a nested API response, so it would be okay to just name it `Dict` without specifying the content, right?

plamut (Contributor) commented Sep 1, 2019

I suppose so, yes. If all keys are strings, `Dict[str, Any]` could be used, but just `Dict` with a meaningful description is fine, too.
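To illustrate the docstring convention discussed in this thread, here is a small sketch of an `Args:` section that uses `typing` names. The function `extract_job_id` and the response layout it reads are hypothetical, chosen only to make the example runnable; they are not taken from the file under review.

```python
from typing import Any, Dict


def extract_job_id(api_response: Dict[str, Any]) -> str:
    """Read the job ID out of a nested API response.

    Args:
        api_response (Dict[str, Any]):
            Response returned from an API call, keyed by field name.

    Returns:
        str: The job ID, or an empty string if it is missing.
    """
    # Hypothetical response layout, used only for illustration.
    return api_response.get("jobReference", {}).get("jobId", "")
```

Because the values are nested and heterogeneous, `Any` is used for the value type; a bare `Dict` with a clear description would read just as well, as noted above.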

shubha-rajan added 2 commits

August 30, 2019 14:18

Fix docstring formatting

8d0781d

blacken/lint

b93b01d

shubha-rajan (Contributor, Author) commented Aug 30, 2019

The failing snippets tests also fail locally on master, so they're probably unrelated to the changes in this PR.

shubha-rajan marked this pull request as ready for review

August 30, 2019 21:55

shubha-rajan requested a review from a team

August 30, 2019 21:55

shubha-rajan added 2 commits

August 30, 2019 19:43

update error messaging test

32a1ebc

refactored error message display into its own method. fixed coverage failure

aceee81

plamut (Contributor) commented Aug 31, 2019

@shubha-rajan Indeed, that started occurring a day or two ago. The backend team has been informed about it, we are awaiting the ETA for the fix. If it's too long, we can temporarily disable the failing test as a workaround.

tswast (Contributor) commented Sep 3, 2019

Since these are two different features, let's have them as two (possibly 3) separate PRs, starting with the `--max_results` feature. That way they are more clearly identified as new features in the CHANGELOG when we release these features.

I'd prefer we find a different implementation for `max_results`. Note: `list_rows` accepts a `max_results` argument, and `QueryJob.result` calls the `list_rows` method. I think it would be appropriate to add a `max_results` argument to `QueryJob.result`.
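The shape of that suggestion could be sketched as follows. `FakeClient` and `FakeQueryJob` are hypothetical stand-ins written for this example, not the library's classes; they only show the limit being forwarded from `result` to `list_rows` rather than stored on the job configuration.

```python
from typing import Iterator, List, Optional


class FakeClient:
    """Stand-in for a client whose list_rows accepts max_results."""

    def __init__(self, rows: List[dict]) -> None:
        self._rows = rows

    def list_rows(
        self, table: str, max_results: Optional[int] = None
    ) -> Iterator[dict]:
        # None means "no limit": slicing with None returns every row.
        return iter(self._rows[:max_results])


class FakeQueryJob:
    """Stand-in for a query job that reads its results via list_rows."""

    def __init__(self, client: FakeClient, destination: str) -> None:
        self._client = client
        self._destination = destination

    def result(self, max_results: Optional[int] = None) -> Iterator[dict]:
        # Forward the limit to list_rows instead of keeping it in job config.
        return self._client.list_rows(
            self._destination, max_results=max_results
        )


job = FakeQueryJob(FakeClient([{"n": 1}, {"n": 2}, {"n": 3}]), "dataset.table")
first_two = list(job.result(max_results=2))
```

Passing the limit as a call argument keeps it scoped to a single read of the results, so the same job can be read again with a different limit.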

Let's have 3 PRs in this order: