fix!: use nullable `Int64` and `boolean` dtypes in `to_dataframe` #786
…frame` To override this behavior, specify the types for the desired columns with the `dtype` argument.
tswast commented Jul 20, 2021
I'll take a closer look at #776 before finishing this one, as it might mean fewer code paths to cover. I think the BQ Storage API will always be used for …
tswast commented Jul 21, 2021
I did a little bit of experimentation to see what the intermediate … It appears https://issuetracker.google.com/144712110 was fixed for FLOAT columns in #314 as of google-cloud-bigquery >= 2.2.0. (That was technically a breaking change [oops].) I might still keep this open so that we can have some explicit tests for different data types. Also, we're relying on PyArrow -> Pandas to pick the right data types, so maybe there are some dtype defaults we can still help with.
…python-bigquery into b144712110-nullable-pandas-types
tswast commented Aug 11, 2021
Re: system test failure: Didn't we increase the default deadline to 10 minutes? Maybe the v3 branch needs a sync?
plamut left a comment
Two nits, but not essential, looks good.
|   pip install --upgrade pandas
| - Alternatively, you can install the BigQuery python client library with
| + Alternatively, you can install the BigQuery Python client library with
(nit) Since we're already at it, there's at least one other occurrence of "python" not capitalized (line 69), which can also be fixed.
| loss-of-precision.
| Returns:
| Dict[str, str]: mapping from column names to dtypes
(nit) Can be expressed as the annotation of the function return type.
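A minimal sketch of what this suggestion could look like: the return type moves from the docstring's `Returns:` section into the function signature. The function name, its parameter, and the type mapping below are hypothetical and only illustrate the annotation style; they are not the code under review.

```python
from typing import Dict


def nullable_dtypes_for(column_types: Dict[str, str]) -> Dict[str, str]:
    """Map column names to pandas dtypes that avoid loss-of-precision.

    Hypothetical helper: ``column_types`` maps column names to BigQuery
    type names. Columns with no nullable equivalent are left out.
    """
    # Assumed mapping for illustration only.
    bq_to_pandas = {
        "INTEGER": "Int64",
        "INT64": "Int64",
        "BOOLEAN": "boolean",
        "BOOL": "boolean",
    }
    return {
        name: bq_to_pandas[bq_type]
        for name, bq_type in column_types.items()
        if bq_type in bq_to_pandas
    }
```

With the annotation in place, a type checker can verify callers, and the docstring no longer needs to repeat the type.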
…' into b144712110-nullable-pandas-types
| ("max_results",), ((None,), (10,),)  # Use BQ Storage API. # Use REST API.
| )
| def test_list_rows_nullable_scalars_dtypes(bigquery_client, scalars_table, max_results):
|     df = bigquery_client.list_rows(
Note to self: I'll need to exclude the INTERVAL column next time we sync with master
deps!: BigQuery Storage and pyarrow are required dependencies (#776)
fix!: use nullable `Int64` and `boolean` dtypes in `to_dataframe` (#786)
feat!: destination tables are no-longer removed by `create_job` (#891)
feat!: In `to_dataframe`, use `dbdate` and `dbtime` dtypes from db-dtypes package for BigQuery DATE and TIME columns (#972)
fix!: automatically convert out-of-bounds dates in `to_dataframe`, remove `date_as_object` argument (#972)
feat!: mark the package as type-checked (#1058)
feat!: default to DATETIME type when loading timezone-naive datetimes from Pandas (#1061)
feat: add `api_method` parameter to `Client.query` to select `INSERT` or `QUERY` API (#967)
fix: improve type annotations for mypy validation (#1081)
feat: use `StandardSqlField` class for `Model.feature_columns` and `Model.label_columns` (#1117)
docs: Add migration guide from version 2.x to 3.x (#1027)
Release-As: 3.0.0
To override this behavior, specify the types for the desired columns with the `dtype` argument.

BREAKING CHANGE: uses Int64 type by default to avoid loss-of-precision in results with large integer values
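To make the loss-of-precision concern concrete, here is a small pandas-only sketch (no BigQuery calls). It assumes only that `pandas` is installed; the sample value is chosen for illustration. When a column contains NULLs, a NumPy-backed integer column falls back to `float64`, which cannot represent every integer above 2**53, whereas the nullable `Int64` dtype keeps the exact value.

```python
import pandas as pd

# Smallest positive integer that float64 cannot represent exactly.
big = 2**53 + 1

# A NULL forces the NumPy-backed representation to float64,
# so the large integer is silently rounded.
as_float = pd.Series([big, None], dtype="float64")

# The nullable extension dtypes keep exact values and mark NULLs as <NA>.
as_int64 = pd.Series([big, None], dtype="Int64")
as_boolean = pd.Series([True, None], dtype="boolean")

float_is_exact = int(as_float[0]) == big
int64_is_exact = int(as_int64[0]) == big

print("float64 keeps exact value:", float_is_exact)
print("Int64 keeps exact value:", int64_is_exact)
print(as_boolean)
```

The same idea applies when overriding the default for specific columns: the requested dtype decides whether NULLs and large integers survive the round trip unchanged.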
Fixes https://issuetracker.google.com/144712110 🦕
Fixes #793