Description
Environment details
- OS type and version: macOS Catalina (10.15.5)
- Python version: Python 3.7.6 (`python --version`)
- pip version: pip 20.0.2 (`pip --version`)
- google-cloud-bigquery version: 1.24.0 (`pip show google-cloud-bigquery`):

  ```
  Name: google-cloud-bigquery
  Version: 1.24.0
  Summary: Google BigQuery API client library
  Home-page: https://github.com/GoogleCloudPlatform/google-cloud-python
  Author: Google LLC
  Author-email: googleapis-packages@google.com
  License: Apache 2.0
  Location: /Users/swast/miniconda3/envs/ibis-dev/lib/python3.7/site-packages
  Requires: google-cloud-core, google-api-core, google-resumable-media, google-auth, protobuf, six
  ```

Code example
Code:

```python
from google.cloud import bigquery

client = bigquery.Client()
df = client.query(
    "SELECT TIMESTAMP '4567-01-01 00:00:00' AS `tmp`"
).to_dataframe()
```
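For context (an illustrative aside, not part of the original report): the out-of-bounds value in the stack trace is simply this query's timestamp expressed in microseconds since the Unix epoch, and multiplying it by 1000 to get nanoseconds overflows the signed 64-bit range that pandas `datetime64[ns]` uses. A stdlib-only sketch of the arithmetic:

```python
from datetime import datetime, timezone

# Microseconds between the Unix epoch and the timestamp in the query,
# computed with exact integer arithmetic.
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
delta = datetime(4567, 1, 1, tzinfo=timezone.utc) - epoch
micros = delta.days * 86_400_000_000 + delta.seconds * 1_000_000

# This is exactly the value reported in the stack trace.
assert micros == 81_953_424_000_000_000

# Converting to nanoseconds exceeds the signed 64-bit range that
# datetime64[ns] uses (its maximum is in the year 2262).
assert micros * 1_000 > 2**63 - 1
```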
Stack trace
```
---------------------------------------------------------------------------
ArrowInvalid                              Traceback (most recent call last)
<ipython-input-3-6b8b40790c39> in <module>
----> 1 df = client.query("SELECT TIMESTAMP '4567-01-01 00:00:00' AS `tmp`").to_dataframe()

~/miniconda3/envs/ibis-dev/lib/python3.7/site-packages/google/cloud/bigquery/job.py in to_dataframe(self, bqstorage_client, dtypes, progress_bar_type, create_bqstorage_client)
   3372             dtypes=dtypes,
   3373             progress_bar_type=progress_bar_type,
-> 3374             create_bqstorage_client=create_bqstorage_client,
   3375         )
   3376 

~/miniconda3/envs/ibis-dev/lib/python3.7/site-packages/google/cloud/bigquery/table.py in to_dataframe(self, bqstorage_client, dtypes, progress_bar_type, create_bqstorage_client)
   1729             create_bqstorage_client=create_bqstorage_client,
   1730         )
-> 1731         df = record_batch.to_pandas()
   1732         for column in dtypes:
   1733             df[column] = pandas.Series(df[column], dtype=dtypes[column])

~/miniconda3/envs/ibis-dev/lib/python3.7/site-packages/pyarrow/array.pxi in pyarrow.lib._PandasConvertible.to_pandas()

~/miniconda3/envs/ibis-dev/lib/python3.7/site-packages/pyarrow/table.pxi in pyarrow.lib.Table._to_pandas()

~/miniconda3/envs/ibis-dev/lib/python3.7/site-packages/pyarrow/pandas_compat.py in table_to_blockmanager(options, table, categories, ignore_metadata, types_mapper)
    764     _check_data_column_metadata_consistency(all_columns)
    765     columns = _deserialize_column_index(table, all_columns, column_indexes)
--> 766     blocks = _table_to_blocks(options, table, categories, ext_columns_dtypes)
    767 
    768     axes = [columns, index]

~/miniconda3/envs/ibis-dev/lib/python3.7/site-packages/pyarrow/pandas_compat.py in _table_to_blocks(options, block_table, categories, extension_columns)
   1100     columns = block_table.column_names
   1101     result = pa.lib.table_to_blocks(options, block_table, categories,
-> 1102                                     list(extension_columns.keys()))
   1103     return [_reconstruct_block(item, columns, extension_columns)
   1104             for item in result]

~/miniconda3/envs/ibis-dev/lib/python3.7/site-packages/pyarrow/table.pxi in pyarrow.lib.table_to_blocks()

~/miniconda3/envs/ibis-dev/lib/python3.7/site-packages/pyarrow/error.pxi in pyarrow.lib.check_status()

ArrowInvalid: Casting from timestamp[us, tz=UTC] to timestamp[ns] would result in out of bounds timestamp: 81953424000000000
```

Potential solutions
In order of my preference:
- Catch this exception from arrow and pass in the option to arrow to use datetime objects only in that case (no option in google-cloud-bigquery). See: https://issues.apache.org/jira/browse/ARROW-5359
- Add an option to use Fletcher to make a dataframe backed by the Arrow table: https://github.com/xhochy/fletcher
- Add an option to use datetime objects for timestamp/datetime columns.