
deps: require pyarrow for pandas support#314


Merged

gcf-merge-on-green merged 4 commits into googleapis:master from cguardia:265-drop-fastparquet

Oct 12, 2020


Conversation

cguardia (Contributor) commented Oct 9, 2020

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

- Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
- Ensure the tests and linter pass
- Code coverage does not decrease (if any source code was changed)
- Appropriate docs were updated (if necessary)

Fixes #265 🦕

build: drop fastparquet from extras dependencies

487c19d

refs googleapis#265

cguardia requested a review from a team

October 9, 2020 04:48

google-clabot added the cla: yes label (This human has signed the Contributor License Agreement.)

Oct 9, 2020

tswast requested changes

Oct 9, 2020


tests/unit/test_client.py Outdated


		@unittest.skipIf(pandas is None, "Requires `pandas`")
		@unittest.skipIf(fastparquet is None, "Requires `fastparquet`")
		def test_load_table_from_dataframe_no_pyarrow_warning(self):

tswast (Contributor), Oct 9, 2020

I'm a bit surprised to see this test passing. I guess we still have some code that falls back to the default pandas parquet rendering?

Can you look into if we can remove that code path?

Related: We should be able to simplify this docstring now:

python-bigquery/google/cloud/bigquery/client.py

Lines 2134 to 2147 in cbcb4b8

	parquet_compression (Optional[str]):
	[Beta] The compression method to use if intermittently
	serializing ``dataframe`` to a parquet file.

	If ``pyarrow`` and job config schema are used, the argument
	is directly passed as the ``compression`` argument to the
	underlying ``pyarrow.parquet.write_table()`` method (the
	default value "snappy" gets converted to uppercase).
	https://arrow.apache.org/docs/python/generated/pyarrow.parquet.write_table.html#pyarrow-parquet-write-table

	If either ``pyarrow`` or job config schema are missing, the
	argument is directly passed as the ``compression`` argument
	to the underlying ``DataFrame.to_parquet()`` method.
	https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_parquet.html#pandas.DataFrame.to_parquet
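
For illustration only, here is a minimal sketch of what "removing the fallback code path" could look like: fail fast when `pyarrow` is missing instead of silently using `DataFrame.to_parquet()`. The helper names (`optional_import`, `require_pyarrow`, `parquet_write_kwargs`) are hypothetical and are not taken from `client.py`; the uppercase conversion of the default `"snappy"` follows the docstring quoted above.

```python
import importlib


def optional_import(name):
    """Return the named module, or None when it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


def require_pyarrow(pyarrow_module):
    """Raise instead of falling back to another parquet writer."""
    if pyarrow_module is None:
        raise ValueError("This method requires pyarrow to be installed")
    return pyarrow_module


def parquet_write_kwargs(pyarrow_module, parquet_compression="snappy"):
    """Build keyword arguments for a pyarrow-based parquet write.

    Hypothetical helper: the module is passed in explicitly so the
    guard can be exercised without pyarrow being installed.
    """
    require_pyarrow(pyarrow_module)
    # Per the quoted docstring, the default "snappy" is upper-cased
    # before being handed to pyarrow; other values pass through as-is.
    if parquet_compression == "snappy":
        parquet_compression = parquet_compression.upper()
    return {"compression": parquet_compression}


# Typical module-level use: resolve the optional dependency once.
pyarrow = optional_import("pyarrow")
```

With a single guarded path like this, a test that runs without `pyarrow` would be expected to assert on the `ValueError` rather than on a warning.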

setup.py

		"pyarrow >= 1.0.0, < 2.0dev",
		],
		"tqdm": ["tqdm >= 4.7.4, <5.0.0dev"],
		"fastparquet": ["fastparquet","python-snappy","llvmlite>=0.34.0"],

tswast (Contributor), Oct 9, 2020

I'd like to see us add "pyarrow" to the "pandas" extras now, since it's needed for both uploads and downloads to dataframe.

We can maybe refactor the `pyarrow >=1.0.0,<2.0dev` string into a variable, since it's going to appear 3 times in setup.py now too.
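
A rough sketch of the refactor suggested here, assuming a `setup.py`-style `extras` dictionary. The variable name `pyarrow_dep`, the unpinned `"pandas"` entry, and the `"all"` aggregation are illustrative assumptions, not the code that was merged; only the `pyarrow` and `tqdm` pins come from the diff context above.

```python
# Hypothetical excerpt in the style of setup.py; names are illustrative.

# Single source of truth for the pyarrow pin, reused by several extras.
pyarrow_dep = "pyarrow >= 1.0.0, < 2.0dev"

extras = {
    # pandas support now pulls in pyarrow for uploads and downloads.
    "pandas": ["pandas", pyarrow_dep],
    "pyarrow": [pyarrow_dep],
    "tqdm": ["tqdm >= 4.7.4, <5.0.0dev"],
}

# Assumed convenience extra: the union of all others, keeping the
# first occurrence of each requirement so the pin appears only once.
all_extras = []
for deps in extras.values():
    for dep in deps:
        if dep not in all_extras:
            all_extras.append(dep)
extras["all"] = all_extras
```

Keeping the pin in one variable means a future version bump touches a single line instead of every extra that depends on `pyarrow`.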

cguardia added 2 commits

October 11, 2020 00:43

move pyarrow to pandas extras, remove unused code paths

08f7805

Merge branch 'master' into 265-drop-fastparquet

9fd5a8d

cguardia (Contributor, Author) commented Oct 11, 2020

@tswast OK, this was a bit more involved than I expected at the beginning. Here goes my second attempt.

tswast approved these changes

Oct 12, 2020

tswast (Contributor) left a comment

Thanks!

Merge branch 'master' into 265-drop-fastparquet

5dd13cd

tswast changed the title from "build: drop fastparquet from extras dependencies" to "deps: require pyarrow for pandas support"

Oct 12, 2020

tswast added the automerge label (Merge the pull request once unit tests and other checks pass.)

Oct 12, 2020

gcf-merge-on-green (bot) merged commit 801e4c0 into googleapis:master

Oct 12, 2020

gcf-merge-on-green (bot) removed the automerge label (Merge the pull request once unit tests and other checks pass.)

Oct 12, 2020

gcf-merge-on-green (bot) pushed a commit that referenced this pull request

Oct 19, 2020

chore: release 2.2.0 (#321)

82290c3

🤖 I have created a release \*beep\* \*boop\*

---

## [2.2.0](https://www.github.com/googleapis/python-bigquery/compare/v2.1.0...v2.2.0) (2020-10-19)

### Features

* add method api_repr for table list item ([#299](https://www.github.com/googleapis/python-bigquery/issues/299)) ([07c70f0](https://www.github.com/googleapis/python-bigquery/commit/07c70f0292f9212f0c968cd5c9206e8b0409c0da))
* add support for listing arima, automl, boosted tree, DNN, and matrix factorization models ([#328](https://www.github.com/googleapis/python-bigquery/issues/328)) ([502a092](https://www.github.com/googleapis/python-bigquery/commit/502a0926018abf058cb84bd18043c25eba15a2cc))
* add timeout paramter to load_table_from_file and it dependent methods ([#327](https://www.github.com/googleapis/python-bigquery/issues/327)) ([b0dd892](https://www.github.com/googleapis/python-bigquery/commit/b0dd892176e31ac25fddd15554b5bfa054299d4d))
* add to_api_repr method to Model ([#326](https://www.github.com/googleapis/python-bigquery/issues/326)) ([fb401bd](https://www.github.com/googleapis/python-bigquery/commit/fb401bd94477323bba68cf252dd88166495daf54))
* allow client options to be set in magics context ([#322](https://www.github.com/googleapis/python-bigquery/issues/322)) ([5178b55](https://www.github.com/googleapis/python-bigquery/commit/5178b55682f5e264bfc082cde26acb1fdc953a18))

### Bug Fixes

* make TimePartitioning repr evaluable ([#110](https://www.github.com/googleapis/python-bigquery/issues/110)) ([20f473b](https://www.github.com/googleapis/python-bigquery/commit/20f473bfff5ae98377f5d9cdf18bfe5554d86ff4)), closes [#109](https://www.github.com/googleapis/python-bigquery/issues/109)
* use version.py instead of pkg_resources.get_distribution ([#307](https://www.github.com/googleapis/python-bigquery/issues/307)) ([b8f502b](https://www.github.com/googleapis/python-bigquery/commit/b8f502b14f21d1815697e4d57cf1225dfb4a7c5e))

### Performance Improvements

* add size parameter for load table from dataframe and json methods ([#280](https://www.github.com/googleapis/python-bigquery/issues/280)) ([3be78b7](https://www.github.com/googleapis/python-bigquery/commit/3be78b737add7111e24e912cd02fc6df75a07de6))

### Documentation

* update clustering field docstrings ([#286](https://www.github.com/googleapis/python-bigquery/issues/286)) ([5ea1ece](https://www.github.com/googleapis/python-bigquery/commit/5ea1ece2d911cdd1f3d9549ee01559ce8ed8269a)), closes [#285](https://www.github.com/googleapis/python-bigquery/issues/285)
* update snippets samples to support version 2.0 ([#309](https://www.github.com/googleapis/python-bigquery/issues/309)) ([61634be](https://www.github.com/googleapis/python-bigquery/commit/61634be9bf9e3df7589fc1bfdbda87288859bb13))

### Dependencies

* add protobuf dependency ([#306](https://www.github.com/googleapis/python-bigquery/issues/306)) ([cebb5e0](https://www.github.com/googleapis/python-bigquery/commit/cebb5e0e911e8c9059bc8c9e7fce4440e518bff3)), closes [#305](https://www.github.com/googleapis/python-bigquery/issues/305)
* require pyarrow for pandas support ([#314](https://www.github.com/googleapis/python-bigquery/issues/314)) ([801e4c0](https://www.github.com/googleapis/python-bigquery/commit/801e4c0574b7e421aa3a28cafec6fd6bcce940dd)), closes [#265](https://www.github.com/googleapis/python-bigquery/issues/265)

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please).

tswast mentioned this pull request

Jul 21, 2021

fix!: use nullable `Int64` and `boolean` dtypes in `to_dataframe` #786

Merged

4 tasks

Labels

cla: yes

This human has signed the Contributor License Agreement.
