deps: require pyarrow for pandas support #314


Merged

Conversation

@cguardia
Contributor

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

Fixes #265 🦕

@cguardia requested a review from a team on October 9, 2020 04:48
@google-cla bot added the cla: yes label (This human has signed the Contributor License Agreement.) on Oct 9, 2020

@unittest.skipIf(pandas is None, "Requires `pandas`")
@unittest.skipIf(fastparquet is None, "Requires `fastparquet`")
def test_load_table_from_dataframe_no_pyarrow_warning(self):
@tswast (Contributor) commented:
I'm a bit surprised to see this test passing. I guess we still have some code that falls back to the default pandas parquet rendering?

Can you look into if we can remove that code path?

Related: We should be able to simplify this docstring now:

parquet_compression (Optional[str]):
[Beta] The compression method to use if intermittently
serializing ``dataframe`` to a parquet file.
If ``pyarrow`` and job config schema are used, the argument
is directly passed as the ``compression`` argument to the
underlying ``pyarrow.parquet.write_table()`` method (the
default value "snappy" gets converted to uppercase).
https://arrow.apache.org/docs/python/generated/pyarrow.parquet.write_table.html#pyarrow-parquet-write-table
If either ``pyarrow`` or job config schema are missing, the
argument is directly passed as the ``compression`` argument
to the underlying ``DataFrame.to_parquet()`` method.
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_parquet.html#pandas.DataFrame.to_parquet
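
For context, the two paths the docstring describes amount to a dispatch roughly like the sketch below (an illustrative sketch only, not the library's actual code; the helper name _serialize_to_parquet and its signature are hypothetical). Once pyarrow is always installed with pandas support, only the first branch can be taken, which is what makes the fallback path and the second half of the docstring candidates for removal.

def _serialize_to_parquet(dataframe, filepath, schema=None, parquet_compression="snappy"):
    # Sketch of the dispatch described in the docstring above -- not the real client code.
    try:
        import pyarrow
        import pyarrow.parquet
    except ImportError:
        pyarrow = None

    if pyarrow is not None and schema is not None:
        # pyarrow path: uppercase the compression name ("snappy" -> "SNAPPY") and
        # hand it to pyarrow.parquet.write_table(). (Conversion of the BigQuery
        # schema to an Arrow schema is omitted in this sketch.)
        arrow_table = pyarrow.Table.from_pandas(dataframe)
        pyarrow.parquet.write_table(
            arrow_table, filepath, compression=parquet_compression.upper()
        )
    else:
        # Fallback path: delegate to pandas' DataFrame.to_parquet(), which may use
        # fastparquet. Requiring pyarrow for pandas support makes this branch dead code.
        dataframe.to_parquet(filepath, compression=parquet_compression)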

"pyarrow >= 1.0.0, < 2.0dev",
],
"tqdm": ["tqdm >= 4.7.4, <5.0.0dev"],
"fastparquet": ["fastparquet","python-snappy","llvmlite>=0.34.0"],
@tswast (Contributor) commented:
I'd like to see us add "pyarrow" to the "pandas" extras now, since it's needed both for uploading DataFrames and for downloading query results to DataFrames.

We can maybe refactor the pyarrow >= 1.0.0, < 2.0dev string into a variable, since it's now going to appear 3 times in setup.py.
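
A rough sketch of what that refactor could look like (the variable name and the pins other than the quoted pyarrow and tqdm strings are placeholders, not the merged diff):

# Illustrative sketch of the suggested setup.py refactor -- not the merged change.
pyarrow_dep = "pyarrow >= 1.0.0, < 2.0dev"

extras = {
    # placeholder pins; only the pyarrow and tqdm strings come from the snippet above,
    # and pyarrow_dep is reused wherever the pin is needed
    "bqstorage": ["google-cloud-bigquery-storage >= 2.0.0", pyarrow_dep],
    "pandas": ["pandas >= 1.0.0", pyarrow_dep],
    "tqdm": ["tqdm >= 4.7.4, <5.0.0dev"],
    # the "fastparquet" extra from the snippet above is dropped entirely
}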

@cguardia (Contributor, Author) commented:

@tswast OK, this was a bit more involved than I expected at the beginning. Here goes my second attempt.

@tswast (Contributor) left a comment:

Thanks!

@tswast changed the title from "build: drop fastparquet from extras dependencies" to "deps: require pyarrow for pandas support" on Oct 12, 2020
@tswast added the automerge label (Merge the pull request once unit tests and other checks pass.) on Oct 12, 2020
@gcf-merge-on-green bot merged commit 801e4c0 into googleapis:master on Oct 12, 2020
@gcf-merge-on-green bot removed the automerge label (Merge the pull request once unit tests and other checks pass.) on Oct 12, 2020
gcf-merge-on-green bot pushed a commit that referenced this pull request on Oct 19, 2020
🤖 I have created a release \*beep\* \*boop\*

## [2.2.0](https://www.github.com/googleapis/python-bigquery/compare/v2.1.0...v2.2.0) (2020-10-19)

### Features

* add method api_repr for table list item ([#299](https://www.github.com/googleapis/python-bigquery/issues/299)) ([07c70f0](https://www.github.com/googleapis/python-bigquery/commit/07c70f0292f9212f0c968cd5c9206e8b0409c0da))
* add support for listing arima, automl, boosted tree, DNN, and matrix factorization models ([#328](https://www.github.com/googleapis/python-bigquery/issues/328)) ([502a092](https://www.github.com/googleapis/python-bigquery/commit/502a0926018abf058cb84bd18043c25eba15a2cc))
* add timeout parameter to load_table_from_file and its dependent methods ([#327](https://www.github.com/googleapis/python-bigquery/issues/327)) ([b0dd892](https://www.github.com/googleapis/python-bigquery/commit/b0dd892176e31ac25fddd15554b5bfa054299d4d))
* add to_api_repr method to Model ([#326](https://www.github.com/googleapis/python-bigquery/issues/326)) ([fb401bd](https://www.github.com/googleapis/python-bigquery/commit/fb401bd94477323bba68cf252dd88166495daf54))
* allow client options to be set in magics context ([#322](https://www.github.com/googleapis/python-bigquery/issues/322)) ([5178b55](https://www.github.com/googleapis/python-bigquery/commit/5178b55682f5e264bfc082cde26acb1fdc953a18))

### Bug Fixes

* make TimePartitioning repr evaluable ([#110](https://www.github.com/googleapis/python-bigquery/issues/110)) ([20f473b](https://www.github.com/googleapis/python-bigquery/commit/20f473bfff5ae98377f5d9cdf18bfe5554d86ff4)), closes [#109](https://www.github.com/googleapis/python-bigquery/issues/109)
* use version.py instead of pkg_resources.get_distribution ([#307](https://www.github.com/googleapis/python-bigquery/issues/307)) ([b8f502b](https://www.github.com/googleapis/python-bigquery/commit/b8f502b14f21d1815697e4d57cf1225dfb4a7c5e))

### Performance Improvements

* add size parameter for load table from dataframe and json methods ([#280](https://www.github.com/googleapis/python-bigquery/issues/280)) ([3be78b7](https://www.github.com/googleapis/python-bigquery/commit/3be78b737add7111e24e912cd02fc6df75a07de6))

### Documentation

* update clustering field docstrings ([#286](https://www.github.com/googleapis/python-bigquery/issues/286)) ([5ea1ece](https://www.github.com/googleapis/python-bigquery/commit/5ea1ece2d911cdd1f3d9549ee01559ce8ed8269a)), closes [#285](https://www.github.com/googleapis/python-bigquery/issues/285)
* update snippets samples to support version 2.0 ([#309](https://www.github.com/googleapis/python-bigquery/issues/309)) ([61634be](https://www.github.com/googleapis/python-bigquery/commit/61634be9bf9e3df7589fc1bfdbda87288859bb13))

### Dependencies

* add protobuf dependency ([#306](https://www.github.com/googleapis/python-bigquery/issues/306)) ([cebb5e0](https://www.github.com/googleapis/python-bigquery/commit/cebb5e0e911e8c9059bc8c9e7fce4440e518bff3)), closes [#305](https://www.github.com/googleapis/python-bigquery/issues/305)
* require pyarrow for pandas support ([#314](https://www.github.com/googleapis/python-bigquery/issues/314)) ([801e4c0](https://www.github.com/googleapis/python-bigquery/commit/801e4c0574b7e421aa3a28cafec6fd6bcce940dd)), closes [#265](https://www.github.com/googleapis/python-bigquery/issues/265)

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please).

Reviewers

@tswast approved these changes

Assignees

No one assigned

Labels

cla: yes (This human has signed the Contributor License Agreement.)

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

drop fastparquet, add pyarrow to pandas extras deps

2 participants

@cguardia, @tswast
