docs: remove `import bigframes.pandas as bpd` boilerplate from many samples #2147

Merged
tswast merged 86 commits into main from tswast-doctest-boilerplate
Oct 16, 2025

Conversation

@tswast
Collaborator

Also fixes several constructors that didn't take a session, for compatibility with multi-session applications.

🦕

@tswast requested review from a team as code owners on October 7, 2025 21:56
@product-auto-label (bot) added the "size: l" (Pull request size is large.) label on Oct 7, 2025
@product-auto-label (bot) added the "api: bigquery" (Issues related to the googleapis/python-bigquery-dataframes API.) and "samples" (Issues that are directly related to samples.) labels on Oct 7, 2025
doctest_namespace["np"] = np
doctest_namespace["pd"] = pd
doctest_namespace["pa"] = pa
doctest_namespace["bpd"] = polars_session
Contributor

I wonder if we should instead just inject the Polars session as the global session? I'm not sure all the methods are the same, but I guess it works so far?

Collaborator, Author

Unfortunately, there's quite a bit that isn't supported yet on the Polars session. Doing it this way means that we can override `bpd` to be the BQ version in the samples themselves with a simple import.
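
The override described above relies on how doctest globals behave: a name injected into the namespace is only a default, and an `import ... as bpd` inside the sample rebinds it for that sample alone. Here is a minimal sketch using only the standard library `doctest` module. `local_backend`, `remote_backend`, and `backend_name` are hypothetical stand-ins, not bigframes objects, and the actual PR uses pytest's `doctest_namespace` rather than this hand-rolled runner.

```python
import doctest
import sys
import types

# Hypothetical stand-in for the default (injected) backend.
local_backend = types.SimpleNamespace(backend_name="local")

# Hypothetical stand-in for the backend a sample opts into via an import.
remote_backend = types.ModuleType("remote_backend")
remote_backend.backend_name = "remote"
sys.modules["remote_backend"] = remote_backend  # make it importable by name

# A sample that relies on the injected default.
DEFAULT_SAMPLE = """
>>> bpd.backend_name
'local'
"""

# A sample that rebinds `bpd` with an explicit import.
OVERRIDING_SAMPLE = """
>>> import remote_backend as bpd
>>> bpd.backend_name
'remote'
"""


def run_sample(source, name):
    """Run one doctest string with `bpd` pre-injected into its globals."""
    globs = {"bpd": local_backend}
    test = doctest.DocTestParser().get_doctest(source, globs, name, None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)  # TestResults(failed, attempted)
```

Both samples pass: the first sees the injected default, and the second sees whatever its own import bound to `bpd`.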

Comment on lines 2290 to 2291
# These are included so that Session and bigframes.pandas can be used
# interchangeably.
Contributor

Is this purely for doctests? If so, can we just inject the session for doctests? Or are we trying to enable something else?

Collaborator, Author

IMO, it's important for consistency. I actually uncovered a few cases where the session should have been supplied to things like `to_datetime()` with local data but wasn't.

Comment on lines 221 to 223
) -> Union[pandas.Timestamp, datetime.datetime, bigframes.series.Series]:
    return global_session.with_default_session(
        bigframes.session.Session.to_datetime,
Contributor

Not sure I understand this change.

Collaborator, Author

to_datetime() has code paths that take local data. It was using the global session implicitly when it constructed the Series objects. Now it can take a session explicitly.
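
To illustrate the difference between picking up the global session implicitly and accepting one explicitly, here is a toy sketch of the delegation pattern. `ToySession`, `make_series`, and the module-level `_global_session` are hypothetical names for illustration only. They are not the bigframes implementation, although the shape of `with_default_session` mirrors the call visible in the diff above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToySession:
    """Hypothetical stand-in for a session; not the bigframes Session class."""

    name: str

    def make_series(self, values):
        # Tag the local data with the session that owns it.
        return {"session": self.name, "values": list(values)}


# Hypothetical process-wide default session.
_global_session = ToySession("global")


def with_default_session(func, *args, **kwargs):
    """Call an unbound session method on the default session."""
    return func(_global_session, *args, **kwargs)


def make_series(values, session: Optional[ToySession] = None):
    """Use the explicitly supplied session, else fall back to the default."""
    if session is not None:
        return session.make_series(values)
    return with_default_session(ToySession.make_series, values)
```

In a multi-session application, a caller that passes `session=` gets objects bound to that session. Without the parameter, local data would always land on the global one.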

**Examples:**
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
Contributor

Why do we still need some of the `bpd` imports?

Collaborator, Author

This is how I made some samples use the BQ session instead of the Polars session. In this case, `hash()` is unimplemented:

third_party/bigframes_vendored/pandas/core/generic.py ..........F.                                                                 [100%]
================================================================ FAILURES ================================================================
______________________________ [doctest] third_party.bigframes_vendored.pandas.core.generic.NDFrame.sample _______________________________
558             dog            4          0                  2
559             spider         8          0                  1
560             fish           0          0                  8
561             <BLANKLINE>
562             [4 rows x 3 columns]
563
564         Fetch one random row from the DataFrame (Note that we use `random_state`
565         to ensure reproducibility of the examples):
566
567             >>> df.sample(random_state=1)
UNEXPECTED EXCEPTION: NotImplementedError("Polars compiler hasn't implemented hash()")
Traceback (most recent call last):
  File "/usr/local/google/home/swast/.pyenv/versions/3.10.16/lib/python3.10/doctest.py", line 1350, in __run
    exec(compile(example.source, filename, "single",
  File "<doctest third_party.bigframes_vendored.pandas.core.generic.NDFrame.sample[2]>", line 1, in <module>
  File "/usr/local/google/home/swast/src/github.com/googleapis/python-bigquery-dataframes/bigframes/core/log_adapter.py", line 197, in wrapper
    raise e
  File "/usr/local/google/home/swast/src/github.com/googleapis/python-bigquery-dataframes/bigframes/core/log_adapter.py", line 182, in wrapper
    return method(*args, **kwargs)
  File "/usr/local/google/home/swast/src/github.com/googleapis/python-bigquery-dataframes/bigframes/dataframe.py", line 794, in __repr__
    pandas_df, row_count, query_job = self._block.retrieve_repr_request_results(
  File "/usr/local/google/home/swast/src/github.com/googleapis/python-bigquery-dataframes/bigframes/core/blocks.py", line 1658, in retrieve_repr_request_results
    head_result = self.session._executor.execute(
  File "/usr/local/google/home/swast/src/github.com/googleapis/python-bigquery-dataframes/bigframes/testing/polars_session.py", line 48, in execute
    lazy_frame: polars.LazyFrame = self.compiler.compile(array_value.node)
  File "/usr/local/google/home/swast/src/github.com/googleapis/python-bigquery-dataframes/bigframes/core/compile/polars/compiler.py", line 577, in compile
    return self.compile_node(node)
  File "/usr/local/google/home/swast/.pyenv/versions/3.10.16/lib/python3.10/functools.py", line 926, in _method
    return method.__get__(obj, cls)(*args, **kwargs)
  File "/usr/local/google/home/swast/src/github.com/googleapis/python-bigquery-dataframes/bigframes/core/compile/polars/compiler.py", line 630, in compile_selection
    return self.compile_node(node.child).select(
  File "/usr/local/google/home/swast/.pyenv/versions/3.10.16/lib/python3.10/functools.py", line 926, in _method
    return method.__get__(obj, cls)(*args, **kwargs)
  File "/usr/local/google/home/swast/src/github.com/googleapis/python-bigquery-dataframes/bigframes/core/compile/polars/compiler.py", line 601, in compile_filter
    return self.compile_node(node.child).filter(
  File "/usr/local/google/home/swast/.pyenv/versions/3.10.16/lib/python3.10/functools.py", line 926, in _method
    return method.__get__(obj, cls)(*args, **kwargs)
  File "/usr/local/google/home/swast/src/github.com/googleapis/python-bigquery-dataframes/bigframes/core/compile/polars/compiler.py", line 649, in compile_offsets
    return self.compile_node(node.child).with_columns(
  File "/usr/local/google/home/swast/.pyenv/versions/3.10.16/lib/python3.10/functools.py", line 926, in _method
    return method.__get__(obj, cls)(*args, **kwargs)
  File "/usr/local/google/home/swast/src/github.com/googleapis/python-bigquery-dataframes/bigframes/core/compile/polars/compiler.py", line 630, in compile_selection
    return self.compile_node(node.child).select(
  File "/usr/local/google/home/swast/.pyenv/versions/3.10.16/lib/python3.10/functools.py", line 926, in _method
    return method.__get__(obj, cls)(*args, **kwargs)
  File "/usr/local/google/home/swast/src/github.com/googleapis/python-bigquery-dataframes/bigframes/core/compile/polars/compiler.py", line 601, in compile_filter
    return self.compile_node(node.child).filter(
  File "/usr/local/google/home/swast/.pyenv/versions/3.10.16/lib/python3.10/functools.py", line 926, in _method
    return method.__get__(obj, cls)(*args, **kwargs)
  File "/usr/local/google/home/swast/src/github.com/googleapis/python-bigquery-dataframes/bigframes/core/compile/polars/compiler.py", line 649, in compile_offsets
    return self.compile_node(node.child).with_columns(
  File "/usr/local/google/home/swast/.pyenv/versions/3.10.16/lib/python3.10/functools.py", line 926, in _method
    return method.__get__(obj, cls)(*args, **kwargs)
  File "/usr/local/google/home/swast/src/github.com/googleapis/python-bigquery-dataframes/bigframes/core/compile/polars/compiler.py", line 630, in compile_selection
    return self.compile_node(node.child).select(
  File "/usr/local/google/home/swast/.pyenv/versions/3.10.16/lib/python3.10/functools.py", line 926, in _method
    return method.__get__(obj, cls)(*args, **kwargs)
  File "/usr/local/google/home/swast/src/github.com/googleapis/python-bigquery-dataframes/bigframes/core/compile/polars/compiler.py", line 607, in compile_orderby
    frame = self.compile_node(node.child)
  File "/usr/local/google/home/swast/.pyenv/versions/3.10.16/lib/python3.10/functools.py", line 926, in _method
    return method.__get__(obj, cls)(*args, **kwargs)
  File "/usr/local/google/home/swast/src/github.com/googleapis/python-bigquery-dataframes/bigframes/core/compile/polars/compiler.py", line 630, in compile_selection
    return self.compile_node(node.child).select(
  File "/usr/local/google/home/swast/.pyenv/versions/3.10.16/lib/python3.10/functools.py", line 926, in _method
    return method.__get__(obj, cls)(*args, **kwargs)
  File "/usr/local/google/home/swast/src/github.com/googleapis/python-bigquery-dataframes/bigframes/core/compile/polars/compiler.py", line 639, in compile_projection
    new_col = self.expr_compiler.compile_expression(bound_expr).alias(name.sql)
  File "/usr/local/google/home/swast/.pyenv/versions/3.10.16/lib/python3.10/functools.py", line 926, in _method
    return method.__get__(obj, cls)(*args, **kwargs)
  File "/usr/local/google/home/swast/src/github.com/googleapis/python-bigquery-dataframes/bigframes/core/compile/polars/compiler.py", line 180, in _
    return self.compile_op(op, *args)
  File "/usr/local/google/home/swast/.pyenv/versions/3.10.16/lib/python3.10/functools.py", line 926, in _method
    return method.__get__(obj, cls)(*args, **kwargs)
  File "/usr/local/google/home/swast/src/github.com/googleapis/python-bigquery-dataframes/bigframes/core/compile/polars/compiler.py", line 184, in compile_op
    raise NotImplementedError(f"Polars compiler hasn't implemented {op}")
NotImplementedError: Polars compiler hasn't implemented hash()
/usr/local/google/home/swast/src/github.com/googleapis/python-bigquery-dataframes/third_party/bigframes_vendored/pandas/core/generic.py:567: UnexpectedException

Aside: As much as possible I'd like to encourage us BigFrames devs to implement our ops in the Polars session as well as BQ, so defaulting to Polars is a subtle nudge in that direction.

Comment on lines 2331 to 2333
MultiIndex.from_tuples = bigframes.core.indexes.MultiIndex.from_tuples  # type: ignore
MultiIndex.from_frame = bigframes.core.indexes.MultiIndex.from_frame  # type: ignore
MultiIndex.from_arrays = bigframes.core.indexes.MultiIndex.from_arrays  # type: ignore
Collaborator, Author

TODO: these should probably take a Session argument, too.

@tswast mentioned this pull request on Oct 8, 2025
4 tasks
Collaborator, Author

I recommend reviewing this with "hide whitespace" turned on. The only change is to increase indentation to fix samples tests that break due to an attempted import of this file without Polars installed.
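
One common way to keep a module importable when an optional dependency is missing is to nest the dependent definitions under a guard, which is also why such a fix can show up as a pure indentation change. The sketch below is an assumption about the general technique, not the code in this PR. `hypothetical_optional_engine` and `EngineSession` are made-up names.

```python
import importlib.util

# `hypothetical_optional_engine` is a made-up package name for illustration.
ENGINE_AVAILABLE = importlib.util.find_spec("hypothetical_optional_engine") is not None

if ENGINE_AVAILABLE:
    # Everything that needs the optional package lives inside this block,
    # so importing this module never fails when the package is absent.
    import hypothetical_optional_engine

    class EngineSession:
        """Session backed by the optional engine."""

        def __init__(self):
            self.engine = hypothetical_optional_engine

else:
    # Placeholder so callers can test for availability without an ImportError.
    EngineSession = None
```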

>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
Simple vectorized functions, lambdas or ufuncs can be applied directly
Collaborator, Author

Note: this is rearranged so that the remote functions samples come second.

Collaborator, Author

This file is where the "magic" happens (an autouse test fixture).

@tswast (Collaborator, Author) commented on Oct 16, 2025 (edited)

Failures:

FAILED tests/system/small/test_null_index.py::test_null_index_series_repr - A...
FAILED tests/system/small/test_null_index.py::test_null_index_dataframe_repr

Looks like we're setting the `repr_mode` in our tests somewhere without using an `option_context`. I'll take a look.

Edit: Mailed #2175

TrevorBergeron previously approved these changes on Oct 16, 2025
@tswast (Collaborator, Author)

The presubmit failure looks like a flake with `remote_function`. I'll sync to main and try again.

@tswast merged commit 1a01ab9 into main on Oct 16, 2025
19 of 25 checks passed
@tswast deleted the tswast-doctest-boilerplate branch on October 16, 2025 21:33

Reviewers

@TrevorBergeron approved these changes

@kweinmeister: awaiting requested review. kweinmeister is a code owner automatically assigned from googleapis/python-samples-reviewers.

Assignees

@sycai

Labels

api: bigquery (Issues related to the googleapis/python-bigquery-dataframes API.)
samples (Issues that are directly related to samples.)
size: l (Pull request size is large.)

3 participants

@tswast @TrevorBergeron @sycai
