docs: remove import bigframes.pandas as bpd boilerplate from many samples #2147
…ples
Also, fixes several constructors that didn't take a session for compatibility with multi-session applications.
bigframes/conftest.py Outdated
doctest_namespace["np"] = np
doctest_namespace["pd"] = pd
doctest_namespace["pa"] = pa
doctest_namespace["bpd"] = polars_session
I wonder if we should instead just inject the Polars session as the global session? I'm not sure all the methods are the same, but I guess it works so far?
Unfortunately, there's quite a bit that isn't supported yet on the Polars session. Doing it this way means that we can override bpd to be the BQ version in the samples themselves with a simple import.
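To illustrate the override described above, here is a minimal sketch using only the standard library `doctest` module. The names `local_session`, `fake_bq_pandas`, and `run_sample` are hypothetical stand-ins, not part of BigFrames; the point is only that a name injected into the doctest namespace can be rebound by an `import ... as bpd` line inside a sample.

```python
import doctest
import sys
import types

# Hypothetical stand-ins: a local test session and a "real" backend module.
local_session = types.SimpleNamespace(backend="polars")
remote_module = types.ModuleType("fake_bq_pandas")
remote_module.backend = "bigquery"
sys.modules["fake_bq_pandas"] = remote_module  # make the fake module importable

# A sample that relies on the injected default.
DEFAULT_SAMPLE = """
>>> bpd.backend
'polars'
"""

# A sample that opts out by rebinding bpd with an explicit import.
OVERRIDE_SAMPLE = """
>>> import fake_bq_pandas as bpd
>>> bpd.backend
'bigquery'
"""


def run_sample(source: str) -> doctest.TestResults:
    """Run one docstring sample with bpd pre-injected into its namespace."""
    globs = {"bpd": local_session}
    test = doctest.DocTestParser().get_doctest(source, globs, "sample", None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)
```

Both samples pass under `run_sample`: the first sees the injected object, the second sees whatever its own import bound to `bpd`.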
# These are included so that Session and bigframes.pandas can be used
# interchangeably.
If this is purely for doctests, can we just inject the session for doctests? Or are we trying to enable something else?
IMO, it's important for consistency. I actually uncovered a few cases where the session should have been supplied to things like to_datetime() with local data but wasn't.
) -> Union[pandas.Timestamp, datetime.datetime, bigframes.series.Series]:
    return global_session.with_default_session(
        bigframes.session.Session.to_datetime,
Not sure I understand this change.
to_datetime() has code paths that take local data. It was using the global session implicitly when it constructed the Series objects. Now it can take a session explicitly.
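The delegation pattern in the snippet above can be sketched with toy classes. This is not the BigFrames implementation: `Session`, `Series`, `get_global_session`, and this `with_default_session` are simplified, hypothetical versions that only show how a module-level function forwards to a session method, so that objects built from local data are tied to an explicit session rather than an implicit global one.

```python
from typing import Any, Callable, List, Optional


class Series:
    """Toy series that remembers which session created it."""

    def __init__(self, data: List[Any], session: "Session") -> None:
        self.data = data
        self.session = session


class Session:
    """Toy session; the real one would own a connection and an executor."""

    def __init__(self, name: str) -> None:
        self.name = name

    def to_datetime(self, values: List[Any]) -> Series:
        # Local data is bound to *this* session, not to a global one.
        return Series(values, session=self)


_global_session: Optional[Session] = None


def get_global_session() -> Session:
    """Lazily create the process-wide default session."""
    global _global_session
    if _global_session is None:
        _global_session = Session("global")
    return _global_session


def with_default_session(func: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
    """Call an unbound Session method with the global session as self."""
    return func(get_global_session(), *args, **kwargs)


def to_datetime(values: List[Any]) -> Series:
    """Module-level convenience wrapper that uses the global session."""
    return with_default_session(Session.to_datetime, values)
```

With this shape, `to_datetime([...])` uses the global session, while `Session("other").to_datetime([...])` keeps the result on the session the caller chose, which is what a multi-session application needs.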
**Examples:**
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
Why do we still need some of the bpd imports?
This is how I made some samples use the BQ session instead of the Polars session. In this case, hash() is unimplemented:
third_party/bigframes_vendored/pandas/core/generic.py ..........F. [100%]
================================================================ FAILURES ================================================================
______________________________ [doctest] third_party.bigframes_vendored.pandas.core.generic.NDFrame.sample _______________________________
558 dog 4 0 2
559 spider 8 0 1
560 fish 0 0 8
561 <BLANKLINE>
562 [4 rows x 3 columns]
563
564 Fetch one random row from the DataFrame (Note that we use `random_state`
565 to ensure reproducibility of the examples):
566
567 >>> df.sample(random_state=1)
UNEXPECTED EXCEPTION: NotImplementedError("Polars compiler hasn't implemented hash()")
Traceback (most recent call last):
  File "/usr/local/google/home/swast/.pyenv/versions/3.10.16/lib/python3.10/doctest.py", line 1350, in __run
    exec(compile(example.source, filename, "single",
  File "<doctest third_party.bigframes_vendored.pandas.core.generic.NDFrame.sample[2]>", line 1, in <module>
  File "/usr/local/google/home/swast/src/github.com/googleapis/python-bigquery-dataframes/bigframes/core/log_adapter.py", line 197, in wrapper
    raise e
  File "/usr/local/google/home/swast/src/github.com/googleapis/python-bigquery-dataframes/bigframes/core/log_adapter.py", line 182, in wrapper
    return method(*args, **kwargs)
  File "/usr/local/google/home/swast/src/github.com/googleapis/python-bigquery-dataframes/bigframes/dataframe.py", line 794, in __repr__
    pandas_df, row_count, query_job = self._block.retrieve_repr_request_results(
  File "/usr/local/google/home/swast/src/github.com/googleapis/python-bigquery-dataframes/bigframes/core/blocks.py", line 1658, in retrieve_repr_request_results
    head_result = self.session._executor.execute(
  File "/usr/local/google/home/swast/src/github.com/googleapis/python-bigquery-dataframes/bigframes/testing/polars_session.py", line 48, in execute
    lazy_frame: polars.LazyFrame = self.compiler.compile(array_value.node)
  File "/usr/local/google/home/swast/src/github.com/googleapis/python-bigquery-dataframes/bigframes/core/compile/polars/compiler.py", line 577, in compile
    return self.compile_node(node)
  File "/usr/local/google/home/swast/.pyenv/versions/3.10.16/lib/python3.10/functools.py", line 926, in _method
    return method.__get__(obj, cls)(*args, **kwargs)
  File "/usr/local/google/home/swast/src/github.com/googleapis/python-bigquery-dataframes/bigframes/core/compile/polars/compiler.py", line 630, in compile_selection
    return self.compile_node(node.child).select(
  File "/usr/local/google/home/swast/.pyenv/versions/3.10.16/lib/python3.10/functools.py", line 926, in _method
    return method.__get__(obj, cls)(*args, **kwargs)
  File "/usr/local/google/home/swast/src/github.com/googleapis/python-bigquery-dataframes/bigframes/core/compile/polars/compiler.py", line 601, in compile_filter
    return self.compile_node(node.child).filter(
  File "/usr/local/google/home/swast/.pyenv/versions/3.10.16/lib/python3.10/functools.py", line 926, in _method
    return method.__get__(obj, cls)(*args, **kwargs)
  File "/usr/local/google/home/swast/src/github.com/googleapis/python-bigquery-dataframes/bigframes/core/compile/polars/compiler.py", line 649, in compile_offsets
    return self.compile_node(node.child).with_columns(
  File "/usr/local/google/home/swast/.pyenv/versions/3.10.16/lib/python3.10/functools.py", line 926, in _method
    return method.__get__(obj, cls)(*args, **kwargs)
  File "/usr/local/google/home/swast/src/github.com/googleapis/python-bigquery-dataframes/bigframes/core/compile/polars/compiler.py", line 630, in compile_selection
    return self.compile_node(node.child).select(
  File "/usr/local/google/home/swast/.pyenv/versions/3.10.16/lib/python3.10/functools.py", line 926, in _method
    return method.__get__(obj, cls)(*args, **kwargs)
  File "/usr/local/google/home/swast/src/github.com/googleapis/python-bigquery-dataframes/bigframes/core/compile/polars/compiler.py", line 601, in compile_filter
    return self.compile_node(node.child).filter(
  File "/usr/local/google/home/swast/.pyenv/versions/3.10.16/lib/python3.10/functools.py", line 926, in _method
    return method.__get__(obj, cls)(*args, **kwargs)
  File "/usr/local/google/home/swast/src/github.com/googleapis/python-bigquery-dataframes/bigframes/core/compile/polars/compiler.py", line 649, in compile_offsets
    return self.compile_node(node.child).with_columns(
  File "/usr/local/google/home/swast/.pyenv/versions/3.10.16/lib/python3.10/functools.py", line 926, in _method
    return method.__get__(obj, cls)(*args, **kwargs)
  File "/usr/local/google/home/swast/src/github.com/googleapis/python-bigquery-dataframes/bigframes/core/compile/polars/compiler.py", line 630, in compile_selection
    return self.compile_node(node.child).select(
  File "/usr/local/google/home/swast/.pyenv/versions/3.10.16/lib/python3.10/functools.py", line 926, in _method
    return method.__get__(obj, cls)(*args, **kwargs)
  File "/usr/local/google/home/swast/src/github.com/googleapis/python-bigquery-dataframes/bigframes/core/compile/polars/compiler.py", line 607, in compile_orderby
    frame = self.compile_node(node.child)
  File "/usr/local/google/home/swast/.pyenv/versions/3.10.16/lib/python3.10/functools.py", line 926, in _method
    return method.__get__(obj, cls)(*args, **kwargs)
  File "/usr/local/google/home/swast/src/github.com/googleapis/python-bigquery-dataframes/bigframes/core/compile/polars/compiler.py", line 630, in compile_selection
    return self.compile_node(node.child).select(
  File "/usr/local/google/home/swast/.pyenv/versions/3.10.16/lib/python3.10/functools.py", line 926, in _method
    return method.__get__(obj, cls)(*args, **kwargs)
  File "/usr/local/google/home/swast/src/github.com/googleapis/python-bigquery-dataframes/bigframes/core/compile/polars/compiler.py", line 639, in compile_projection
    new_col = self.expr_compiler.compile_expression(bound_expr).alias(name.sql)
  File "/usr/local/google/home/swast/.pyenv/versions/3.10.16/lib/python3.10/functools.py", line 926, in _method
    return method.__get__(obj, cls)(*args, **kwargs)
  File "/usr/local/google/home/swast/src/github.com/googleapis/python-bigquery-dataframes/bigframes/core/compile/polars/compiler.py", line 180, in _
    return self.compile_op(op, *args)
  File "/usr/local/google/home/swast/.pyenv/versions/3.10.16/lib/python3.10/functools.py", line 926, in _method
    return method.__get__(obj, cls)(*args, **kwargs)
  File "/usr/local/google/home/swast/src/github.com/googleapis/python-bigquery-dataframes/bigframes/core/compile/polars/compiler.py", line 184, in compile_op
    raise NotImplementedError(f"Polars compiler hasn't implemented {op}")
NotImplementedError: Polars compiler hasn't implemented hash()
/usr/local/google/home/swast/src/github.com/googleapis/python-bigquery-dataframes/third_party/bigframes_vendored/pandas/core/generic.py:567: UnexpectedException

Aside: As much as possible I'd like to encourage us BigFrames devs to implement our ops in the Polars session as well as BQ, so defaulting to Polars is a subtle nudge in that direction.
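The repeated `functools.py ... in _method` frames in the traceback suggest the compiler dispatches on type via `functools.singledispatchmethod`, with a fallback that raises `NotImplementedError`. A minimal sketch of that pattern follows; `ToyCompiler`, `AddOp`, and `HashOp` are hypothetical names, and the real Polars compiler is certainly more involved.

```python
from dataclasses import dataclass
from functools import singledispatchmethod


@dataclass(frozen=True)
class AddOp:
    """Toy op with a registered implementation."""


@dataclass(frozen=True)
class HashOp:
    """Toy op with no registered implementation."""


class ToyCompiler:
    @singledispatchmethod
    def compile_op(self, op, *args):
        # Fallback: reached for any op type without a registered handler.
        raise NotImplementedError(f"Toy compiler hasn't implemented {op}")

    @compile_op.register(AddOp)
    def _(self, op, left, right):
        # Handler selected because the first non-self argument is an AddOp.
        return left + right
```

Calling `ToyCompiler().compile_op(AddOp(), 2, 3)` takes the registered handler, while `compile_op(HashOp())` falls through to the base method and raises, which is the failure mode a sample hits when an op exists only on the other backend.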
bigframes/session/__init__.py Outdated
MultiIndex.from_tuples = bigframes.core.indexes.MultiIndex.from_tuples  # type: ignore
MultiIndex.from_frame = bigframes.core.indexes.MultiIndex.from_frame  # type: ignore
MultiIndex.from_arrays = bigframes.core.indexes.MultiIndex.from_arrays  # type: ignore
TODO: these should probably take a Session argument, too.
… tswast-doctest-boilerplate
I recommend reviewing this with "hide whitespace" turned on. The only change is to increase indentation to fix samples tests that break due to an attempted import of this file without Polars installed.
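One common way to make a module importable without an optional dependency, and a plausible reason for an indentation-only diff, is to nest the dependent definitions under a guard. The sketch below is an assumption about the general technique, not the actual change in this PR; `optional_import` and `make_frame` are hypothetical names.

```python
import importlib
import importlib.util
from types import ModuleType
from typing import Optional


def optional_import(name: str) -> Optional[ModuleType]:
    """Return the named top-level module if it is installed, else None."""
    if importlib.util.find_spec(name) is None:
        return None
    return importlib.import_module(name)


polars = optional_import("polars")

if polars is not None:
    # Everything that needs Polars lives inside this block, one level deeper.
    def make_frame(data: dict):
        return polars.DataFrame(data)

else:
    # Importing this module still succeeds; callers must check for None.
    make_frame = None
```

The cost of this approach is that the whole dependent body shifts right by one indentation level, which is why such a diff is easiest to read with whitespace changes hidden.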
tests/unit/core/compile/sqlglot/expressions/snapshots/test_numeric_ops/test_sqrt/out.sql Outdated
…eric_ops/test_sqrt/out.sql
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
Simple vectorized functions, lambdas or ufuncs can be applied directly
Note: this is rearranged so that the remote functions samples come second.
This file is where the "magic" happens (an auto use test fixture).
tswast commented Oct 16, 2025 • edited
Failures: Looks like we're setting the
Edit: Mailed #2175
tswast commented Oct 16, 2025
presubmit looks like a flake with
1a01ab9 into main
Also, fixes several constructors that didn't take a session for compatibility with multi-session applications.
🦕