This branch introduces significant enhancements to the interactive table widget, particularly for displaying complex data structures like nested STRUCTs and ARRAYs, which implicitly improves the visual handling of multi-index DataFrames.

Improved Nested Data Display (html.py)
Refined Multi-Index and Row Grouping (anywidget.py,table_widget.js,table_widget.css)
General UI/UX Improvements

TODO(b/469861913): Nested columns from structs (e.g., 'struct_col.name') are not currently sortable

See an example here: screen/48TZkxRbmuWNYBB

Fixes #<459515995> 🦕

shuoweil requested a review fromtswast

November 10, 2025 22:11

shuoweil self-assigned this

Nov 10, 2025

shuoweil requested review froma team ascode owners

November 10, 2025 22:11

Copy link

review-notebook-appbot commentedNov 10, 2025

Check out this pull request on

See visual diffs & provide feedback on Jupyter Notebooks.

Powered byReviewNB

product-auto-labelbot added size: l

Pull request size is large.

api: bigqueryIssues related to the googleapis/python-bigquery-dataframes API. labels

Nov 10, 2025

shuoweil changed the title~~feat: Enhance anywidget notebook with Index and MultiIndex example and improve test robustness~~feat: Display index and mutiindex in anywidget mode

Nov 10, 2025

shuoweil marked this pull request as draft

November 11, 2025 19:13

shuoweil added the status: not ready for review label

Nov 11, 2025

shuoweil removed the request for review fromtswast

November 11, 2025 19:13

shuoweil force-pushed theshuowei-anywidget-index-testcase branch from441bb94 to33fd6e3Compare

November 11, 2025 20:30

shuoweil removed the status: not ready for review label

Nov 11, 2025

shuoweil requested a review fromtswast

November 11, 2025 21:41

shuoweil marked this pull request as ready for review

November 11, 2025 21:42

shuoweil changed the title~~feat: Display index and mutiindex in anywidget mode~~feat: Display index and multi-index in anywidget mode

Nov 12, 2025

tswast reviewed

Nov 12, 2025

View reviewed changes

bigframes/display/html.py Outdated

Comment on lines 32 to 80

		def_calculate_rowspans(dataframe:pd.DataFrame)->list[list[int]]:
		"""Calculates the rowspan for each cell in a MultiIndex DataFrame.

		Args:
		dataframe (pd.DataFrame):
		The DataFrame for which to calculate index rowspans.

		Returns:
		list[list[int]]:
		A list of lists, where each inner list corresponds to an index level
		and contains the rowspan for each row at that level. A value of 0
		indicates that the cell should not be rendered (it's covered by a
		previous rowspan).
		"""
		ifnotisinstance(dataframe.index,pd.MultiIndex):
		# If not a MultiIndex, no rowspans are needed for the index itself.
		# Return a structure that indicates each index cell should be rendered once.
		return [[1]*len(dataframe.index)]ifdataframe.index.nlevels>0else []

		rowspans:list[list[int]]= []
		forlevel_idxinrange(dataframe.index.nlevels):
		current_level_spans:list[int]= []
		current_value=None
		current_span=0

		foriinrange(len(dataframe.index)):
		value=dataframe.index.get_level_values(level_idx)[i]

		ifvalue==current_value:
		current_span+=1
		current_level_spans.append(0)# Mark as covered by previous rowspan
		else:
		# If new value, finalize previous span and start a new one
		ifcurrent_span>0:
		# Update the rowspan for the start of the previous span
		current_level_spans[i-current_span]=current_span
		current_value=value
		current_span=1
		current_level_spans.append(0)# Placeholder, will be updated later

		# Finalize the last span
		ifcurrent_span>0:
		current_level_spans[len(dataframe.index)-current_span]=current_span

		rowspans.append(current_level_spans)

		returnrowspans

Copy link

Collaborator

tswastNov 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

While this makes for a somewhat cleaner view, I think we'd be better off not implementing rowpans. For example sorting by the various index columns is less intuitive if the index isn't rendered the same as other columns.

Copy link

ContributorAuthor

shuoweilDec 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

A new implementation is used without rowspans.

bigframes/display/html.py Outdated

		table_html.append(' <tr style="text-align: left;">')

		# Add index headers
		fornameindataframe.index.names:

Copy link

Collaborator

tswastNov 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Let's try to actually refer to the original BigFrames DataFrame when deciding which index columns to render or not. In particular, we should not render the index if the BigFrame DataFrame has a NULL index.

Copy link

ContributorAuthor

shuoweilDec 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

My new version correctly refers to the original BigFrames DataFrame's _block.has_index property. If has_index is False (indicating a "NULL index"), the _prepare_dataframe_for_display method ensures that a default "Row" column is inserted for display instead of attempting to render a non-existent index. My recent refactoring preserved this established logic.

shuoweil marked this pull request as draft

December 5, 2025 18:53

product-auto-labelbot added size: xlPull request size is extra large. and removed size: lPull request size is large. labels

Dec 17, 2025

shuoweil changed the title~~feat: Display index and multi-index in anywidget mode~~feat: Display custom multi-index in anywidget mode

Dec 17, 2025

shuoweil requested a review fromtswast

December 18, 2025 19:38

shuoweil marked this pull request as ready for review

December 19, 2025 02:40

tswast requested changes

Dec 19, 2025

View reviewed changes

bigframes/display/anywidget.py Outdated

		orderable_columns=traitlets.List(traitlets.Unicode(), []).tag(sync=True)
		_initial_load_complete=traitlets.Bool(False).tag(sync=True)
		_batches:Optional[blocks.PandasBatches]=None
		_batches:blocks.PandasBatches\|None=None

Copy link

Collaborator

tswastDec 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

This syntax was added in Python 3.10.https://peps.python.org/pep-0604/

Currently, we are supporting Python 3.9

python-bigquery-dataframes/setup.py

Line 155 in3c54c68

python_requires=">=3.9",

but since this version is end-of-life, we should probably drop it sooner rather than later.

Please wait to use this syntax until we drop Python 3.9. I have filed b/470438395 to track this.

bigframes/display/anywidget.py Outdated

Comment on lines 269 to 270

		ifhasattr(self,"_setting_html"):
		return

Copy link

Collaborator

tswastDec 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I'd prefer we initialize this to false ininit instead of relying on hasattr.

Copy link

Collaborator

tswastDec 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Also, consider multi threading. We probably need a lock to fully prevent two table htmls being set at the same time.

Copy link

ContributorAuthor

shuoweilDec 19, 2025•
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Here is what I change to: (1) self._setting_html = False and self._setting_html_lock = threading.Lock() were added to the TableWidget.init method
(2) The _set_table_html method was refactored to use with self._setting_html_lock: and the self._setting_html flag to prevent race conditions during HTML rendering. The flag is reset in the finally block to ensure it's always released.

bigframes/display/anywidget.py

		# TODO(b/463715504): Support sorting by index columns.
		df_to_display=df_to_display.sort_values(
		by=self.sort_column,ascending=self.sort_ascending
		try:

Copy link

Collaborator

tswastDec 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I would prefer a context manager instead of a try/except/finally block. That way we can be more explicit about what the intention is. Are you sure you want to competely skip setting the html if another thread tries to set it? If not, just do a with block with a lock here. If so, please explain why it's safe to skip.

Aside: I would actually expect the opposite here, where later set calls would interrupt any set call that is currently running.

Copy link

ContributorAuthor

shuoweilDec 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

A threading.Lock was used as a context manager (with self._setting_html_lock:) within _set_table_html. The logic now explicitly uses the _setting_html flag within this locked context to control whether HTML rendering proceeds or is skipped, ensuring thread safety and clear intent.

bigframes/display/anywidget.py

		def_get_page_data(self)->pd.DataFrame:
		"""Get the data for the current page, handling unknown row count."""
		# This loop is to handle auto-correction of page number when row count is unknown
		whileTrue:

Copy link

Collaborator

tswastDec 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

This looks like it's intended for retries, but this is a very unsafe way to do that, if is the intention.

Also, I don't understand how this change relates to adding columns for mutiindex.

Copy link

ContributorAuthor

shuoweilDec 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

No, it is not related to multiindex display. I mix so many features into the same pr. I have refactor them out. And move this one to a separate new branch.