bigframes.pandas.DataFrame.to_pandas_batches#

DataFrame.to_pandas_batches(page_size: int | None = None, max_results: int | None = None, *, allow_large_results: bool | None = None) → Iterable[DataFrame][source]#

Stream DataFrame results as an iterable of pandas DataFrames.

page_size and max_results determine the size and number of batches; see https://cloud.google.com/python/docs/reference/bigquery/latest/google.cloud.bigquery.job.QueryJob#google_cloud_bigquery_job_QueryJob_result

Examples:

>>> import bigframes.pandas as bpd
>>> df = bpd.DataFrame({'col': [4, 3, 2, 2, 3]})

Iterate through the results in batches, limiting the total rows yielded across all batches via max_results:

>>> for df_batch in df.to_pandas_batches(max_results=3):
...     print(df_batch)
   col
0    4
1    3
2    2

Alternatively, control the approximate size of each batch using page_size and fetch batches manually using next():

>>> it = df.to_pandas_batches(page_size=2)
>>> next(it)
   col
0    4
1    3
>>> next(it)
   col
2    2
3    2
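
Because each yielded batch is an ordinary pandas DataFrame, results can also be reduced incrementally without materializing the full table at once. A minimal sketch using the small frame above (the running-total pattern is illustrative, not part of the API):

>>> total = 0
>>> for df_batch in df.to_pandas_batches(page_size=2):
...     total += int(df_batch['col'].sum())
>>> total
14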
Parameters:
  • page_size (int, default None) – The maximum number of rows of each batch. Non-positive values are ignored.

  • max_results (int, default None) – The maximum total number of rows of all batches.

  • allow_large_results (bool, default None) – If not None, overrides the global setting to allow or disallow large query results over the default size limit of 10 GB (see the sketch after this list).
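
When large results are disallowed by the global setting but a particular call is expected to exceed the 10 GB limit, the override can be passed per call. A minimal sketch, assuming a large DataFrame named big_df already exists in the session (the name and page size are illustrative):

>>> it = big_df.to_pandas_batches(page_size=50_000, allow_large_results=True)
>>> first_batch = next(it)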

Returns:

An iterable of smaller dataframes which combine to form the original dataframe. Results stream from BigQuery; see https://cloud.google.com/python/docs/reference/bigquery/latest/google.cloud.bigquery.table.RowIterator#google_cloud_bigquery_table_RowIterator_to_arrow_iterable

Return type:

Iterable[pandas.DataFrame]