Create a regression model with BigQuery DataFrames
Create a linear regression model on the body mass of penguins using the BigQuery DataFrames API.
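For intuition about the fit, score, and predict steps used in this sample, here is a minimal local sketch in plain Python. It is not how BigQuery DataFrames trains the model: it fits a single numeric feature by ordinary least squares, whereas the sample uses several features, including categorical ones. The helper names `fit_line` and `r_squared` are hypothetical, and the numbers are made up for illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys, slope, intercept):
    """Return the coefficient of determination (R^2) of the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Made-up values for illustration only (not real penguin measurements).
flipper_mm = [180.0, 185.0, 190.0, 195.0, 200.0]
body_mass_g = [3400.0, 3600.0, 3650.0, 3900.0, 3950.0]

# "fit": estimate the line from rows where the label is known.
slope, intercept = fit_line(flipper_mm, body_mass_g)

# "score": measure how well the line explains the training labels.
score = r_squared(flipper_mm, body_mass_g, slope, intercept)

# "predict": estimate the label for a row where it is missing.
predicted = slope * 192.0 + intercept
```

An R^2 close to 1 means the fitted line explains most of the variation in the label. Scoring on the same rows used for fitting, as both this sketch and the sample do, tends to give an optimistic value.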
Code sample
Python
Before trying this sample, follow the Python setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Python API reference documentation.
To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.
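As a quick local sanity check before running the sample, the sketch below looks for a file-based Application Default Credentials (ADC) source. It assumes the default gcloud configuration directory and checks only the `GOOGLE_APPLICATION_CREDENTIALS` environment variable and the well-known file written by `gcloud auth application-default login`. The function name `find_adc_file` is hypothetical, and credentials from an attached service account (for example on Compute Engine) are not detected.

```python
import os
from pathlib import Path


def find_adc_file():
    """Return the path of a local ADC credentials file, or None if none is found."""
    # 1. An explicit path set through the environment variable.
    explicit = os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit:
        path = Path(explicit)
        # Assumption of this sketch: a missing file is reported as "not found".
        return path if path.is_file() else None

    # 2. The well-known file in the default gcloud configuration directory.
    if os.name == "nt":
        base = Path(os.environ.get("APPDATA", "")) / "gcloud"
    else:
        base = Path.home() / ".config" / "gcloud"
    path = base / "application_default_credentials.json"
    return path if path.is_file() else None


if __name__ == "__main__":
    found = find_adc_file()
    print(f"ADC file: {found}" if found else "No local ADC file found")
```

If no file is found and you are working on a local machine, running `gcloud auth application-default login` is the usual way to create one.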
from bigframes.ml.linear_model import LinearRegression
import bigframes.pandas as bpd

# Load data from BigQuery
query_or_table = "bigquery-public-data.ml_datasets.penguins"
bq_df = bpd.read_gbq(query_or_table)

# Filter the data down to the Adelie Penguin species
adelie_data = bq_df[bq_df.species == "Adelie Penguin (Pygoscelis adeliae)"]

# Drop the species column
adelie_data = adelie_data.drop(columns=["species"])

# Drop rows with nulls to get training data
training_data = adelie_data.dropna()

# Specify your feature (or input) columns and the label (or output) column:
feature_columns = training_data[
    ["island", "culmen_length_mm", "culmen_depth_mm", "flipper_length_mm", "sex"]
]
label_columns = training_data[["body_mass_g"]]

# Rows with a missing body mass are the ones to predict
test_data = adelie_data[adelie_data.body_mass_g.isnull()]

# Create the linear model
model = LinearRegression()
model.fit(feature_columns, label_columns)

# Score the model
score = model.score(feature_columns, label_columns)

# Predict using the model
result = model.predict(test_data)

What's next
To search and filter code samples for other Google Cloud products, see the Google Cloud sample browser.