Use Serverless for Apache Spark with managed notebooks
Vertex AI Workbench managed notebooks is deprecated. On April 14, 2025, support for managed notebooks ended and the ability to create managed notebooks instances was removed. Existing instances will continue to function until March 30, 2026, but patches, updates, and upgrades won't be available. To continue using Vertex AI Workbench, we recommend that you migrate your managed notebooks instances to Vertex AI Workbench instances.
Preview
This feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the Service Specific Terms. Pre-GA features are available "as is" and might have limited support. For more information, see the launch stage descriptions.
This page shows you how to run a notebook file on serverless Spark in a Vertex AI Workbench managed notebooks instance by using Google Cloud Serverless for Apache Spark.
Your managed notebooks instance can submit a notebook file's code to run on the Serverless for Apache Spark service. The service runs the code on a managed compute infrastructure that automatically scales resources as needed. Therefore, you don't need to provision and manage your own cluster.
Serverless for Apache Spark charges apply only to the time when the workload is executing.
Requirements
To run a notebook file on Serverless for Apache Spark, make sure that your environment meets the following requirements:
- Your Serverless for Apache Spark session must run in the same region as your managed notebooks instance.
- The Require OS Login (constraints/compute.requireOsLogin) constraint must not be enabled for your project. See Manage OS Login in an organization.
- To run a notebook file on Serverless for Apache Spark, you must provide a service account that has specific permissions. You can grant these permissions to the default service account or provide a custom service account. See the Permissions section of this page.
- Your Serverless for Apache Spark session uses a Virtual Private Cloud (VPC) network to execute workloads. The VPC subnetwork must meet specific requirements. See the requirements in Google Cloud Serverless for Apache Spark network configuration.
Permissions
To ensure that the service account has the necessary permissions to run a notebook file on Serverless for Apache Spark, ask your administrator to grant the service account the Dataproc Editor (roles/dataproc.editor) IAM role on your project.
Important: You must grant this role to the service account, not to your user account. Failure to grant the role to the correct principal might result in permission errors. For more information about granting roles, see Manage access to projects, folders, and organizations.
This predefined role contains the permissions required to run a notebook file on Serverless for Apache Spark. To see the exact permissions that are required, expand the Required permissions section:
Required permissions
The following permissions are required to run a notebook file on Serverless for Apache Spark:
- dataproc.agents.create
- dataproc.agents.delete
- dataproc.agents.get
- dataproc.agents.update
- dataproc.sessions.create
- dataproc.sessions.get
- dataproc.sessions.list
- dataproc.sessions.terminate
- dataproc.sessions.delete
- dataproc.tasks.lease
- dataproc.tasks.listInvalidatedLeases
- dataproc.tasks.reportStatus
Your administrator might also be able to give the service account these permissions with custom roles or other predefined roles.
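To verify coverage, you can compare the permissions the service account actually holds (retrieved elsewhere, for example with the IAM testIamPermissions API) against the list above. The following stdlib-only Python sketch shows the idea; the granted set here is a hypothetical example input, not real API output.

```python
# Permissions required to run a notebook file on Serverless for Apache Spark
# (the list from this page).
REQUIRED = {
    "dataproc.agents.create",
    "dataproc.agents.delete",
    "dataproc.agents.get",
    "dataproc.agents.update",
    "dataproc.sessions.create",
    "dataproc.sessions.get",
    "dataproc.sessions.list",
    "dataproc.sessions.terminate",
    "dataproc.sessions.delete",
    "dataproc.tasks.lease",
    "dataproc.tasks.listInvalidatedLeases",
    "dataproc.tasks.reportStatus",
}

# Hypothetical input: permissions the service account was found to hold,
# e.g. parsed from an IAM testIamPermissions response.
granted = REQUIRED | {"storage.objects.get"}

missing = sorted(REQUIRED - granted)
if missing:
    print("Grant these permissions to the service account:", missing)
else:
    print("The service account has all required Dataproc permissions.")
```

Granting the Dataproc Editor role, as described above, covers all of these at once; a check like this is mainly useful when an administrator builds a custom role instead.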
Before you begin
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
- In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
Note: If you don't plan to keep the resources that you create in this procedure, create a project instead of selecting an existing project. After you finish these steps, you can delete the project, removing all resources associated with the project.
Roles required to select or create a project
- Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
- Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.
- Verify that billing is enabled for your Google Cloud project.
- Enable the Notebooks, Vertex AI, and Dataproc APIs.
Roles required to enable APIs
To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.
- If you haven't already, configure a VPC network that meets the requirements listed in Google Cloud Serverless for Apache Spark network configuration.
Open JupyterLab
In the Google Cloud console, go to the Managed notebooks page.
Next to your managed notebooks instance's name, click Open JupyterLab.
Start a Serverless for Apache Spark session
To start a Serverless for Apache Spark session, complete the following steps.
In your managed notebooks instance's JupyterLab interface, select the Launcher tab, and then select Serverless Spark. If the Launcher tab is not open, select File > New Launcher to open it.
The Create Serverless Spark session dialog appears.
In the Session name field, enter a name for your session.
In the Execution configuration section, enter the Service account that you want to use. If you don't enter a service account, your session uses the Compute Engine default service account.
In the Network configuration section, select the Network and Subnetwork of a network that meets the requirements listed in Google Cloud Serverless for Apache Spark network configuration.
Click Create.
A new notebook file opens. The Serverless for Apache Spark session that you created is the kernel that runs your notebook file's code.
Run your code on Serverless for Apache Spark and other kernels
Add code to your new notebook file, and run the code.
To run code on a different kernel, change the kernel.
When you want to run the code on your Serverless for Apache Spark session again, change the kernel back to the Serverless for Apache Spark kernel.
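Because the session is exposed as an ordinary Jupyter kernel, any notebook cell works. The sketch below is a kernel-agnostic example: it assumes (as Dataproc Jupyter kernels typically do, though verify in your own session) that the Serverless for Apache Spark kernel pre-creates a SparkSession named spark, and it falls back to local computation on any other Python kernel.

```python
# Example notebook cell that runs unchanged on either kernel.
# Assumption: the Serverless for Apache Spark kernel provides a
# pre-created SparkSession in a variable named `spark`; on a plain
# Python kernel, no such variable exists.
values = list(range(1, 11))

spark = globals().get("spark")  # None on a non-Spark kernel
if spark is not None:
    # Spark kernel: distribute the computation across the serverless session.
    total = spark.sparkContext.parallelize(values).sum()
else:
    # Any other kernel: compute locally.
    total = sum(values)

print(total)  # 55 on either kernel
```

On the Spark kernel you can of course use spark directly; the fallback is only there so the same cell keeps running when you switch kernels as described above.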
Terminate your Serverless for Apache Spark session
You can terminate a Serverless for Apache Spark session in the JupyterLab interface or in the Google Cloud console. The code in your notebook file is preserved.
JupyterLab
In JupyterLab, close the notebook file that was created when you created your Serverless for Apache Spark session.
In the dialog that appears, click Terminate session.
Google Cloud console
In the Google Cloud console, go to the Dataproc sessions page.
Select the session that you want to terminate, and then click Terminate.
Delete your Serverless for Apache Spark session
You can delete a Serverless for Apache Spark session by using the Google Cloud console. The code in your notebook file is preserved.
In the Google Cloud console, go to the Dataproc sessions page.
Select the session that you want to delete, and then click Delete.
What's next
- Learn more about Serverless for Apache Spark.
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-12-15 UTC.