2.3.x release versions

| Component | 2.3.23-debian12/ -ubuntu22/ -ubuntu22-arm/ -ml-ubuntu22/ -rocky9 (2026/02/15) | 2.3.22-debian12/ -ubuntu22/ -ubuntu22-arm/ -ml-ubuntu22/ -rocky9 (2026/02/05) | 2.3.21-debian12/ -ubuntu22/ -ubuntu22-arm/ -ml-ubuntu22/ -rocky9 (2026/01/24) | 2.3.20-debian12/ -ubuntu22/ -ubuntu22-arm/ -ml-ubuntu22/ -rocky9 (2026/01/06) | 2.3.19-debian12/ -ubuntu22/ -ubuntu22-arm/ -ml-ubuntu22/ -rocky9 (2025/12/05) |
|---|---|---|---|---|---|
| Apache Atlas (initialization action) | 2.2.0 | 2.2.0 | 2.2.0 | 2.2.0 | 2.2.0 |
| Apache Flink (optional component) | 1.17.0 | 1.17.0 | 1.17.0 | 1.17.0 | 1.17.0 |
| Apache Hadoop (installed) | 3.3.6 | 3.3.6 | 3.3.6 | 3.3.6 | 3.3.6 |
| Apache Hive (installed) | 3.1.3 | 3.1.3 | 3.1.3 | 3.1.3 | 3.1.3 |
| Apache Hive WebHCat (optional component) | 3.1.3 | 3.1.3 | 3.1.3 | 3.1.3 | 3.1.3 |
| Apache Hudi (optional component) | 0.15.0 | 0.15.0 | 0.15.0 | 0.15.0 | 0.15.0 |
| Apache Iceberg (optional component) | 1.6.1 | 1.6.1 | 1.6.1 | 1.6.1 | 1.6.1 |
| Apache Kafka (initialization action) | 3.1.0 | 3.1.0 | 3.1.0 | 3.1.0 | 3.1.0 |
| Apache Pig (optional component) | 0.18.0-SNAPSHOT | 0.18.0-SNAPSHOT | 0.18.0-SNAPSHOT | 0.18.0-SNAPSHOT | 0.18.0-SNAPSHOT |
| Apache Spark (installed) | 3.5.3 | 3.5.3 | 3.5.3 | 3.5.3 | 3.5.3 |
| Apache Sqoop (initialization action) | 1.5.0-SNAPSHOT | 1.5.0-SNAPSHOT | 1.5.0-SNAPSHOT | 1.5.0-SNAPSHOT | 1.5.0-SNAPSHOT |
| Apache Tez (installed) | 0.10.2 | 0.10.2 | 0.10.2 | 0.10.2 | 0.10.2 |
| BigQuery Connector (installed) | 0.42.3 | 0.42.3 | 0.42.3 | 0.42.3 | 0.42.3 |
| Cloud Storage Connector (installed) | 3.1.10 | 3.1.10 | 3.1.9 | 3.1.9 | 3.1.6 |
| Conscrypt (installed) | 2.5.2 | 2.5.2 | 2.5.2 | 2.5.2 | 2.5.2 |
| Delta Lake (optional component) | 3.2.1 | 3.2.0 | 3.2.0 | 3.2.0 | 3.2.0 |
| Docker (optional component) | 28.1 | 28.1 | 28.1 | 28.1 | 28.1 |
| Hue (initialization action) | 4.11.0 | 4.11.0 | 4.11.0 | 4.11.0 | 4.11.0 |
| Java (installed) | 11 | 11 | 11 | 11 | 11 |
| JupyterLab Notebook (optional component) | 3.6 | 3.6 | 3.6 | 3.6 | 3.6 |
| Oozie (initialization action) | 5.2.1 | 5.2.1 | 5.2.1 | 5.2.1 | 5.2.1 |
| Python (installed) | micromamba 2.0.5 with Python 3.11 | micromamba 2.0.5 with Python 3.11 | micromamba 2.0.5 with Python 3.11 | micromamba 2.0.5 with Python 3.11 | micromamba 2.0.5 with Python 3.11 |
| R (installed) | R 4.3 | R 4.3 | R 4.3 | R 4.3 | R 4.3 |
| Ranger (optional component) | 2.4.0 | 2.4.0 | 2.4.0 | 2.4.0 | 2.4.0 |
| Scala (installed) | 2.12.18 | 2.12.18 | 2.12.18 | 2.12.18 | 2.12.18 |
| Solr (optional component) | 9.4.1 | 9.4.1 | 9.4.1 | 9.4.1 | 9.4.1 |
| Trino (optional component) | 432 | 432 | 432 | 432 | 432 |
| Zeppelin Notebook (optional component) | 0.10.1 | 0.10.1 | 0.10.1 | 0.10.1 | 0.10.1 |
| Zookeeper (optional component) | 3.9.3 | 3.9.3 | 3.9.3 | 3.9.3 | 3.9.3 |
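
The image versions in the table combine a sub-minor release number with an OS suffix, such as 2.3.22-ubuntu22-arm, and the notes on this page tie optional component support to that suffix. As a minimal sketch of how those rules could be checked in a script, the following Python splits a version string and returns the optional components that the notes list for it. The function and constant names are hypothetical and are not part of any Dataproc tooling; the component lists are copied from the notes on this page and apply to 2.3 images only.

```python
import re

# Optional components that the notes list for non-ARM 2.3 images.
NON_ARM_OPTIONAL = {
    "Apache Flink", "Apache Hive WebHCat", "Apache Hudi", "Apache Iceberg",
    "Apache Pig", "Delta Lake", "Docker", "JupyterLab Notebook", "Ranger",
    "Solr", "Trino", "Zeppelin notebook", "Zookeeper",
}

# Optional components that the notes list for 2.3.x-*-arm images.
# Zookeeper is pre-installed in high availability clusters and optional otherwise.
ARM_OPTIONAL = {"Apache Hive WebHCat", "Docker", "Zeppelin notebook", "Zookeeper"}

# Apache Pig is listed for ARM starting with 2.3.22-ubuntu22-arm.
PIG_ON_ARM_SINCE = (2, 3, 22)


def parse_image_version(image_version):
    """Split '2.3.22-ubuntu22-arm' into ((2, 3, 22), 'ubuntu22-arm')."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)-(.+)", image_version)
    if match is None:
        raise ValueError(f"unrecognized image version: {image_version!r}")
    major, minor, sub_minor, suffix = match.groups()
    return (int(major), int(minor), int(sub_minor)), suffix


def supported_optional_components(image_version):
    """Return the optional components the notes list for a 2.3 image version."""
    release, suffix = parse_image_version(image_version)
    if release[:2] != (2, 3):
        raise ValueError("this sketch only encodes the rules for 2.3 images")
    if not suffix.endswith("-arm"):
        return set(NON_ARM_OPTIONAL)
    supported = set(ARM_OPTIONAL)
    if release >= PIG_ON_ARM_SINCE:
        supported.add("Apache Pig")
    return supported
```

For example, `supported_optional_components("2.3.21-ubuntu22-arm")` omits Apache Pig, while the same call for `2.3.22-ubuntu22-arm` includes it.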

Notes:


  • The following optional components are supported in non-arm 2.3 images:

    • Apache Flink
    • Apache Hive WebHCat
    • Apache Hudi
    • Apache Iceberg
    • Apache Pig
    • Delta Lake
    • Docker
    • JupyterLab Notebook
    • Ranger
    • Solr
    • Trino
    • Zeppelin notebook
    • Zookeeper
  • 2.3.x-*-arm images support only the pre-installed components and the following optional components. The other 2.3 optional components and all initialization actions aren't supported:

    • Apache Hive WebHCat
    • Apache Pig (starting with 2.3.22-ubuntu22-arm)
    • Docker
    • Zeppelin notebook
    • Zookeeper (installed in high availability clusters; optional component in other clusters)
  • yarn.nodemanager.recovery.enabled and HDFS Audit Logging are enabled by default in 2.3 images.

  • micromamba, instead of conda in previous image versions, is installed as part of the Python installation.

  • Docker and Zeppelin installation issues:

    • Installation fails if the cluster has no public internet access. As a workaround, create a cluster that uses a custom image with optional components pre-installed. You can do this by running generate_custom_image.py with the --optional-components flag.
    • Installation can fail if the cluster is pinned to an older sub-minor image version: packages are installed on demand from public OSS repositories, and a package might not be available upstream to support the installation. As a workaround, create a cluster that uses a custom image with optional components pre-installed. To do this, run generate_custom_image.py with the --optional-components flag.
  • The default resource calculator for YARN has been changed from DefaultResourceCalculator to DominantResourceCalculator, which uses the dominant-resource concept to determine resource allocation, such as memory and CPU allocation. This change impacts Autoscaler, which scales based on the dominant resource usage of the cluster.
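
To illustrate the resource calculator change in the last note, the following Python sketch compares a memory-only calculation (how DefaultResourceCalculator behaves) with a calculation that also accounts for vcores (the idea behind DominantResourceCalculator). It is a simplified model with hypothetical node and container sizes, not YARN's actual implementation.

```python
def containers_memory_only(node_mem_mb, container_mem_mb):
    """Containers that fit on a node when only memory is considered."""
    return node_mem_mb // container_mem_mb


def containers_dominant(node_mem_mb, node_vcores, container_mem_mb, container_vcores):
    """Containers that fit when both memory and vcores must be satisfied."""
    by_memory = node_mem_mb // container_mem_mb
    by_vcores = node_vcores // container_vcores
    return min(by_memory, by_vcores)


def dominant_share(used_mem_mb, total_mem_mb, used_vcores, total_vcores):
    """Fraction of the cluster consumed by its most heavily used resource."""
    return max(used_mem_mb / total_mem_mb, used_vcores / total_vcores)


# Hypothetical node: 32 GiB of memory and 8 vcores.
# Hypothetical container request: 2 GiB of memory and 2 vcores.
print(containers_memory_only(32768, 2048))        # memory-only view
print(containers_dominant(32768, 8, 2048, 2))     # memory and vcores view
print(dominant_share(8192, 32768, 8, 8))          # CPU is the dominant resource
```

In this model the memory-only view admits 16 containers, while the dominant-resource view admits 4 because vcores run out first. A cluster in that state is 25% utilized by memory but fully utilized by its dominant resource, which is the kind of difference that can change scaling decisions.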

Image version 2.3 machine learning (ML) components

The Dataproc 2.3-ml-ubuntu image extends the 2.3 base image with ML-specific software. It supports 2.3 image optional components and other 2.3 features, and adds the component versions listed in the following sections.

GPU-specific libraries

For Dataproc jobs that use GPU VMs, the following NVIDIA driver and libraries are available in the 2.3-ml-ubuntu image. You can use them to accomplish the following tasks:

  • Accelerate Spark batch workloads with the NVIDIA Spark Rapids library
  • Train machine learning workloads
  • Run distributed batch inference using Spark

| Package Name | Version |
|---|---|
| Spark Rapids | 25.04.0 |
| NVIDIA Driver | Ubuntu 22.04 LTS Accelerated with NVIDIA driver version 570 |
| CUDA | 12.6.3 |
| cublas | 12.6.4 |
| cusolver | 11.7.1 |
| cupti | 12.6.80 |
| cusparse | 12.5.4 |
| cuDNN | 9.10.1 |
| NCCL | 2.27.5 |

XGBoost libraries

The following Maven package versions are available in the 2.3-ml-ubuntu image to let you use XGBoost with Spark in Java or Scala.

| Group ID | Package Name | Version |
|---|---|---|
| ml.dmlc | xgboost4j-gpu_2.12 | 2.1.1 |
| ml.dmlc | xgboost4j-spark-gpu_2.12 | 2.1.1 |

Note: You cannot use distributed Spark XGBoost on a Dataproc job that uses autoscaling (the default behavior), because new nodes started by elastic scaling cannot receive new tasks and remain idle. To use XGBoost with a batch workload, set the spark.dynamicAllocation.enabled=false property on the Dataproc job to disable dynamic allocation.
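
As a minimal sketch of the setting in the note above, the following Python builds a Spark property map that always disables dynamic allocation and formats it as a comma-separated `--properties` value of the kind that `gcloud dataproc jobs submit spark` accepts. The helper names and the extra property in the example are hypothetical; adapt them to however you submit jobs.

```python
def xgboost_job_properties(extra=None):
    """Return Spark properties for an XGBoost job with dynamic allocation off."""
    props = dict(extra or {})
    # Applied last so that caller-supplied properties cannot re-enable it.
    props["spark.dynamicAllocation.enabled"] = "false"
    return props


def to_properties_flag(props):
    """Format properties as a single --properties=key=value,key=value argument."""
    pairs = ",".join(f"{key}={value}" for key, value in sorted(props.items()))
    return f"--properties={pairs}"


# Hypothetical extra property: a fixed executor count for the job.
flag = to_properties_flag(xgboost_job_properties({"spark.executor.instances": "4"}))
print(flag)
```

With dynamic allocation disabled, the executor count stays fixed for the life of the job, so it is worth sizing it explicitly, as the example does.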

Python libraries

The 2.3-ml-ubuntu image contains the following libraries, which support different stages in the ML lifecycle.

`2.3-ml-ubuntu` image Python libraries

| Package | Version |
|---|---|
| accelerate | 1.8.1 |
| conda | 23.11.0 |
| cookiecutter | 2.5.0 |
| curl | 8.12.1 |
| cython | 3.0.12 |
| dask | 2023.12.1 |
| datasets | 3.6.0 |
| deepspeed | 0.17.2 |
| delta-spark | 3.2.0 |
| evaluate | 0.4.5 |
| fastavro | 1.9.7 |
| fastparquet | 2023.10.1 |
| fiona | 1.10.0 |
| gateway-provisioners[yarn] | 0.4.0 |
| gcsfs | 2023.12.2.post1 |
| google-auth-oauthlib | 1.2.2 |
| google-cloud-aiplatform | 1.88.0 |
| google-cloud-bigquery[pandas] | 3.31.0 |
| google-cloud-bigquery-storage | 2.30.0 |
| google-cloud-bigtable | 2.30.1 |
| google-cloud-container | 2.56.1 |
| google-cloud-datacatalog | 3.26.1 |
| google-cloud-dataproc | 5.18.1 |
| google-cloud-datastore | 2.21.0 |
| google-cloud-language | 2.17.2 |
| google-cloud-logging | 3.11.4 |
| google-cloud-monitoring | 2.27.2 |
| google-cloud-pubsub | 2.29.1 |
| google-cloud-redis | 2.18.1 |
| google-cloud-spanner | 3.53.0 |
| google-cloud-speech | 2.32.0 |
| google-cloud-storage | 2.19.0 |
| google-cloud-texttospeech | 2.25.1 |
| google-cloud-translate | 3.20.3 |
| google-cloud-vision | 3.10.2 |
| huggingface_hub | 0.33.1 |
| httplib2 | 0.22.0 |
| ipyparallel | 8.6.1 |
| ipython-sql | 0.3.9 |
| ipywidgets | 8.1.7 |
| jupyter_contrib_nbextensions | 0.7.0 |
| jupyter_http_over_ws | 0.0.8 |
| jupyter_kernel_gateway | 2.5.2 |
| jupyter_server | 1.24.0 |
| jupyterhub | 4.1.6 |
| jupyterlab | 3.6.8 |
| jupyterlab-git | 0.44.0 |
| jupyterlab_widgets | 3.0.15 |
| koalas | 0.22.0 |
| langchain | 0.3.26 |
| lightgbm | 4.6.0 |
| markdown | 3.5.2 |
| matplotlib | 3.8.4 |
| mlflow | 3.1.1 |
| nbconvert | 7.14.2 |
| nbdime | 3.2.1 |
| nltk | 3.9.1 |
| notebook | 6.5.7 |
| numba | 0.58.1 |
| numpy | 1.26.4 |
| oauth2client | 4.1.3 |
| onnx | 1.17.0 |
| openblas | 0.3.25 |
| opencv | 4.11.0 |
| orc | 2.1.1 |
| pandas | 2.1.4 |
| pandas-profiling | 3.0.0 |
| papermill | 2.4.0 |
| pyarrow | 16.1.0 |
| pydot | 2.0.0 |
| pyhive | 0.7.0 |
| pynvml | 12.0.0 |
| pysal | 23.7 |
| pytables | 3.9.2 |
| python | 3.11 |
| regex | 2023.12.25 |
| requests | 2.32.2 |
| requests-kerberos | 0.12.0 |
| rtree | 1.1.0 |
| scikit-image | 0.22.0 |
| scikit-learn | 1.5.2 |
| scipy | 1.11.4 |
| seaborn | 0.13.2 |
| sentence-transformers | 5.0.0 |
| setuptools | 79.0.1 |
| shap | 0.48.0 |
| shapely | 2.1.1 |
| spacy | 3.8.7 |
| spark-tensorflow-distributor | 1.0.0 |
| spyder | 5.5.6 |
| sqlalchemy | 2.0.41 |
| sympy | 1.13.3 |
| tensorflow | 2.18.0 |
| tokenizers | 0.21.4.dev0 |
| toree | 0.5.0 |
| torch | 2.6.0 |
| torch-model-archiver | 0.11.1 |
| torcheval | 0.0.7 |
| tornado | 6.4.2 |
| torchvision | 0.21.0 |
| traitlets | 5.14.3 |
| transformers | 4.53.1 |
| uritemplate | 4.1.1 |
| virtualenv | 20.26.6 |
| wordcloud | 1.9.4 |
| xgboost | 2.1.4 |
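
If a job depends on specific versions from the list above, one way to confirm them on a running cluster is to compare against the installed package metadata. The following is a minimal sketch that uses Python's standard `importlib.metadata` module. The helper names are hypothetical, and the `EXPECTED` map is a small subset of the list chosen for illustration. Some entries in the list might not be installed as Python distributions, in which case this lookup reports them as missing.

```python
from importlib import metadata

# A subset of the versions listed for the 2.3-ml-ubuntu image.
EXPECTED = {
    "numpy": "1.26.4",
    "pandas": "2.1.4",
    "torch": "2.6.0",
    "xgboost": "2.1.4",
}


def installed_version(package):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Map each package whose installed version differs to (expected, installed)."""
    mismatches = {}
    for package, want in expected.items():
        have = lookup(package)
        if have != want:
            mismatches[package] = (want, have)
    return mismatches
```

Calling `find_mismatches(EXPECTED)` on a cluster node returns an empty dictionary when every listed package matches. The `lookup` parameter exists only so the comparison can be exercised without the packages installed.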

R libraries

The following R library versions are included in the 2.3-ml-ubuntu image.

`2.3-ml-ubuntu` image R libraries

| Package Name | Version |
|---|---|
| r-ggplot2 | 3.4.4 |
| r-irkernel | 1.3.2 |
| r-rcurl | 1.98-1.16 |
| r-recommended | 4.3 |

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2026-02-19 UTC.