TensorFlow Release 23.02

The NVIDIA container image of TensorFlow, release 23.02, is available on NGC.

Contents of the TensorFlow container

This container image includes the complete source of the NVIDIA version of TensorFlow in /opt/tensorflow. It is prebuilt and installed as a system Python module.

To achieve optimum TensorFlow performance for image-based training, the container includes a sample script that demonstrates the efficient training of convolutional neural networks (CNNs). The sample script might need to be modified to fit your application.

The container also includes the following:

Driver Requirements

Release 23.02 is based on CUDA 12.0.1, which requires NVIDIA Driver release 525 or later. However, if you are running on a data center GPU (for example, T4 or any other data center GPU), you can use NVIDIA driver release 450.51 (or later R450), 470.57 (or later R470), 510.47 (or later R510), 515.65 (or later R515), or 525.85 (or later R525). The CUDA driver's compatibility package only supports particular drivers. Thus, users should upgrade from all R418, R440, R460, and R520 drivers, which are not forward-compatible with CUDA 12.0. For a complete list of supported drivers, see the CUDA Application Compatibility topic. For more information, see CUDA Compatibility and Upgrades.
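The driver rules above can be expressed as a small check. The following is a minimal sketch, not an official tool: the function name `driver_status` and the `datacenter` argument are hypothetical, and the version string is assumed to have the form `major.minor[.patch]`. Any R525 or later driver is treated as supported, and the older branches are accepted only for data center GPUs.

```shell
# driver_status VERSION [datacenter]
# Prints "supported" or "unsupported" for the 23.02 container.
# Hypothetical helper; assumes VERSION looks like 525.85.12.
driver_status() (
  version=$1
  gpu_class=${2:-other}
  major=${version%%.*}      # text before the first dot
  rest=${version#*.}        # text after the first dot
  minor=${rest%%.*}         # second version component

  # Minimum minor version for each branch that the notes list
  # as usable on data center GPUs through the compatibility package.
  min_minor=
  case $major in
    450) min_minor=51 ;;
    470) min_minor=57 ;;
    510) min_minor=47 ;;
    515) min_minor=65 ;;
  esac

  if [ "$major" -ge 525 ]; then
    # CUDA 12.0.1 requires driver release 525 or later.
    echo supported
  elif [ "$gpu_class" = datacenter ] && [ -n "$min_minor" ] \
       && [ "$minor" -ge "$min_minor" ]; then
    echo supported
  else
    # Includes R418, R440, R460, and R520, which are not forward-compatible.
    echo unsupported
  fi
)

# Example on a GPU host (left commented out because it needs nvidia-smi):
#   driver_status "$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)" datacenter
```

The check only mirrors the branches named in this section. Consult the CUDA compatibility documentation for the authoritative list.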

GPU Requirements

Release 23.02 supports CUDA compute capability 6.0 and later. This corresponds to GPUs in the NVIDIA Pascal, NVIDIA Volta™, NVIDIA Turing™, NVIDIA Ampere architecture, and NVIDIA Hopper™ architecture families. For a list of GPUs to which this compute capability corresponds, see CUDA GPUs. For additional support details, see Deep Learning Frameworks Support Matrix.
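As an illustration of the compute capability requirement, the sketch below compares a value such as `7.5` against the 6.0 minimum. The function name `cc_status` is hypothetical, and the commented query assumes that the installed `nvidia-smi` supports the `compute_cap` field, which older drivers might not.

```shell
# cc_status COMPUTE_CAPABILITY
# Prints "supported" when the capability is 6.0 or later.
# Hypothetical helper; assumes input of the form major.minor.
cc_status() (
  cc=$1
  major=${cc%%.*}           # 6.0 or later means the major part is at least 6
  if [ "$major" -ge 6 ]; then
    echo supported
  else
    echo unsupported
  fi
)

# Example on a GPU host (assumes nvidia-smi accepts the compute_cap field):
#   cc_status "$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n 1)"
```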

Key Features and Enhancements

This TensorFlow release includes the following key features and enhancements.

Announcements

  • Starting with the 22.05 release, the TensorFlow 1 and 2 containers are available for the Arm SBSA platform.

    For example, pulling the nvcr.io/nvidia/tensorflow:22.05-tf2-py3 Docker image on an Arm SBSA machine will automatically fetch the Arm-specific image.

  • Support for Slurm PMI2 has been removed from the 22.01 release.

    PMIX is supported by the container, but is not supported by default in Slurm. Users who depend on Slurm integration might need to configure Slurm for PMIX in the base OS as appropriate to their OS distribution (for Ubuntu 20.04, the required package is slurm-wlm-basic-plugins).
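To illustrate the Arm SBSA announcement above, the following sketch builds the image reference in the same form as the 22.05 example. The function name `ngc_tf_image` is hypothetical, and the `tf1` variant tag is an assumption based on the `tf2` tag shown above. The pull itself is left commented out because it requires Docker and registry access.

```shell
# ngc_tf_image RELEASE [tf1|tf2]
# Prints the image reference for a TensorFlow container release.
# Hypothetical helper; the tag layout follows the 22.05-tf2-py3 example.
ngc_tf_image() (
  release=$1
  variant=${2:-tf2}
  echo "nvcr.io/nvidia/tensorflow:${release}-${variant}-py3"
)

# Example (requires Docker and access to nvcr.io):
#   uname -m                                   # reports aarch64 on an Arm SBSA host
#   docker pull "$(ngc_tf_image 22.05 tf2)"    # the same tag is used on every platform
```

The same reference is used on x86 and Arm hosts. According to the announcement, the platform-specific image is selected automatically at pull time.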

NVIDIA TensorFlow Container Versions

The following table shows what versions of Ubuntu, CUDA, TensorFlow, and TensorRT are supported in each of the NVIDIA containers for TensorFlow. For older container versions, refer to the Frameworks Support Matrix.

Container Version | Ubuntu | CUDA Toolkit | TensorFlow | TensorRT
23.02 | 20.04 | NVIDIA CUDA 12.0.1 | 2.11.0, 1.15.5 | TensorRT 8.5.3
23.01 |  |  |  | TensorRT 8.5.2.2
22.12 |  | NVIDIA CUDA 11.8.0 | 2.10.1, 1.15.5 | TensorRT 8.5.1
22.11 |  |  | 2.10.0, 1.15.5 |
22.10 |  |  |  | TensorRT 8.5 EA
22.09 |  |  | 2.9.1, 1.15.5 |
22.08 |  | NVIDIA CUDA 11.7.1 |  | TensorRT 8.4.2.4
22.07 |  | NVIDIA CUDA 11.7 Update 1 Preview |  | TensorRT 8.4.1
22.06 |  |  |  | TensorRT 8.2.5
22.05 |  | NVIDIA CUDA 11.7.0 | 2.8.0, 1.15.5 |
22.04 |  | NVIDIA CUDA 11.6.2 |  | TensorRT 8.2.4.2
22.03 |  | NVIDIA CUDA 11.6.1 |  | TensorRT 8.2.3
22.02 |  | NVIDIA CUDA 11.6.0 | 2.7.0, 1.15.5 | TensorRT 8.2.3
22.01 |  |  |  | TensorRT 8.2.2
21.12 |  | NVIDIA CUDA 11.5.0 | 2.6.2, 1.15.5 | TensorRT 8.2.1.8
21.11 |  |  | 2.6.0, 1.15.5 | TensorRT 8.0.3.4 for x64 Linux; TensorRT 8.0.2.2 for Arm SBSA Linux
21.10 |  | NVIDIA CUDA 11.4.2 with cuBLAS 11.6.5.2 |  |
21.09 |  | NVIDIA CUDA 11.4.2 |  | TensorRT 8.0.3
21.08 |  | NVIDIA CUDA 11.4.1 | 2.5.0, 1.15.5 | TensorRT 8.0.1.6
21.07 |  | NVIDIA CUDA 11.4.0 |  |
21.06 |  | NVIDIA CUDA 11.3.1 | 2.5.0, 1.15.5 | TensorRT 7.2.3.4
21.05 |  | NVIDIA CUDA 11.3.0 | 2.4.0, 1.15.5 |
21.04 |  |  |  |
21.03 |  | NVIDIA CUDA 11.2.1 |  | TensorRT 7.2.2.3
21.02 |  | NVIDIA CUDA 11.2.0 |  | TensorRT 7.2.2.3+cuda11.1.0.024
20.12 |  | NVIDIA CUDA 11.1.1 | 2.3.1, 1.15.4 | TensorRT 7.2.2
20.11 | 18.04 | NVIDIA CUDA 11.1.0 |  | TensorRT 7.2.1
20.10 |  |  |  |
20.09 |  | NVIDIA CUDA 11.0.3 | 2.3.0, 1.15.3 | TensorRT 7.1.3
20.08 |  |  | 2.2.0, 1.15.3 |
20.07 |  | NVIDIA CUDA 11.0.194 |  |
20.06 |  | NVIDIA CUDA 11.0.167 | 2.2.0, 1.15.2 | TensorRT 7.1.2
20.03 |  |  |  |
20.02 |  | NVIDIA CUDA 10.2.89 | 2.1.0, 1.15.2 | TensorRT 7.0.0
20.01 |  |  | 2.0.0, 1.15.0 |
19.12 |  |  |  |
19.11 |  |  |  | TensorRT 6.0.1
19.10 |  | NVIDIA CUDA 10.1.243 | 1.14.0 |
19.09 |  |  |  |
19.08 |  |  |  | TensorRT 5.1.5

Tensor Core Examples

The tensor core examples provided in GitHub focus on achieving the best performance and convergence by using the latest deep learning example networks and model scripts for training.

Each example model trains with mixed precision using Tensor Cores on NVIDIA Volta, so you can get results much faster than training without Tensor Cores. Each model is tested against every monthly NGC container release to ensure consistent accuracy and performance over time.

Known Issues

  • The default set of Keras optimizers is not currently compatible with Horovod; see GitHub issues [1], [2]. Using the legacy optimizers (now available under tf.keras.optimizers.legacy) resolves the errors.
  • Some DLRM models may show a performance regression of 10-40%. The cause is under investigation.
  • A known performance regression of up to 50% affects some EfficientNet models. The regression is inherited from upstream TensorFlow and is still under investigation. It will be fixed in a subsequent release.
  • The TF-TRT native segment fallback has a known issue that causes a crash.

    This issue occurs when TF-TRT selects a subgraph of a model for conversion to TensorRT, but the TensorRT engine for that subgraph fails to build. Instead of falling back to native TensorFlow, TF-TRT will crash.

    To prevent the conversion of an OP that causes a native segment fallback, use export TF_TRT_OP_DENYLIST="ProblematicOp".

  • A known issue affects aarch64 libgomp, which might sometimes cause "cannot allocate memory in static TLS block" errors.

    The workaround is to run the following command:

    export LD_PRELOAD=/usr/lib/aarch64-linux-gnu/libgomp.so.1

  • IO-dominated CNN models, such as AlexNet and ResNet50, see a ~10% performance reduction on some platforms. The regression is under investigation and will be fixed in a future release.
  • In some configurations, the UNet3D model on A100 fails to initialize cuDNN due to an out-of-memory (OOM) error. This can be fixed by increasing the GPU memory carveout with the environment variable TF_DEVICE_MIN_SYS_MEMORY_IN_MB=2000.
  • A known issue in XLA can cause performance regressions of up to 55% when training certain models, such as EfficientNet, with XLA enabled. The root cause is under investigation and will be fixed in a future release.
  • On H100 NVLink systems using 2 GPUs for training, certain communication patterns can trigger a corner-case bug that manifests either as a hang or as an "illegal instruction" exception. A workaround for this case is to set the environment variable NCCL_PROTO=^LL128. This issue will be addressed in an upcoming release.
  • Within the TF1 container on T4 GPUs, the MaskRCNN model may fail with either low accuracy or an illegal memory access. The root cause is under investigation and will be fixed in a future release.
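Several of the workarounds listed above are environment variables that must be set before the training process starts. The following is a minimal sketch that collects them in one helper. The function name `apply_workaround` and its keys are hypothetical, while the variable names and values are the ones given in the list. Apply only the workaround that matches the issue you observe.

```shell
# apply_workaround KEY [VALUE]
# Exports the environment variable for one known issue in the current shell.
# Hypothetical helper; the keys are illustrative labels, not NVIDIA names.
apply_workaround() {
  case $1 in
    tftrt-denylist)
      # VALUE is the op to exclude from TF-TRT conversion, for example ProblematicOp.
      if [ -z "${2:-}" ]; then
        echo "tftrt-denylist needs an op name" >&2
        return 1
      fi
      export TF_TRT_OP_DENYLIST="$2"
      ;;
    aarch64-libgomp)
      # Avoids "cannot allocate memory in static TLS block" errors on aarch64.
      export LD_PRELOAD=/usr/lib/aarch64-linux-gnu/libgomp.so.1
      ;;
    unet3d-cudnn-oom)
      # Increases the GPU memory carveout for UNet3D on A100.
      export TF_DEVICE_MIN_SYS_MEMORY_IN_MB=2000
      ;;
    h100-nccl-ll128)
      # Disables the LL128 protocol on 2-GPU H100 NVLink systems.
      export NCCL_PROTO='^LL128'
      ;;
    *)
      echo "unknown workaround: $1" >&2
      return 1
      ;;
  esac
}

# Example: apply_workaround h100-nccl-ll128 && python train.py
# (train.py stands in for your own training script)
```

Because the function runs in the current shell, the exported variables are inherited by any training command launched afterward from the same shell.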

© Copyright 2025, NVIDIA. Last updated on Oct 29, 2025.