How Kubernetes can help AI/ML
Overview
Kubernetes can assist with AI/ML workloads by making code consistently reproducible, portable, and scalable across diverse environments.
The role of containers in AI/ML development
When building machine learning-enabled applications, the workflow is not linear: the stages of research, development, and production are in perpetual motion as teams practice continuous integration and continuous delivery (CI/CD). The process of building, testing, merging, and deploying new data, algorithms, and versions of an application creates many moving pieces that can be difficult to manage. That’s where containers come in.
Containers are a Linux technology that lets you package and isolate applications along with all the libraries and dependencies they need to run. Containers don’t require an entire operating system, only the exact components they need to operate, which makes them lightweight and portable. This eases deployment for operations teams and gives developers confidence that their applications will run the same way across different platforms and operating systems.
Another benefit of containers is that they help reduce friction between development and operations teams by separating areas of responsibility. When developers can focus on their apps and operations teams can focus on the infrastructure, integrating new code into an application as it grows and evolves throughout its lifecycle becomes smoother and more efficient.
What Kubernetes brings to AI/ML workloads
Kubernetes is an open source platform that automates Linux container operations, eliminating many of the manual processes involved in deploying and scaling containerized applications. Kubernetes is key to streamlining the machine learning lifecycle because it gives data scientists the agility, flexibility, portability, and scalability they need to train, test, and deploy ML models.
Scalability: Kubernetes allows users to scale ML workloads up or down, depending on demand. This ensures that machine learning pipelines can accommodate large-scale processing and training without interfering with other elements of the project.
Efficiency: Kubernetes optimizes resource allocation by scheduling workloads onto nodes based on their availability and capacity (see the sketch after this list). By ensuring that computing resources are used deliberately, users can expect lower costs and better performance.
Portability: Kubernetes provides a standardized, platform-agnostic environment that allows data scientists to develop one ML model and deploy it across multiple environments and cloud platforms. This means not having to worry about compatibility issues and vendor lock-in.
Fault tolerance: With built-in fault tolerance and self-healing capabilities, users can trust Kubernetes to keep ML pipelines running even in the event of a hardware or software failure.
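To make the efficiency point concrete, here is a minimal sketch of a training Deployment that declares explicit CPU, memory, and GPU requests so the scheduler can place it on a node with capacity. It assumes the official Kubernetes Python client and a configured kubeconfig; the image name, namespace, and resource figures are illustrative placeholders, not recommendations.

```python
# A minimal sketch, assuming the official Kubernetes Python client
# (pip install kubernetes). Image, namespace, and resource figures are
# illustrative placeholders.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a pod

container = client.V1Container(
    name="trainer",
    image="registry.example.com/ml/trainer:latest",  # hypothetical image
    resources=client.V1ResourceRequirements(
        # Requests tell the scheduler what the workload needs;
        # limits cap what it is allowed to consume.
        requests={"cpu": "2", "memory": "8Gi", "nvidia.com/gpu": "1"},
        limits={"cpu": "4", "memory": "16Gi", "nvidia.com/gpu": "1"},
    ),
)

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="ml-trainer"),
    spec=client.V1DeploymentSpec(
        replicas=1,  # scale up or down as demand changes
        selector=client.V1LabelSelector(match_labels={"app": "ml-trainer"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "ml-trainer"}),
            spec=client.V1PodSpec(containers=[container]),
        ),
    ),
)

client.AppsV1Api().create_namespaced_deployment(
    namespace="ml-team", body=deployment
)
```

Because the workload’s needs are declared up front, Kubernetes can pack workloads onto nodes efficiently and reschedule them onto healthy nodes if one fails, which is the same mechanism behind the fault tolerance described above.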
Deploying ML models on Kubernetes
The machine learning lifecycle is made up of many different elements that, if managed separately, would be time-consuming and resource-intensive to operate and maintain. With a Kubernetes architecture, organizations can automate portions of the ML lifecycle, removing the need for manual intervention and creating more efficiency.
Toolkits such as Kubeflow can help developers streamline and serve trained ML workloads on Kubernetes. Kubeflow addresses many of the challenges involved in orchestrating machine learning pipelines by providing a set of tools and APIs that simplify the process of training and deploying ML models at scale. Kubeflow also helps standardize and organize machine learning operations (MLOps).
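As an illustration of what a pipeline definition looks like, here is a minimal sketch using the Kubeflow Pipelines SDK (kfp v2). The component bodies and dataset path are placeholders standing in for real data preparation and training logic.

```python
# A minimal sketch, assuming the Kubeflow Pipelines SDK v2 (pip install kfp).
from kfp import compiler, dsl

@dsl.component(base_image="python:3.11")
def prepare_data(source: str) -> str:
    # Placeholder: fetch and clean the training data.
    print(f"preparing data from {source}")
    return source

@dsl.component(base_image="python:3.11")
def train_model(dataset: str) -> str:
    # Placeholder: train the model and return a model reference.
    print(f"training on {dataset}")
    return "model-v1"

@dsl.pipeline(name="demo-training-pipeline")
def training_pipeline(source: str = "s3://example-bucket/raw"):
    # Step outputs feed the next step; Kubeflow handles the orchestration.
    data_task = prepare_data(source=source)
    train_model(dataset=data_task.output)

# Compile to a pipeline definition that can be uploaded to a Kubeflow instance.
compiler.Compiler().compile(training_pipeline, "training_pipeline.yaml")
```

Each component runs as its own container on the cluster, so the scheduling, scaling, and fault-tolerance benefits described above apply to every step of the pipeline.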
How Red Hat can help
Kubernetes can help you streamline AI/ML workloads, but you still need a platform to experiment, serve models, and deliver your applications.
Red Hat® AI is our portfolio of AI products built on solutions our customers already trust. This foundation helps our products remain reliable, flexible, and scalable.
Red Hat AI can help organizations:
- Adopt and innovate with AI quickly.
- Break down the complexities of delivering AI solutions.
- Deploy anywhere.
Stay flexible while you scale
Red Hat AI includes Red Hat OpenShift AI: an integrated MLOps platform that can manage the lifecycle of both predictive and generative AI models.
This AI platform provides a space to build, train, deploy, and monitor AI/ML workloads in on-premises datacenters or closer to where data is located, making it easier to scale operations to the cloud or out to the edge when needed.
Layering Kubernetes with Red Hat AI allows your team to stay nimble when delivering AI applications across hybrid cloud environments.
Solution pattern: AI apps with Red Hat & NVIDIA AI Enterprise
Create a RAG application
Red Hat OpenShift AI is a platform for building data science projects and serving AI-enabled applications. You can integrate all the tools you need to support retrieval-augmented generation (RAG), a method for getting AI answers from your own reference documents. When you connect OpenShift AI with NVIDIA AI Enterprise, you can experiment with large language models (LLMs) to find the optimal model for your application.
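To show what that method looks like in practice, here is a minimal sketch of the retrieve-then-generate flow behind RAG. The embed(), vector_search(), and generate() functions are hypothetical stand-ins for your embedding model, vector database client, and deployed LLM, so the example runs on its own.

```python
# A minimal sketch of the RAG request flow. The three helpers below are
# hypothetical stand-ins so the example is self-contained.
from typing import List

def embed(text: str) -> List[float]:
    # Stand-in for a real embedding model call.
    return [float(ord(c)) for c in text[:8]]

def vector_search(query_vector: List[float], k: int) -> List[str]:
    # Stand-in for a real vector database query.
    return ["passage one ...", "passage two ..."][:k]

def generate(prompt: str) -> str:
    # Stand-in for a call to the deployed LLM.
    return f"(model response to a {len(prompt)}-character prompt)"

def answer_with_rag(question: str) -> str:
    # 1. Embed the question, 2. retrieve relevant passages,
    # 3. ground the LLM's answer in those passages.
    passages = vector_search(embed(question), k=4)
    context = "\n\n".join(passages)
    prompt = (
        "Answer using only the reference material below.\n\n"
        f"Reference material:\n{context}\n\nQuestion: {question}"
    )
    return generate(prompt)

print(answer_with_rag("How do I reset my password?"))
```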
Build a pipeline for documents
To make use of RAG, you first need to ingest your documents into a vector database. In our example app, we embed a set of product documents in a Redis database. Since these documents change frequently, we can create a pipeline for this process that we’ll run periodically, so we always have the latest versions of the documents.
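Here is a minimal sketch of that ingestion step, assuming a Redis instance with vector search support (for example, Redis Stack), the redis-py client, and numpy. The index name, embedding dimension, and sample documents are illustrative, and embed() stands in for a real embedding model.

```python
# A minimal ingestion sketch; assumes a Redis instance with vector search
# support (e.g. Redis Stack) reachable on localhost. Names and dimensions
# are illustrative.
import numpy as np
import redis
from redis.commands.search.field import TextField, VectorField
from redis.commands.search.indexDefinition import IndexDefinition, IndexType

def embed(text: str) -> np.ndarray:
    # Stand-in for a real embedding model; returns a fixed-size vector.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.random(384, dtype=np.float32)

r = redis.Redis(host="localhost", port=6379)

# Create the vector index once; rerunning raises an error if it exists.
r.ft("product_docs").create_index(
    fields=[
        TextField("content"),
        VectorField(
            "embedding", "FLAT",
            {"TYPE": "FLOAT32", "DIM": 384, "DISTANCE_METRIC": "COSINE"},
        ),
    ],
    definition=IndexDefinition(prefix=["doc:"], index_type=IndexType.HASH),
)

documents = ["How to install the product ...", "Troubleshooting guide ..."]
for i, text in enumerate(documents):
    r.hset(f"doc:{i}", mapping={
        "content": text,
        "embedding": embed(text).tobytes(),  # vectors stored as raw bytes
    })
```

Running a script like this as a scheduled pipeline keeps the vector database in sync with the latest document versions.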
Browse the LLM catalog
NVIDIA AI Enterprise gives you access to a catalog of different LLMs, so you can try different choices and select the model that delivers the best results. The models are hosted in the NVIDIA API catalog. Once you’ve set up an API token, you can deploy a model using the NVIDIA NIM model serving platform directly from OpenShift AI.
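Models deployed this way typically expose an OpenAI-compatible API, so switching between candidates is mostly a matter of changing the base URL and model name. Here is a minimal sketch using the openai Python client; the endpoint, model name, and environment variable are illustrative.

```python
# A minimal sketch, assuming an OpenAI-compatible endpoint (as NVIDIA NIM
# deployments typically provide) and the openai Python client. The base
# URL, model name, and environment variable are illustrative.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",  # or your NIM endpoint
    api_key=os.environ["NVIDIA_API_KEY"],            # token from the catalog
)

response = client.chat.completions.create(
    model="meta/llama-3.1-8b-instruct",  # swap in any candidate model
    messages=[{"role": "user", "content": "Summarize our returns policy."}],
    temperature=0.2,
)
print(response.choices[0].message.content)
```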
Choose the right model
As you test different LLMs, your users can rate each generated response. You can set up a Grafana monitoring dashboard to compare the ratings, as well as latency and response time for each model. Then you can use that data to choose the best LLM to use in production.
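One way to collect that data is to export per-model metrics that Prometheus can scrape and Grafana can chart. Here is a minimal sketch using the prometheus_client library; the metric names, port, and generate() stand-in are illustrative.

```python
# A minimal sketch of per-model quality and latency metrics, using the
# prometheus_client library. Metric names and the port are illustrative.
import time
from prometheus_client import Counter, Histogram, start_http_server

RESPONSE_LATENCY = Histogram(
    "llm_response_latency_seconds", "Time to generate a response", ["model"]
)
RESPONSE_RATINGS = Counter(
    "llm_response_ratings_total", "User ratings per model", ["model", "rating"]
)

def generate(model: str, prompt: str) -> str:
    # Stand-in for a call to the served model.
    time.sleep(0.05)
    return f"({model} response)"

def timed_generate(model: str, prompt: str) -> str:
    # Record how long each model takes to answer.
    start = time.perf_counter()
    answer = generate(model, prompt)
    RESPONSE_LATENCY.labels(model=model).observe(time.perf_counter() - start)
    return answer

def record_rating(model: str, thumbs_up: bool) -> None:
    # Record each user's thumbs-up/thumbs-down verdict.
    RESPONSE_RATINGS.labels(
        model=model, rating="up" if thumbs_up else "down"
    ).inc()

start_http_server(9100)  # Prometheus scrapes here; Grafana charts the series
```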
