
Back2Basics: Running Workloads on Amazon EKS
Overview
Welcome back to the Back2Basics series! In this part, we'll explore how Karpenter, a just-in-time node provisioner, automatically manages nodes based on your workload needs. We'll also walk you through deploying a voting application to showcase this functionality in action.
If you haven't read the first part, you can check it out here:


Back2Basics: Setting Up an Amazon EKS Cluster
Infrastructure Setup
In the previous post, we covered the fundamentals of cluster provisioning using OpenTofu and simple workload deployment. Now, we will enable additional addons, including Karpenter for automatic node provisioning based on workload needs.
First, we need to uncomment these lines in `03_eks.tf` to create taints on the nodes managed by the initial node group.
```hcl
# Uncomment this if you will use Karpenter
# taints = {
#   init = {
#     key    = "node"
#     value  = "initial"
#     effect = "NO_SCHEDULE"
#   }
# }
```
Taints ensure that only pods configured to tolerate these taints can be scheduled on those nodes. This allows us to reserve the initial nodes for specific purposes while Karpenter provisions additional nodes for other workloads.
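For instance, a pod that must run on those tainted initial nodes would need a matching toleration. Here's a minimal pod-spec fragment, illustrative only, that simply mirrors the taint defined above (note that the EKS API spells the effect `NO_SCHEDULE`, while the Kubernetes pod spec uses `NoSchedule`):

```yaml
# Fragment of a pod spec - tolerates the "node=initial:NoSchedule" taint above
tolerations:
  - key: node
    operator: Equal
    value: initial
    effect: NoSchedule
```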
We also need to uncomment the code in `04_karpenter` and `05_addons` to activate Karpenter and provision the other addons.
Once updated, run `tofu init`, `tofu plan` and `tofu apply`. When prompted to confirm, type `yes` to proceed with provisioning the additional resources.
Karpenter
Karpenter is an open-source project that automates node provisioning in Kubernetes clusters. By integrating with EKS, Karpenter dynamically scales the cluster by adding new nodes when workloads require additional resources and removing idle nodes to optimize costs. The Karpenter configuration defines different node classes and pools for specific workload types, ensuring efficient resource allocation. Read more: https://karpenter.sh/docs/
The template `04_karpenter` defines several node classes and pools categorized by workload type. These include:

- `critical-workloads`: for running essential cluster addons
- `monitoring`: dedicated to Grafana and other monitoring tools
- `vote-app`: for the voting application we'll be deploying
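To give a feel for what such a definition contains, here is a hypothetical sketch of a NodePool for the `vote-app` pool using the Karpenter v1beta1 API. The node class name, labels, and CPU limit are assumptions for illustration; the actual configuration lives in `04_karpenter` and may differ:

```yaml
# Hypothetical NodePool sketch - not the exact contents of 04_karpenter
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: vote-app
spec:
  template:
    metadata:
      labels:
        app: vote-app                    # matched by the workloads' nodeSelector later on
    spec:
      nodeClassRef:
        apiVersion: karpenter.k8s.aws/v1beta1
        kind: EC2NodeClass
        name: default                    # assumed EC2NodeClass name
      taints:
        - key: app
          value: vote-app
          effect: NoSchedule             # only pods tolerating this taint land on these nodes
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot"]               # the logs later show a spot instance being launched
  limits:
    cpu: "16"                            # assumed cap on how large this pool can grow
```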
Workload Setup
The voting application consists of several components: `vote`, `result`, `worker`, `redis`, and `postgresql`. While we'll deploy everything on Kubernetes for simplicity, you can leverage managed services like Amazon ElastiCache for Redis and Amazon RDS in a production environment.
| Component | Description |
|---|---|
| Vote | Handles receiving and processing votes. |
| Result | Provides real-time visualizations of the current voting results. |
| Worker | Synchronizes votes between Redis and PostgreSQL. |
| Redis | Stores votes temporarily, easing the load on PostgreSQL. |
| PostgreSQL | Stores all votes permanently for secure and reliable data access. |
Here's the Voting App UI for both voting and results.
Deployment Using Kubernetes Manifest
If you explore the `workloads/manifest` directory, you'll find separate YAML files for each workload. Let's take a closer look at the components used for stateful applications like `postgres` and `redis`:
```yaml
apiVersion: v1
kind: Secret
...
---
apiVersion: v1
kind: PersistentVolumeClaim
...
---
apiVersion: apps/v1
kind: StatefulSet
...
---
apiVersion: v1
kind: Service
...
```
As you can see, a `Secret`, `PersistentVolumeClaim`, `StatefulSet` and `Service` are used for both `postgres` and `redis`. Let's take a quick review of these API objects:

- `Secret` - used to store and manage sensitive information such as passwords, tokens, and keys.
- `PersistentVolumeClaim` - a request for storage, used to provision persistent storage dynamically.
- `StatefulSet` - manages stateful applications with guarantees about the ordering and uniqueness of pods.
- `Service` - used for exposing an application that is running as one or more pods in the cluster.
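To illustrate how these objects fit together, here is a trimmed-down sketch of a Redis StatefulSet that reads its password from the Secret and mounts the PersistentVolumeClaim. The object names, image, and secret key are hypothetical; refer to `workloads/manifest` for the real definitions:

```yaml
# Illustrative sketch - names and image are assumptions, not the actual redis manifest
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis
spec:
  serviceName: redis                     # the Service that gives the pod a stable network identity
  replicas: 1
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
        - name: redis
          image: redis:7-alpine
          ports:
            - containerPort: 6379
          env:
            - name: REDIS_PASSWORD       # hypothetical variable read from the Secret
              valueFrom:
                secretKeyRef:
                  name: redis-secret     # hypothetical Secret name
                  key: password
          volumeMounts:
            - name: data
              mountPath: /data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: redis-data        # hypothetical PVC name
```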
Now, let's view `vote-app.yaml`, `results-app.yaml` and `worker.yaml`:
```yaml
apiVersion: v1
kind: ConfigMap
...
---
apiVersion: apps/v1
kind: Deployment
...
---
apiVersion: v1
kind: Service
...
```
Similar to `postgres` and `redis`, we use a `Service` for these stateless workloads. We also introduce two new objects: `ConfigMap` and `Deployment`.

- `ConfigMap` - stores non-confidential configuration data in key-value pairs, decoupling configuration from code.
- `Deployment` - provides declarative updates for pods and ReplicaSets, typically used for stateless workloads.
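As a quick illustration of how the two work together, the sketch below shows a Deployment that loads its settings from a ConfigMap via `envFrom`. The names, keys, and image are placeholders, not the actual vote-app manifest:

```yaml
# Illustrative sketch - placeholder names and image
apiVersion: v1
kind: ConfigMap
metadata:
  name: vote-app-config
data:
  OPTION_A: "Cats"                       # hypothetical voting options
  OPTION_B: "Dogs"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vote-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vote-app
  template:
    metadata:
      labels:
        app: vote-app
    spec:
      containers:
        - name: vote
          image: example/vote-app:latest # placeholder image
          envFrom:
            - configMapRef:
                name: vote-app-config    # injects all keys above as environment variables
          ports:
            - containerPort: 80
```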
And lastly, the `ingress.yaml`. To make our services accessible from outside the cluster, we'll use an `Ingress`. This API object manages external access to the services in a cluster, typically over HTTP/S.
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
...
```
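For reference, an Ingress handled by the AWS Load Balancer Controller might look roughly like the sketch below. The class name and annotations are taken from the Helm values shown later in this post; the backend service name and port are assumptions rather than the exact contents of `ingress.yaml`:

```yaml
# Illustrative sketch - backend name/port are assumptions
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: vote-app
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing   # provision an internet-facing ALB
    alb.ingress.kubernetes.io/target-type: instance     # register nodes (NodePort) as targets
spec:
  ingressClassName: alb
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: vote-app           # hypothetical Service name
                port:
                  number: 80
```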
Now that we've examined the manifest files, let's deploy them to the cluster. You can use the following command to apply all YAML files within the `workloads/manifest/` directory:
kubectl apply -f workloads/manifest/
For more granular control, you can apply each YAML file individually. To clean up the deployment later, simply run `kubectl delete -f workloads/manifest/`.
While manifest files are a common approach, there are alternative tools for deployment management:

- Kustomize: allows customizing raw YAML files for various purposes without modifying the original files.
- Helm: a popular package manager for Kubernetes applications. Helm charts provide a structured way to define, install, and upgrade even complex applications within the cluster.
Deployment Using Kustomize
Let's check Kustomize first. If you haven't installed its binary, you can refer to the Kustomize Installation Docs. This example uses an overlay file to make specific changes to the default configuration. To apply the built kustomization, run:
kustomize build .\workloads\kustomize\overlays\dev\ | kubectl apply -f -
Here's what we've modified:

- Added an annotation: `note: "Back2Basics: A Series"`.
- Set the replicas for both the `vote` and `result` deployments to 3 (see the overlay sketch after this list).
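An overlay that does this could look roughly like the sketch below; the base path is an assumption, and the real file sits under `workloads/kustomize/overlays/dev`:

```yaml
# Illustrative kustomization.yaml sketch - the base path is an assumption
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base                           # hypothetical location of the base manifests
commonAnnotations:
  note: "Back2Basics: A Series"          # added to every resource
replicas:
  - name: vote-app
    count: 3
  - name: result-app
    count: 3
```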
To verify, you can refer to the commands below:
```
D:\> kubectl get pod -o custom-columns=NAME:.metadata.name,ANNOTATIONS:.metadata.annotations
NAME                           ANNOTATIONS
postgres-0                     map[note:Back2Basics: A Series]
redis-0                        map[note:Back2Basics: A Series]
result-app-6c9dd6d458-8hxkf    map[note:Back2Basics: A Series]
result-app-6c9dd6d458-l4hp9    map[note:Back2Basics: A Series]
result-app-6c9dd6d458-r5srd    map[note:Back2Basics: A Series]
vote-app-cfd5fc88-lsbzx        map[note:Back2Basics: A Series]
vote-app-cfd5fc88-mdblb        map[note:Back2Basics: A Series]
vote-app-cfd5fc88-wz5ch        map[note:Back2Basics: A Series]
worker-bf57ddcb8-kkk79         map[note:Back2Basics: A Series]

D:\> kubectl get deploy
NAME         READY   UP-TO-DATE   AVAILABLE   AGE
result-app   3/3     3            3           5m
vote-app     3/3     3            3           5m
worker       1/1     1            1           5m
```
To remove all the resources we created, run the following command:
kustomize build .\workloads\kustomize\overlays\dev\ | kubectl delete -f -
Deployment Using Helm Chart
Next up is Helm. If you haven't installed the Helm binary, you can refer to the Helm Installation Docs. Once installed, let's add the chart repository and update it:
```bash
helm repo add thecloudspark https://thecloudspark.github.io/helm-charts
helm repo update
```
Next, create a `values.yaml` and add some overrides to the default configuration. You can also use the existing config in `workloads/helm/values.yaml`. This is what it looks like:
```yaml
ingress:
  enabled: true
  className: alb
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: instance

# Vote Handler Config
vote:
  tolerations:
    - key: app
      operator: Equal
      value: vote-app
      effect: NoSchedule
  nodeSelector:
    app: vote-app
  service:
    type: NodePort

# Results Handler Config
result:
  tolerations:
    - key: app
      operator: Equal
      value: vote-app
      effect: NoSchedule
  nodeSelector:
    app: vote-app
  service:
    type: NodePort

# Worker Handler Config
worker:
  tolerations:
    - key: app
      operator: Equal
      value: vote-app
      effect: NoSchedule
  nodeSelector:
    app: vote-app
```
As you can see, we added a `nodeSelector` and `tolerations` to make sure the pods are scheduled on the dedicated nodes where we want them to run. This Helm chart offers various configuration options, and you can explore them in more detail on ArtifactHub: Vote App.
Now install the chart and apply the overrides from `values.yaml`:
```bash
# Install
helm install app -f workloads/helm/values.yaml thecloudspark/vote-app

# Upgrade
helm upgrade app -f workloads/helm/values.yaml thecloudspark/vote-app
```
Wait for the pods to be up and running, then access the UI using the provisioned application load balancer.
To uninstall, just run the command below:
helm uninstall app
Going back to Karpenter
Under the hood, Karpenter provisioned the nodes used by the voting app we've deployed. The sample logs below provide insight into its activities:
{"level":"INFO","time":"2024-06-16T10:15:38.739Z","logger":"controller.provisioner","message":"found provisionable pod(s)","commit":"fb4d75f","pods":"default/result-app-6c9dd6d458-l4hp9, default/worker-bf57ddcb8-kkk79, default/vote-app-cfd5fc88-lsbzx","duration":"153.662007ms"}{"level":"INFO","time":"2024-06-16T10:15:38.739Z","logger":"controller.provisioner","message":"computed new nodeclaim(s) to fit pod(s)","commit":"fb4d75f","nodeclaims":1,"pods":3}{"level":"INFO","time":"2024-06-16T10:15:38.753Z","logger":"controller.provisioner","message":"created nodeclaim","commit":"fb4d75f","nodepool":"vote-app","nodeclaim":"vote-app-r9z7s","requests":{"cpu":"510m","memory":"420Mi","pods":"8"},"instance-types":"m5.2xlarge, m5.4xlarge, m5.large, m5.xlarge, m5a.2xlarge and 55 other(s)"}{"level":"INFO","time":"2024-06-16T10:15:41.894Z","logger":"controller.nodeclaim.lifecycle","message":"launched nodeclaim","commit":"fb4d75f","nodeclaim":"vote-app-r9z7s","provider-id":"aws:///ap-southeast-1b/i-028457815289a8470","instance-type":"t3.small","zone":"ap-southeast-1b","capacity-type":"spot","allocatable":{"cpu":"1700m","ephemeral-storage":"14Gi","memory":"1594Mi","pods":"11"}}{"level":"INFO","time":"2024-06-16T10:16:08.946Z","logger":"controller.nodeclaim.lifecycle","message":"registered nodeclaim","commit":"fb4d75f","nodeclaim":"vote-app-r9z7s","provider-id":"aws:///ap-southeast-1b/i-028457815289a8470","node":"ip-10-0-206-99.ap-southeast-1.compute.internal"}{"level":"INFO","time":"2024-06-16T10:16:23.631Z","logger":"controller.nodeclaim.lifecycle","message":"initialized nodeclaim","commit":"fb4d75f","nodeclaim":"vote-app-r9z7s","provider-id":"aws:///ap-southeast-1b/i-028457815289a8470","node":"ip-10-0-206-99.ap-southeast-1.compute.internal","allocatable":{"cpu":"1700m","ephemeral-storage":"15021042452","hugepages-1Gi":"0","hugepages-2Mi":"0","memory":"1663292Ki","pods":"11"}}
As shown in the logs, when Karpenter found pods that needed to be scheduled, a new node claim was created, launched and initialized. Whenever there is a need for additional resources, this component is responsible for fulfilling it.
Additionally, Karpenter automatically labels the nodes it provisions with `karpenter.sh/initialized=true`. Let's use `kubectl` to list these nodes:
kubectl get nodes -l karpenter.sh/initialized=true
This command lists all nodes that carry this specific label. As you can see in the output below, three nodes have been provisioned by Karpenter:
```
NAME                                              STATUS   ROLES    AGE   VERSION
ip-10-0-208-50.ap-southeast-1.compute.internal    Ready    <none>   10m   v1.30.0-eks-036c24b
ip-10-0-220-238.ap-southeast-1.compute.internal   Ready    <none>   10m   v1.30.0-eks-036c24b
ip-10-0-206-99.ap-southeast-1.compute.internal    Ready    <none>   1m    v1.30.0-eks-036c24b
```
Lastly, let's check the related logs for node termination, the process of removing nodes from the cluster. Decommissioning typically involves tainting the node first to prevent further pod scheduling, followed by node deletion.
{"level":"INFO","time":"2024-06-16T10:35:39.165Z","logger":"controller.disruption","message":"disrupting via consolidation delete, terminating 1 nodes (0 pods) ip-10-0-206-99.ap-southeast-1.compute.internal/t3.small/spot","commit":"fb4d75f","command-id":"5e5489a6-a99d-4b8d-912c-df314a4b5cfa"}{"level":"INFO","time":"2024-06-16T10:35:39.483Z","logger":"controller.disruption.queue","message":"command succeeded","commit":"fb4d75f","command-id":"5e5489a6-a99d-4b8d-912c-df314a4b5cfa"}{"level":"INFO","time":"2024-06-16T10:35:39.511Z","logger":"controller.node.termination","message":"tainted node","commit":"fb4d75f","node":"ip-10-0-206-99.ap-southeast-1.compute.internal"}{"level":"INFO","time":"2024-06-16T10:35:39.530Z","logger":"controller.node.termination","message":"deleted node","commit":"fb4d75f","node":"ip-10-0-206-99.ap-southeast-1.compute.internal"}{"level":"INFO","time":"2024-06-16T10:35:39.989Z","logger":"controller.nodeclaim.termination","message":"deleted nodeclaim","commit":"fb4d75f","nodeclaim":"vote-app-r9z7s","node":"ip-10-0-206-99.ap-southeast-1.compute.internal","provider-id":"aws:///ap-southeast-1b/i-028457815289a8470"}
What's Next?
We've successfully deployed our voting application! And thanks to Karpenter, new nodes are added automatically when needed and terminated when not, making our setup more robust and cost-effective. In the final part of this series, we'll delve into monitoring the voting application we've deployed with Grafana and Prometheus, giving us visibility into resource utilization and application health.