# How to create a GPU-enabled development environment with Coder using the Docker provider

## Description

I'm trying to create a Coder template that provides a GPU-accelerated development environment for machine learning work. I want to build a custom Docker image with CUDA support and make it accessible through Coder's web interface.

## Environment

- OS: Ubuntu 20.04
- GPU: NVIDIA RTX 4090
- Docker: GPU support verified working
- Coder: 2.24
- Terraform: 1.12.2
## GPU Environment Verification

I've confirmed that my Docker + GPU setup is working correctly:

```bash
docker run --gpus all --rm pytorch/manylinux-cuda118:latest nvidia-smi
```

This command successfully shows GPU information, confirming that:

- NVIDIA Docker runtime is properly configured
- GPU passthrough to containers works
- CUDA drivers are accessible from within containers
## Requirements

Create a Coder template that builds a custom Docker image with:

- NVIDIA CUDA 11.8 support
- Python development environment with PyTorch, Jupyter Lab, etc.
- GPU monitoring and resource tracking
- VS Code and JetBrains IDE integration
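For the "custom image" part, my rough plan (an untested sketch; the image name and build-context paths below are placeholders I made up) is to let Terraform build the image through the Docker provider's `docker_image` resource instead of pulling a prebuilt one:

```hcl
# Sketch: build a custom CUDA image from a local Dockerfile rather than
# pulling a prebuilt one. The name and context/dockerfile paths here are
# placeholders, not something I have working yet.
resource "docker_image" "ml_workspace" {
  name = "coder-ml-workspace:latest"

  build {
    context    = "./build"    # directory containing the Dockerfile
    dockerfile = "Dockerfile" # e.g. FROM nvidia/cuda:11.8.0-devel-ubuntu20.04
  }
}
```

The container resource would then reference this image (I believe via `docker_image.ml_workspace.image_id`) instead of a hard-coded image string.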
The workspace should:

- Have GPU access (`nvidia-smi` should work inside the workspace)
- Provide web access to VS Code
## Current Issues

I'm encountering several challenges:

- When I create a workspace from the template I built, the resulting container does not have GPU access (`nvidia-smi` does not work inside it).
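One thing I suspect (purely a guess on my part, not verified): the `device_requests` block in my template may not actually be part of the kreuzwerker/docker provider's `docker_container` schema. The provider does document a top-level `gpus` argument that maps to `docker run --gpus`; a minimal sketch of that approach, based on my reading of the provider docs:

```hcl
resource "docker_container" "workspace" {
  # ... all other arguments as in my template ...

  # Equivalent of `docker run --gpus all`. The provider docs suggest only
  # the value "all" is currently supported here.
  gpus = var.gpu_enabled ? "all" : null

  # Alternative/complementary path: use the nvidia runtime registered with
  # the host's Docker daemon, plus the NVIDIA_* environment variables.
  runtime = var.gpu_enabled ? "nvidia" : null
}
```

Is this the recommended way to request GPUs through the Docker provider, or is there a Coder-specific mechanism I'm missing?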
Are there any existing examples or community templates for GPU-enabled Coder workspaces that I could reference? Any help, examples, or guidance would be greatly appreciated!

## Template

```hcl
terraform {
  required_providers {
    coder = {
      source = "coder/coder"
    }
    docker = {
      source = "kreuzwerker/docker"
    }
  }
}

locals {
  username = data.coder_workspace_owner.me.name
}

variable "docker_socket" {
  default     = ""
  description = "(Optional) Docker socket URI"
  type        = string
}

variable "gpu_enabled" {
  default     = true
  description = "Enable GPU support for the workspace"
  type        = bool
}

variable "gpu_count" {
  default     = "all"
  description = "Number of GPUs to allocate (use 'all' for all GPUs, or specify device IDs like '0,1')"
  type        = string
}

provider "docker" {
  # Defaulting to null if the variable is an empty string lets us have an
  # optional variable without having to set our own default
  host = var.docker_socket != "" ? var.docker_socket : null
}

data "coder_provisioner" "me" {}
data "coder_workspace" "me" {}
data "coder_workspace_owner" "me" {}

resource "coder_agent" "main" {
  arch = data.coder_provisioner.me.arch
  os   = "linux"
  startup_script = <<-EOT
    set -e

    # Create coder user if it doesn't exist
    if ! id "coder" &>/dev/null; then
      useradd --create-home --shell=/bin/bash --groups=sudo coder
      echo "coder ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers.d/90-coder
    fi

    # Ensure coder user owns the home directory
    chown -R coder:coder /home/coder

    # Switch to coder user for the rest of the setup
    sudo -u coder bash << 'EOF'
    # Prepare user home with default files on first start.
    if [ ! -f ~/.init_done ]; then
      # Create basic shell configuration
      echo 'export PATH=$PATH:/usr/local/bin' >> ~/.bashrc
      echo 'alias ll="ls -la"' >> ~/.bashrc

      # Check GPU availability
      if command -v nvidia-smi &> /dev/null; then
        echo "GPU detected:"
        nvidia-smi
        echo 'export CUDA_VISIBLE_DEVICES=all' >> ~/.bashrc
      else
        echo "No GPU detected or nvidia-smi not available"
      fi

      # Install basic Python packages
      if command -v pip &> /dev/null; then
        pip install --user jupyter notebook ipython
      fi

      touch ~/.init_done
    fi
    EOF

    # Install basic development tools
    apt-get update
    apt-get install -y curl wget git vim nano htop tree sudo

    echo "Workspace setup completed!"
  EOT

  # These environment variables allow you to make Git commits right away after
  # creating a workspace. Note that they take precedence over configuration
  # defined in ~/.gitconfig!
  env = {
    GIT_AUTHOR_NAME     = coalesce(data.coder_workspace_owner.me.full_name, data.coder_workspace_owner.me.name)
    GIT_AUTHOR_EMAIL    = "${data.coder_workspace_owner.me.email}"
    GIT_COMMITTER_NAME  = coalesce(data.coder_workspace_owner.me.full_name, data.coder_workspace_owner.me.name)
    GIT_COMMITTER_EMAIL = "${data.coder_workspace_owner.me.email}"

    # GPU-related environment variables
    NVIDIA_VISIBLE_DEVICES = var.gpu_enabled ? "all" : ""
    CUDA_VISIBLE_DEVICES   = var.gpu_enabled ? "all" : ""
  }

  # The following metadata blocks are optional. They are used to display
  # information about your workspace in the dashboard.
  metadata {
    display_name = "CPU Usage"
    key          = "0_cpu_usage"
    script       = "coder stat cpu"
    interval     = 10
    timeout      = 1
  }

  metadata {
    display_name = "RAM Usage"
    key          = "1_ram_usage"
    script       = "coder stat mem"
    interval     = 10
    timeout      = 1
  }

  metadata {
    display_name = "GPU Usage"
    key          = "2_gpu_usage"
    script       = <<-EOT
      if command -v nvidia-smi &> /dev/null; then
        nvidia-smi --query-gpu=utilization.gpu --format=csv,noheader,nounits | head -1 | xargs printf "%s%%"
      else
        echo "No GPU"
      fi
    EOT
    interval     = 10
    timeout      = 1
  }

  metadata {
    display_name = "GPU Memory"
    key          = "3_gpu_memory"
    script       = <<-EOT
      if command -v nvidia-smi &> /dev/null; then
        nvidia-smi --query-gpu=memory.used,memory.total --format=csv,noheader,nounits | head -1 | awk '{printf "%.1f/%.1f GB", $1/1024, $2/1024}'
      else
        echo "No GPU"
      fi
    EOT
    interval     = 10
    timeout      = 1
  }

  metadata {
    display_name = "Home Disk"
    key          = "4_home_disk"
    script       = "coder stat disk --path $${HOME}"
    interval     = 60
    timeout      = 1
  }

  metadata {
    display_name = "CPU Usage (Host)"
    key          = "5_cpu_usage_host"
    script       = "coder stat cpu --host"
    interval     = 10
    timeout      = 1
  }

  metadata {
    display_name = "Memory Usage (Host)"
    key          = "6_mem_usage_host"
    script       = "coder stat mem --host"
    interval     = 10
    timeout      = 1
  }

  metadata {
    display_name = "Load Average (Host)"
    key          = "7_load_host"
    script       = <<-EOT
      echo "`cat /proc/loadavg | awk '{ print $1 }'` `nproc`" | awk '{ printf "%0.2f", $1/$2 }'
    EOT
    interval     = 60
    timeout      = 1
  }

  metadata {
    display_name = "Swap Usage (Host)"
    key          = "8_swap_host"
    script       = <<-EOT
      free -b | awk '/^Swap/ { printf("%.1f/%.1f", $3/1024.0/1024.0/1024.0, $2/1024.0/1024.0/1024.0) }'
    EOT
    interval     = 10
    timeout      = 1
  }
}

# See https://registry.coder.com/modules/coder/code-server
module "code-server" {
  count    = data.coder_workspace.me.start_count
  source   = "registry.coder.com/modules/code-server/coder"
  version  = "~> 1.0"
  agent_id = coder_agent.main.id
  order    = 1
}

# See https://registry.coder.com/modules/coder/jetbrains-gateway
module "jetbrains_gateway" {
  count   = data.coder_workspace.me.start_count
  source  = "registry.coder.com/modules/jetbrains-gateway/coder"
  version = "~> 1.0"

  # JetBrains IDEs to make available for the user to select
  jetbrains_ides = ["IU", "PS", "WS", "PY", "CL", "GO", "RM", "RD", "RR"]
  default        = "PY" # Default to PyCharm Professional, a good fit for GPU development

  # Default folder to open when starting a JetBrains IDE
  folder = "/home/coder"

  agent_id   = coder_agent.main.id
  agent_name = "main"
  order      = 2
}

resource "docker_volume" "home_volume" {
  name = "coder-${data.coder_workspace.me.id}-home"
  # Protect the volume from being deleted due to changes in attributes.
  lifecycle {
    ignore_changes = all
  }
  # Add labels in Docker to keep track of orphan resources.
  labels {
    label = "coder.owner"
    value = data.coder_workspace_owner.me.name
  }
  labels {
    label = "coder.owner_id"
    value = data.coder_workspace_owner.me.id
  }
  labels {
    label = "coder.workspace_id"
    value = data.coder_workspace.me.id
  }
  labels {
    label = "coder.workspace_name_at_creation"
    value = data.coder_workspace.me.name
  }
}

resource "docker_container" "workspace" {
  count = data.coder_workspace.me.start_count

  # Use a GPU-enabled PyTorch image
  image = "pytorch/manylinux-cuda118:latest"

  # Uses lower() to avoid Docker restriction on container names.
  name = "coder-${data.coder_workspace_owner.me.name}-${lower(data.coder_workspace.me.name)}"

  # Hostname makes the shell more user friendly: coder@my-workspace:~$
  hostname = data.coder_workspace.me.name

  # Use the docker gateway if the access URL is 127.0.0.1
  entrypoint = ["sh", "-c", replace(coder_agent.main.init_script, "/localhost|127\\.0\\.0\\.1/", "host.docker.internal")]

  env = [
    "CODER_AGENT_TOKEN=${coder_agent.main.token}",
    "NVIDIA_VISIBLE_DEVICES=${var.gpu_enabled ? var.gpu_count : ""}",
    "CUDA_VISIBLE_DEVICES=${var.gpu_enabled ? var.gpu_count : ""}",
    "NVIDIA_DRIVER_CAPABILITIES=compute,utility"
  ]

  # GPU configuration
  runtime = var.gpu_enabled ? "nvidia" : null

  # Configure GPU access if GPU support is enabled
  dynamic "device_requests" {
    for_each = var.gpu_enabled ? [1] : []
    content {
      driver       = "nvidia"
      count        = var.gpu_count == "all" ? -1 : null
      device_ids   = var.gpu_count != "all" ? split(",", var.gpu_count) : null
      capabilities = [["gpu"]]
    }
  }

  host {
    host = "host.docker.internal"
    ip   = "host-gateway"
  }

  volumes {
    container_path = "/home/coder"
    volume_name    = docker_volume.home_volume.name
    read_only      = false
  }

  # Increase shared memory size, useful for deep learning workloads
  shm_size = 2048

  # Add labels in Docker to keep track of orphan resources.
  labels {
    label = "coder.owner"
    value = data.coder_workspace_owner.me.name
  }
  labels {
    label = "coder.owner_id"
    value = data.coder_workspace_owner.me.id
  }
  labels {
    label = "coder.workspace_id"
    value = data.coder_workspace.me.id
  }
  labels {
    label = "coder.workspace_name"
    value = data.coder_workspace.me.name
  }
}

# Surface GPU status information in the dashboard
resource "coder_metadata" "workspace_info" {
  count       = data.coder_workspace.me.start_count
  resource_id = docker_container.workspace[0].id

  item {
    key   = "image"
    value = docker_container.workspace[0].image
  }
  item {
    key   = "gpu_enabled"
    value = var.gpu_enabled
  }
  item {
    key   = "gpu_config"
    value = var.gpu_enabled ? var.gpu_count : "disabled"
  }
}
```