Device Metrics Exporter exports metrics from AMD devices (GPUs) to collectors like Prometheus.

ROCm/device-metrics-exporter

AMD Device Metrics Exporter enables real-time collection of telemetry data in Prometheus format from AMD GPUs in HPC and AI environments. It provides comprehensive metrics including temperature, utilization, memory usage, power consumption, and more.

Quick Start

The Metrics Exporter container is available on Docker Hub:

docker run -d \
  --device=/dev/dri \
  --device=/dev/kfd \
  -p 5000:5000 \
  --name device-metrics-exporter \
  rocm/device-metrics-exporter:v1.0.0
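Once the container is running, a quick check that the exporter is answering can be useful. The sketch below assumes the metrics are served on the `/metrics` path (the usual Prometheus convention, not stated in this README) and that the exporter is reachable on `localhost` at the published port 5000. The `EXPORTER_HOST` and `EXPORTER_PORT` variables and the `check_exporter` helper are hypothetical names introduced for illustration.

```shell
# Assumptions (not stated in the README): the exporter serves Prometheus text
# on the /metrics path and is reachable on localhost at the published port.
EXPORTER_HOST="${EXPORTER_HOST:-localhost}"
EXPORTER_PORT="${EXPORTER_PORT:-5000}"
METRICS_URL="http://${EXPORTER_HOST}:${EXPORTER_PORT}/metrics"

# Hypothetical helper: fetch the endpoint and print the first few lines.
# -f makes curl fail on HTTP errors, -sS hides progress but keeps error output.
check_exporter() {
  curl -fsS --max-time 5 "$METRICS_URL" | head -n 20
}

echo "Metrics URL: $METRICS_URL"
# Run `check_exporter` once the container is up.
```

If the host port in the `-p` mapping is changed, setting `EXPORTER_PORT` to the new host port before sourcing the snippet should keep the check pointed at the right address.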

Features

  • Prometheus-compatible metrics endpoint
  • Rich GPU telemetry data including:
    • Temperature monitoring
    • Utilization metrics
    • Memory usage statistics
    • Power consumption data
    • PCIe bandwidth metrics
  • Kubernetes integration via Helm chart
  • Slurm integration support
  • Configurable service ports
  • Container-based deployment
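To make use of the Prometheus-compatible endpoint listed above, Prometheus needs a scrape job that points at the exporter. The following is a minimal sketch that writes such a configuration to `prometheus.yml` in the current directory. The job name `amd-gpu` is a hypothetical label, the target `localhost:5000` assumes the port mapping from the Quick Start, and `/metrics` is the Prometheus default path rather than something this README confirms.

```shell
# Write a minimal Prometheus config that scrapes the exporter.
# Assumptions: the job name "amd-gpu" is hypothetical; the target matches the
# Quick Start port mapping; /metrics is the Prometheus default metrics path.
cat > prometheus.yml <<'EOF'
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: amd-gpu
    metrics_path: /metrics
    static_configs:
      - targets: ["localhost:5000"]
EOF
```

Prometheus can then be started against this file with `prometheus --config.file=prometheus.yml`. For a multi-node setup, the `targets` list would hold one `host:port` entry per exporter.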

Requirements

  • Ubuntu 22.04 or later
  • ROCm 6.2.0
  • Docker (or compatible container runtime)

Documentation

For detailed documentation, including installation guides, configuration options, and metric descriptions, see the documentation.

License

This project is licensed under the Apache 2.0 License. See the LICENSE file for details.
