Posts by Antti-Ville Suni
AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs
- 17 November 2025
- Andy Allred, William Anzen, Alexander Aurell, Aravind Kumar Rao Bappanadu, Thomas Bergstrom, Sander Bijl de Vroe, Marc Dillon, Stanislau Fink, Alexander Finn, Mark van Heeswijk, Andrey Ivannikov, Teemu Karkkainen, Shashank Kashyap, Juho Kerttula, Miikael Leskinen, Hari Nair, Mika Ranta, Mario Reiser, Alex Saliniemi, Rui Sampaio, Harry Souris, Antti-Ville Suni, Robert Talling, Juho Vainio, Mikko Vilenius, Yu Wang, Bo Zhang
- English
- Applications & models
- AI/ML, GenAI, LLM, Performance, Serving, Kubernetes
As generative AI models continue to expand in scale, context length, and operational complexity, enterprises face a harder challenge: how to deploy and operate inference reliably, efficiently, and at production scale. Running LLMs or multimodal models on real workloads requires more than high-performance GPUs. It requires reproducible deployments, predictable performance, seamless orchestration, and an operational framework that teams can trust.