Posts by Antti-Ville Suni
AMD Inference Microservice (AIM): Production Ready Inference on AMD Instinct™ GPUs
- 17 November 2025
- Andy Allred, William Anzen, Alexander Aurell, Aravind Kumar Rao Bappanadu, Thomas Bergstrom, Sander Bijl de Vroe, Marc Dillon, Stanislau Fink, Alexander Finn, Mark van Heeswijk, Andrey Ivannikov, Teemu Karkkainen, Shashank Kashyap, Juho Kerttula, Miikael Leskinen, Hari Nair, Mika Ranta, Mario Reiser, Alex Saliniemi, Rui Sampaio, Harry Souris, Antti-Ville Suni, Robert Talling, Juho Vainio, Mikko Vilenius, Yu Wang, Bo Zhang
- English
- Applications & models
- AI/ML, GenAI, LLM, Performance, Serving, Kubernetes
As generative AI models continue to expand in scale, context length, and operational complexity, enterprises face a harder challenge: how to deploy and operate inference reliably, efficiently, and at production scale. Running LLMs or multimodal models on real workloads requires more than high-performance GPUs. It requires reproducible deployments, predictable performance, seamless orchestration, and an operational framework that teams can trust.