APIM ❤️ AI - This repo contains experiments on Azure API Management's AI capabilities, integrating with Azure OpenAI, AI Foundry, and much more 🚀
Azure-Samples/AI-Gateway
🧪 AI Gateway Labs with Azure API Management
➕ Realtime API (Audio and Text) with Azure OpenAI 🔥 experiments with the AOAI Realtime API
➕ Realtime API (Audio and Text) with Azure OpenAI + MCP tools 🔥 experiments with the AOAI Realtime API + MCP
➕ Model Context Protocol (MCP) ⚙️ experiments with the client authorization flow
➕ the FinOps Framework lab to manage AI budgets effectively 💰
➕ Agentic ✨ experiments with the Model Context Protocol (MCP).
➕ Agentic ✨ experiments with the OpenAI Agents SDK.
➕ Agentic ✨ experiments with the AI Agent Service from Azure AI Foundry.
➕ the AI Foundry Deepseek lab with the Deepseek R1 model from Azure AI Foundry.
➕ the Zero-to-Production lab with an iterative policy exploration to fine-tune the optimal production configuration.
➕ the Terraform flavor of the backend pool load balancing lab.
➕ the AI Foundry SDK lab.
➕ the Content filtering and Prompt shielding labs.
➕ the Model routing lab with OpenAI model-based routing.
➕ the Prompt flow lab to try the Azure AI Studio Prompt Flow with Azure API Management.
➕ priority and weight parameters to the Backend pool load balancing lab.
➕ the Streaming tool to test OpenAI streaming with Azure API Management.
➕ the Tracing tool to debug and troubleshoot OpenAI APIs using the Azure API Management tracing capability.
➕ image processing to the GPT-4o inferencing lab.
➕ the Function calling lab with a sample API on Azure Functions.
- 🧠 GenAI Gateway
- 🧪 Labs with AI Agents
- 🧪 Labs with the Inference API
- 🧪 Labs based on Azure OpenAI
- 🚀 Getting started
- ⛵ Roll-out to production
- 🔨 Supporting tools
- 🏛️ Well-Architected Framework
- 🎒 Show and tell
- 🥇 Other Resources
The rapid pace of AI advances demands experimentation-driven approaches for organizations to remain at the forefront of the industry. With AI steadily becoming a game-changer for an array of sectors, maintaining a fast-paced innovation trajectory is crucial for businesses aiming to leverage its full potential.
AI services are predominantly accessed via APIs, underscoring the essential need for a robust and efficient API management strategy. This strategy is instrumental for maintaining control and governance over the consumption of AI services.
With the expanding horizons of AI services and their seamless integration with APIs, there is considerable demand for a comprehensive AI Gateway pattern that broadens the core principles of API management, aiming to accelerate the experimentation of advanced use cases and pave the road for further innovation in this rapidly evolving field. The well-architected principles of the AI Gateway provide a framework for the confident deployment of Intelligent Apps into production.
This repo explores the AI Gateway pattern through a series of experimental labs. The GenAI Gateway capabilities of Azure API Management play a crucial role within these labs, handling AI service APIs with security, reliability, performance, overall operational efficiency, and cost controls. The primary focus is on Azure OpenAI, which sets the standard reference for Large Language Models (LLMs). However, the same principles and design patterns could potentially be applied to any LLM.
Acknowledging the rising dominance of Python, particularly in the realm of AI, along with the powerful experimental capabilities of Jupyter notebooks, the following labs are structured around Jupyter notebooks, with step-by-step instructions, Python scripts, Bicep files, and Azure API Management policies:
Playground to experiment with the Model Context Protocol using the client authorization flow. In this flow, Azure API Management acts both as an OAuth client connecting to the Microsoft Entra ID authorization server and as an OAuth authorization server for the MCP client (MCP Inspector in this lab).
🦾 Bicep ➕⚙️ Policy ➕🧾 Notebook
Playground to experiment with the Model Context Protocol and Azure API Management to enable plug & play of tools to LLMs. Leverages the credential manager for managing OAuth 2.0 tokens to backend tools and client token validation to ensure end-to-end authentication and authorization.
🦾 Bicep ➕⚙️ Policy ➕🧾 Notebook
Playground to try the OpenAI Agents SDK with Azure OpenAI models and API-based tools controlled by Azure API Management.
🦾 Bicep ➕⚙️ Policy ➕🧾 Notebook
Use this playground to explore theAzure AI Agent Service, leveraging Azure API Management to control multiple services, including Azure OpenAI models, Logic Apps Workflows, and OpenAPI-based APIs.
🦾 Bicep ➕⚙️ Policy ➕🧾 Notebook
Playground to try the OpenAI function calling feature with an Azure Functions API that is also managed by Azure API Management.
🦾 Bicep ➕⚙️ Policy ➕🧾 Notebook
Playground to try the Deepseek R1 model via AI Model Inference from Azure AI Foundry. This lab uses the Azure AI Model Inference API and two APIM LLM policies: llm-token-limit and llm-emit-token-metric.
🦾 Bicep ➕⚙️ Policy ➕🧾 Notebook
🧪 SLM self-hosting (Phi-3)
Playground to try the self-hosted Phi-3 Small Language Model (SLM) through the Azure API Management self-hosted gateway with OpenAI API compatibility.
🦾 Bicep ➕⚙️ Policy ➕🧾 Notebook
This playground leverages the FinOps Framework and Azure API Management to control AI costs. It uses the token limit policy for each product and integrates Azure Monitor alerts with Logic Apps to automatically disable APIM subscriptions that exceed cost quotas.
🦾 Bicep ➕⚙️ Policy ➕🧾 Notebook
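The quota-enforcement idea behind the FinOps lab can be sketched in a few lines of plain Python (names and numbers here are hypothetical; the lab itself wires this up with Azure Monitor alerts, Logic Apps, and the APIM REST API rather than application code):

```python
# Hypothetical cost-control check: given token spend per APIM
# subscription and a per-subscription quota, report which
# subscriptions should be suspended.
def enforce_quotas(usage: dict, quotas: dict) -> list:
    """Return the subscription ids whose spend exceeds their quota."""
    return [sid for sid, spent in usage.items()
            if spent > quotas.get(sid, float("inf"))]

usage = {"team-a": 120_000, "team-b": 40_000}
quotas = {"team-a": 100_000, "team-b": 100_000}
print(enforce_quotas(usage, quotas))  # ['team-a']
```

In the lab, the equivalent of "suspending" a subscription is a Logic App flipping the APIM subscription state, which immediately blocks further calls through the gateway.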
🧪 Backend pool load balancing - Available with Bicep and Terraform
Playground to try the built-in load-balancing backend pool functionality of Azure API Management against either a list of Azure OpenAI endpoints or mock servers.
🦾 Bicep ➕⚙️ Policy ➕🧾 Notebook
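The priority/weight semantics of an APIM backend pool can be approximated in a short Python sketch (the backend URLs are made up, and the real selection happens inside the APIM gateway, not in your code): requests go to the lowest-numbered healthy priority group, split by weight within that group.

```python
import random

# Hypothetical backend list mirroring a pool definition: two weighted
# priority-1 endpoints plus a priority-2 fallback.
BACKENDS = [
    {"url": "https://aoai-east.example", "priority": 1, "weight": 3, "healthy": True},
    {"url": "https://aoai-west.example", "priority": 1, "weight": 1, "healthy": True},
    {"url": "https://aoai-fallback.example", "priority": 2, "weight": 1, "healthy": True},
]

def pick_backend(backends):
    """Pick from the lowest-priority healthy group, weighted at random."""
    healthy = [b for b in backends if b["healthy"]]
    if not healthy:
        raise RuntimeError("no healthy backends")
    top = min(b["priority"] for b in healthy)
    group = [b for b in healthy if b["priority"] == top]
    return random.choices(group, weights=[b["weight"] for b in group])[0]

print(pick_backend(BACKENDS)["priority"])  # 1: priority-1 group wins while healthy
```

Only when every priority-1 backend is unhealthy (for example, after repeated 429s trip the circuit breaker) does traffic spill over to the priority-2 fallback.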
Playground to try the token rate limiting policy applied to one or more Azure OpenAI endpoints. When the token usage is exceeded, the caller receives a 429 Too Many Requests response.
🦾 Bicep ➕⚙️ Policy ➕🧾 Notebook
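The behavior the policy enforces can be illustrated with a minimal sliding-window limiter in Python (a toy stand-in, not the policy's actual implementation; the tokens-per-minute figure is arbitrary):

```python
import time

class TokenRateLimiter:
    """Toy sliding-window limiter loosely mimicking a tokens-per-minute
    quota: requests that would exceed the window's budget get a 429."""

    def __init__(self, tokens_per_minute):
        self.tpm = tokens_per_minute
        self.window = []  # list of (timestamp, tokens consumed)

    def try_consume(self, tokens, now=None):
        """Return 200 if the request fits the quota, else 429."""
        now = time.monotonic() if now is None else now
        # Drop consumption records older than 60 seconds.
        self.window = [(t, n) for t, n in self.window if now - t < 60]
        used = sum(n for _, n in self.window)
        if used + tokens > self.tpm:
            return 429
        self.window.append((now, tokens))
        return 200

limiter = TokenRateLimiter(tokens_per_minute=500)
print(limiter.try_consume(400, now=0.0))   # 200
print(limiter.try_consume(200, now=1.0))   # 429: would exceed 500 TPM
print(limiter.try_consume(200, now=61.0))  # 200: the first request aged out
```

The real policy also estimates prompt tokens up front and can return remaining-quota headers, but the core accept/reject decision follows the same shape.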
Playground to try the emit token metric policy. The policy sends metrics to Application Insights about the consumption of large language model tokens through Azure OpenAI Service APIs.
🦾 Bicep ➕⚙️ Policy ➕🧾 Notebook
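Conceptually, the policy aggregates prompt, completion, and total token counts per custom dimension before shipping them to Application Insights. A rough Python sketch of that aggregation (the dimension name and usage payload shape are illustrative only):

```python
from collections import defaultdict

# In-memory stand-in for the metric sink; keys are
# (dimension value, token kind) pairs.
metrics = defaultdict(int)

def emit_token_metric(subscription_id, usage):
    """Accumulate token counts per subscription, per token kind."""
    metrics[(subscription_id, "prompt")] += usage["prompt_tokens"]
    metrics[(subscription_id, "completion")] += usage["completion_tokens"]
    metrics[(subscription_id, "total")] += usage["total_tokens"]

emit_token_metric("finance", {"prompt_tokens": 12, "completion_tokens": 30, "total_tokens": 42})
emit_token_metric("finance", {"prompt_tokens": 8, "completion_tokens": 10, "total_tokens": 18})
print(metrics[("finance", "total")])  # 60
```

Charting these per-dimension counters is what enables the cost dashboards and alerts used elsewhere in the repo (for example, the FinOps lab).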
Playground to try the semantic caching policy. Uses vector proximity of the prompt to previous requests and a specified similarity score threshold.
🦾 Bicep ➕⚙️ Policy ➕🧾 Notebook
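The core idea, reuse a cached completion when a new prompt's embedding is close enough to a cached one, fits in a few lines. The embeddings below are made-up three-dimensional vectors; in the lab they come from an embeddings model and the cache lives in Azure Redis, not a Python list:

```python
import math

CACHE = []        # list of (embedding, completion) pairs
THRESHOLD = 0.95  # illustrative similarity cutoff

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def lookup(embedding):
    """Return a cached completion if any entry is similar enough."""
    for cached_vec, completion in CACHE:
        if cosine(embedding, cached_vec) >= THRESHOLD:
            return completion  # cache hit: skip the LLM call entirely
    return None               # cache miss: forward to the model

CACHE.append(([0.9, 0.1, 0.4], "Paris is the capital of France."))
print(lookup([0.89, 0.12, 0.41]))  # near-identical prompt -> cached answer
print(lookup([0.1, 0.9, 0.2]))     # dissimilar prompt -> None
```

Tuning the threshold is the main knob: too low and unrelated prompts share answers, too high and the cache rarely hits.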
Playground to try the OAuth 2.0 authorization feature using an identity provider to enable more fine-grained access to OpenAI APIs for particular users or clients.
🦾 Bicep ➕⚙️ Policy ➕🧾 Notebook
Playground to create a combination of several policies in an iterative approach. We start with load balancing, then progressively add token emitting, rate limiting, and, eventually, semantic caching. Each of these sets of policies is derived from other labs in this repo.
🦾 Bicep ➕⚙️ Policy ➕🧾 Notebook
Playground to try the new GPT-4o model. GPT-4o ("o" for "omni") is designed to handle a combination of text, audio, and video inputs, and can generate outputs in text, audio, and image formats.
🦾 Bicep ➕⚙️ Policy ➕🧾 Notebook
Playground to try routing to a backend based on Azure OpenAI model and version.
🦾 Bicep ➕⚙️ Policy ➕🧾 Notebook
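Model-based routing amounts to a lookup keyed on the requested model (and optionally API version), with a default backend for anything unmatched. A minimal sketch with invented URLs (the lab expresses the same logic as an APIM policy expression, not Python):

```python
# Hypothetical routing table: (model, api-version year) -> backend URL.
ROUTES = {
    ("gpt-4o", "2024"): "https://aoai-gpt4o.example",
    ("gpt-35-turbo", "2024"): "https://aoai-gpt35.example",
}
DEFAULT = "https://aoai-default.example"

def route(model, api_version):
    """Pick a backend by model name and the year prefix of api-version."""
    return ROUTES.get((model, api_version[:4]), DEFAULT)

print(route("gpt-4o", "2024-06-01"))      # https://aoai-gpt4o.example
print(route("o1-preview", "2024-06-01"))  # https://aoai-default.example
```

This keeps model-to-deployment placement a gateway concern, so consumers call one endpoint regardless of where each model actually runs.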
Playground to try the Retrieval Augmented Generation (RAG) pattern with Azure AI Search, Azure OpenAI embeddings, and Azure OpenAI completions.
🦾 Bicep ➕⚙️ Policy ➕🧾 Notebook
Playground to try the built-in logging capabilities of Azure API Management. Logs requests into Application Insights to track details and token usage.
🦾 Bicep ➕⚙️ Policy ➕🧾 Notebook
Playground to test storing message details into Cosmos DB through the Log to event hub policy. With the policy we can control which data will be stored in the DB (prompt, completion, model, region, tokens, etc.).
🦾 Bicep ➕⚙️ Policy ➕🧾 Notebook
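The field selection that makes this pattern useful can be sketched as a simple allow-list filter applied before the event leaves the gateway (field names here are illustrative; the real selection is written inside the log-to-eventhub policy expression):

```python
import json

# Only these attributes of a completed request are forwarded, so the
# Cosmos DB sink never receives anything you choose to drop.
KEEP = {"prompt", "completion", "model", "region", "total_tokens"}

def to_log_event(record):
    """Serialize just the allow-listed fields of a request record."""
    return json.dumps({k: v for k, v in record.items() if k in KEEP})

event = to_log_event({
    "prompt": "hi", "completion": "hello!", "model": "gpt-4o",
    "region": "eastus", "total_tokens": 9,
    "api_key": "secret",  # deliberately excluded from the event
})
print(event)
```

Because filtering happens at the gateway, sensitive material such as keys or full prompts can be excluded from long-term storage by policy alone.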
Playground to try the Azure AI Studio Prompt Flow with Azure API Management.
🦾 Bicep ➕⚙️ Policy ➕🧾 Notebook
Playground to try integrating Azure API Management with Azure AI Content Safety to filter potentially offensive, risky, or undesirable content.
🦾 Bicep ➕⚙️ Policy ➕🧾 Notebook
Playground to try Prompt Shields from the Azure AI Content Safety service, which analyzes LLM inputs and detects User Prompt attacks and Document attacks, two common types of adversarial inputs.
🦾 Bicep ➕⚙️ Policy ➕🧾 Notebook
This is a list of potential future labs to be developed.
- Real Time API
- Semantic Kernel with Agents
- Logic Apps RAG
- PII handling
- Gemini
Tip
Please use the feedback discussion to share your experiences, suggestions, ideas, or lab requests so that we can continuously improve.
- Python 3.12 or later installed
- VS Code installed with the Jupyter notebook extension enabled
- Python environment with the requirements.txt installed (or run `pip install -r requirements.txt` in your terminal)
- An Azure Subscription with Contributor + RBAC Administrator or Owner roles
- Azure CLI installed and signed into your Azure subscription
- Clone this repo and configure your local machine with the prerequisites. Or just create a GitHub Codespace and run it in the browser or in VS Code.
- Navigate through the available labs and select one that best suits your needs. For starters, we recommend the token rate limiting lab.
- Open the notebook and run the provided steps.
- Tailor the experiment according to your requirements. If you wish to contribute to our collective work, we would appreciate your submission of a pull request.
Note
🪲 Please feel free to open a new issue if you find something that should be fixed or enhanced.
We recommend the guidelines and best practices from the AI Hub Gateway Landing Zone to implement a central AI API gateway to empower various line-of-business units in an organization to leverage Azure AI services.
- AI-Gateway Mock server is designed to mimic the behavior and responses of the OpenAI API, creating an efficient simulation environment for testing and developing the integration with Azure API Management and other use cases. The app.py can be customized to tailor the Mock server to specific use cases.
- Tracing - Invokes the OpenAI API with tracing enabled and returns the tracing information.
- Streaming - Invokes the OpenAI API with streaming enabled and returns the response in chunks.
The Azure Well-Architected Framework is a design framework that can improve the quality of a workload. The following table maps labs to the Well-Architected Framework pillars to set you up for success through architectural experimentation.
Lab | Security | Reliability | Performance | Operations | Costs
---|---|---|---|---|---
Request forwarding | ⭐ | | | |
Backend circuit breaking | ⭐ | ⭐ | | |
Backend pool load balancing | ⭐ | ⭐ | ⭐ | |
Advanced load balancing | ⭐ | ⭐ | ⭐ | |
Response streaming | ⭐ | ⭐ | | |
Vector searching | ⭐ | ⭐ | ⭐ | |
Built-in logging | ⭐ | ⭐ | ⭐ | ⭐ | ⭐
SLM self-hosting | ⭐ | ⭐ | | |
Tip
Check the Azure Well-Architected Framework perspective on Azure OpenAI Service for additional guidance.
Tip
Install the VS Code Reveal extension, open AI-GATEWAY.md and click on 'slides' at the bottom to present the AI Gateway without leaving VS Code. Or just open the AI-GATEWAY.pptx for a plain old PowerPoint experience.
Numerous reference architectures, best practices and starter kits are available on this topic. Please refer to the resources provided if you need comprehensive solutions or a landing zone to initiate your project. We suggest leveraging the AI-Gateway labs to discover additional capabilities that can be integrated into the reference architectures.
- GenAI Gateway Guide
- Azure OpenAI + APIM Sample
- AI+API better together: Benefits & Best Practices using APIs for AI workloads
- Designing and implementing a gateway solution with Azure OpenAI resources
- Azure OpenAI Using PTUs/TPMs With API Management - Using the Scaling Special Sauce
- Manage Azure OpenAI using APIM
- Setting up Azure OpenAI as a central capability with Azure API Management
- Introduction to Building AI Apps
We believe that there may be valuable content that we are currently unaware of. We would greatly appreciate any suggestions or recommendations to enhance this list.
Important
This software is provided for demonstration purposes only. It is not intended to be relied upon for any purpose. The creators of this software make no representations or warranties of any kind, express or implied, about the completeness, accuracy, reliability, suitability or availability with respect to the software or the information, products, services, or related graphics contained in the software for any purpose. Any reliance you place on such information is therefore strictly at your own risk.