Add docs for relay [WIP] #5279
Conversation
Pull request overview
This WIP (Work in Progress) pull request adds comprehensive documentation for TensorZero's gateway relay feature, which enables organizations to centralize authentication, rate limits, and credentials management across multiple gateway deployments.
Key Changes:
- New documentation guide explaining the two-tier gateway architecture (edge and relay gateways)
- Configuration examples for both edge and relay gateways
- Docker Compose setup demonstrating the relay pattern
- Integration of the new documentation into the docs navigation
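For orientation, the relay pattern amounts to an edge gateway whose config points at a central relay that holds the real provider credentials and enforces auth and rate limits. The sketch below is a hypothetical illustration only: the `[gateway.relay]` table name and `url` key are placeholders, not confirmed TensorZero syntax, so refer to the `edge-config/tensorzero.toml` shipped in this PR for the actual schema.

```toml
# Hypothetical edge gateway config sketch; the table and key names here are
# placeholders. See edge-config/tensorzero.toml in this PR for the real syntax.

# Point the edge gateway at the central relay, which holds the real
# provider credentials and applies auth and rate limits.
[gateway.relay]
url = "http://relay-gateway:3000"

# Individual models can opt out of the relay with skip_relay = true
# (see the review comments below).
```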
Reviewed changes
Copilot reviewed 4 out of 6 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| docs/operations/centralize-auth-rate-limits-and-more.mdx | Main documentation file explaining relay feature, setup instructions, and advanced configurations |
| examples/docs/guides/operations/centralize-auth-rate-limits-and-more/simple/edge-config/tensorzero.toml | Configuration file for edge gateway showing relay setup |
| examples/docs/guides/operations/centralize-auth-rate-limits-and-more/simple/docker-compose.yml | Docker Compose configuration demonstrating two-tier deployment |
| docs/operations/centralize-auth-rate-limits-and-more-diagram.png | Diagram illustrating the relay architecture (binary file) |
| docs/docs.json | Navigation update to include the new relay documentation |
```bash
curl -X POST "http://localhost:3000/inference" \
  -H "Content-Type: application/json" \
  -d '{
    "model_name": "openai::gpt-5-mini",
```
Copilot AI (Dec 18, 2025)
The model name in this example appears to be incorrect. OpenAI does not have a model called "gpt-5-mini" or "gpt-5". The correct model name should be something like "gpt-4o-mini" based on currently available OpenAI models. Please verify and use a valid OpenAI model name.
| "model_name": "openai::gpt-5-mini", | |
| "model_name": "openai::gpt-4o-mini", |
```yaml
      - ./edge-config:/app/config:ro
    command: --config-file /app/config/tensorzero.toml
    ports:
      - "3000:3000"
```
Copilot AI (Dec 18, 2025)
The edge-gateway service configuration is missing the OPENAI_API_KEY environment variable. Line 11 sets it to "dummy", and since the relay gateway holds the real API key, this configuration should work. However, it could be confusing because the edge gateway doesn't actually need this credential when using relay. Consider adding a comment explaining why the dummy value is there, or removing it entirely if it's not needed.
| - "3000:3000" | |
| - "3000:3000" | |
| environment: | |
| #The edge gateway forwards requests to the relay and does not use this key directly. | |
| #We set a dummy value here to satisfy tooling that expects OPENAI_API_KEY to be present. | |
| - OPENAI_API_KEY=dummy |
```mdx
<Step title="Configure your edge gateway">
  Configure the edge gateway route inference requests to the relay gateway:
```
Copilot AI (Dec 18, 2025)
Consider changing "route" to "to route" for grammatical correctness.
Suggested change:

```diff
-  Configure the edge gateway route inference requests to the relay gateway:
+  Configure the edge gateway to route inference requests to the relay gateway:
```
```mdx
### Bypass the relay for specific requests

When a relay gateway is configured, the edge gateway will route every inference request through it by default.
However, you may want to bypass the relay for in some scenarios.
```
Copilot AI (Dec 18, 2025)
Minor typo: "for in some scenarios" should read "in some scenarios" ("for in" should be just "in").
Suggested change:

```diff
-However, you may want to bypass the relay for in some scenarios.
+However, you may want to bypass the relay in some scenarios.
```
```toml
[models.my_local_model.providers.openai]
type = "openai"
model_name = "openai::gpt-5"
```
Copilot AI (Dec 18, 2025)
The model name "gpt-5" in this example is incorrect. OpenAI does not have a "gpt-5" model. Please use a valid OpenAI model name like "gpt-4o" or "gpt-4o-mini".
Suggested change:

```diff
-model_name = "openai::gpt-5"
+model_name = "openai::gpt-4o"
```
```toml
routing = ["openai"]
skip_relay = true

[models.my_local_model.providers.openai]
```
Copilot AI (Dec 18, 2025)
The section header under [models.my_local_model.providers.openai] is confusing. The model is named "my_local_model" but it's using an OpenAI provider, which is not local. This inconsistency in naming could confuse readers. Consider renaming to something like "my_direct_openai_model" or adjusting the example to better reflect the purpose.
Suggested change:

```diff
-[models.my_local_model.providers.openai]
+[models.gpt_5_edge.providers.openai]
```
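Putting the reviewed fragments together, a complete bypass example on the edge gateway might look like the sketch below. It combines the `skip_relay` flag and provider block quoted above with the suggested `gpt_5_edge` name; treat it as illustrative, since the config files in this PR are the source of truth.

```toml
# Sketch assembled from the fragments under review: this model bypasses the
# relay (skip_relay = true) and calls OpenAI directly from the edge gateway,
# so the edge gateway needs a real OpenAI credential for these requests.
[models.gpt_5_edge]
routing = ["openai"]
skip_relay = true

[models.gpt_5_edge.providers.openai]
type = "openai"
model_name = "openai::gpt-5"
```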
```mdx
### Set up dynamic credentials for the relay gateway

TODO
```
Copilot AI (Dec 18, 2025)
The section "Set up dynamic credentials for the relay gateway" is marked as "TODO" without any content. Since this is a WIP PR, this is expected, but consider either adding the content before merging or removing this section heading entirely to avoid confusing readers about an incomplete feature.
Suggested change:

```diff
-### Set up dynamic credentials for the relay gateway
-TODO
```
````mdx
<Accordion title="Sample Output">
```python
````
Copilot AI (Dec 18, 2025)
The language specified for the code block is "python" but the content is JSON. This should be changed to "json" for proper syntax highlighting.
Suggested change:

````diff
-```python
+```json
````
Blocked by #5277.
Important
Add documentation and examples for setting up a relay gateway to centralize auth, rate limits, and credentials in TensorZero.
- Add `centralize-auth-rate-limits-and-more.mdx` to `docs/operations` for setting up a relay gateway to centralize auth, rate limits, and credentials.
- Update `docs.json` to include the new documentation page in the operations section.
- Add `docker-compose.yml` and `tensorzero.toml` in `examples/docs/guides/operations/centralize-auth-rate-limits-and-more/simple` for deploying edge and relay gateways.

This description was created for 795002b. You can customize this summary. It will automatically update as commits are pushed.