OpenAI Guardrails - Python


This is the Python version of OpenAI Guardrails, a package for adding configurable safety and compliance guardrails to LLM applications. It provides a drop-in wrapper for OpenAI's Python client, enabling automatic input/output validation and moderation using a wide range of guardrails.

Most users can simply follow the guided configuration and installation instructions at guardrails.openai.com.

Installation

You can install the openai-guardrails package with pip:

pip install openai-guardrails
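
To verify the installation, the package should import cleanly; the import name is guardrails, as used in the examples below:

python -c "import guardrails"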

Usage

Follow the configuration and installation instructions at guardrails.openai.com.

Local Development

Clone the repository and install locally:

# Clone the repository
git clone https://github.com/openai/openai-guardrails-python.git
cd openai-guardrails-python

# Install the package (editable), plus example extras if desired
pip install -e .
pip install -e ".[examples]"

Integration Details

Drop-in OpenAI Replacement

The easiest way to use Guardrails Python is as a drop-in replacement for the OpenAI client:

from pathlib import Path

from guardrails import GuardrailsOpenAI, GuardrailTripwireTriggered

# Use GuardrailsOpenAI instead of OpenAI
client = GuardrailsOpenAI(config=Path("guardrail_config.json"))

try:
    # Works with standard Chat Completions
    chat = client.chat.completions.create(
        model="gpt-5",
        messages=[{"role": "user", "content": "Hello world"}],
    )
    print(chat.llm_response.choices[0].message.content)

    # Or with the Responses API
    resp = client.responses.create(
        model="gpt-5",
        input="What are the main features of your premium plan?",
    )
    print(resp.llm_response.output_text)
except GuardrailTripwireTriggered as e:
    print(f"Guardrail triggered: {e}")
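
When a guardrail trips, the wrapper raises GuardrailTripwireTriggered instead of returning a response, so production code typically wraps calls in a small helper that degrades gracefully (see also examples/basic/suppress_tripwire.py). A minimal sketch of that pattern; the safe_chat helper and fallback message below are our own, not part of the package:

from pathlib import Path

from guardrails import GuardrailsOpenAI, GuardrailTripwireTriggered

client = GuardrailsOpenAI(config=Path("guardrail_config.json"))

def safe_chat(prompt: str) -> str:
    """Return the model reply, or a canned message if a guardrail trips."""
    try:
        chat = client.chat.completions.create(
            model="gpt-5",
            messages=[{"role": "user", "content": prompt}],
        )
        return chat.llm_response.choices[0].message.content
    except GuardrailTripwireTriggered:
        # The input or output was blocked; fall back rather than crash
        return "Sorry, I can't help with that request."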

Agents SDK Integration

You can integrate guardrails with the OpenAI Agents SDK via GuardrailAgent:

import asyncio
from pathlib import Path

from agents import InputGuardrailTripwireTriggered, OutputGuardrailTripwireTriggered, Runner
from agents.run import RunConfig

from guardrails import GuardrailAgent

# Create agent with guardrails automatically configured
agent = GuardrailAgent(
    config=Path("guardrails_config.json"),
    name="Customer support agent",
    instructions="You are a customer support agent. You help customers with their questions.",
)

async def main():
    try:
        result = await Runner.run(
            agent,
            "Hello, can you help me?",
            run_config=RunConfig(tracing_disabled=True),
        )
        print(result.final_output)
    except (InputGuardrailTripwireTriggered, OutputGuardrailTripwireTriggered):
        print("🛑 Guardrail triggered!")

if __name__ == "__main__":
    asyncio.run(main())

For more details, see docs/agents_sdk_integration.md.

Evaluation Framework

Evaluate guardrail performance on labeled datasets and run benchmarks.

Running Evaluations

# Basic evaluation
python -m guardrails.evals.guardrail_evals \
  --config-path guardrails_config.json \
  --dataset-path data.jsonl

# Benchmark mode (compare models, generate ROC curves, latency)
python -m guardrails.evals.guardrail_evals \
  --config-path guardrails_config.json \
  --dataset-path data.jsonl \
  --mode benchmark \
  --models gpt-5 gpt-5-mini gpt-4.1-mini

Dataset Format

Datasets must be in JSONL format, with each line containing a JSON object:

{"id":"sample_1","data":"Text or conversation to evaluate","expected_triggers": {"Moderation":true,"NSFW Text":false  }}

Programmatic Usage

import asyncio
from pathlib import Path

from guardrails.evals.guardrail_evals import GuardrailEval

# Named "evaluation" to avoid shadowing the built-in eval()
evaluation = GuardrailEval(
    config_path=Path("guardrails_config.json"),
    dataset_path=Path("data.jsonl"),
    batch_size=32,
    output_dir=Path("results"),
)

asyncio.run(evaluation.run())

Project Structure

  • src/guardrails/ - Python source code
  • src/guardrails/checks/ - Built-in guardrail checks
  • src/guardrails/evals/ - Evaluation framework
  • examples/ - Example usage and sample configs

Examples

The package includes examples in the examples/ directory:

  • examples/basic/hello_world.py — Basic chatbot with guardrails using GuardrailsOpenAI
  • examples/basic/agents_sdk.py — Agents SDK integration with GuardrailAgent
  • examples/basic/local_model.py — Using local models with guardrails
  • examples/basic/structured_outputs_example.py — Structured outputs
  • examples/basic/pii_mask_example.py — PII masking
  • examples/basic/suppress_tripwire.py — Handling violations gracefully

Running Examples

Prerequisites

pip install -e .
pip install "openai-guardrails[examples]"

Run

python examples/basic/hello_world.py
python examples/basic/agents_sdk.py

Available Guardrails

The Python implementation includes the following built-in guardrails:

  • Moderation: Content moderation using OpenAI's moderation API
  • URL Filter: URL filtering and domain allowlist/blocklist
  • Contains PII: Personally Identifiable Information detection
  • Hallucination Detection: Detects hallucinated content using vector stores
  • Jailbreak: Detects jailbreak attempts
  • NSFW Text: Detects workplace-inappropriate content in model outputs
  • Off Topic Prompts: Ensures responses stay within business scope
  • Custom Prompt Check: Custom LLM-based guardrails

For full details, advanced usage, and the API reference, see the OpenAI Guardrails Documentation.

License

MIT License - see LICENSE file for details.

Disclaimers

Please note that Guardrails may use Third-Party Services, such as the Presidio open-source framework, which are subject to their own terms and conditions and are not developed or verified by OpenAI.

Developers are responsible for implementing appropriate safeguards to prevent storage or misuse of sensitive or prohibited content (including but not limited to personal data, child sexual abuse material, or other illegal content). OpenAI disclaims liability for any logging or retention of such content by developers. Developers must ensure their systems comply with all applicable data protection and content safety laws, and should avoid persisting any blocked content generated or intercepted by Guardrails. Guardrails calls paid OpenAI APIs, and developers are responsible for associated charges.

