bytedance/deer-flow

DeerFlow is a community-driven Deep Research framework, combining language models with tools like web search, crawling, and Python execution, while contributing back to the open-source community.

🦌 DeerFlow

Originated from Open Source, give back to Open Source.

DeerFlow (Deep Exploration and Efficient Research Flow) is a community-driven Deep Research framework that builds upon the incredible work of the open source community. Our goal is to combine language models with specialized tools for tasks like web search, crawling, and Python code execution, while giving back to the community that made this possible.

DeerFlow is now officially available in the FaaS Application Center of Volcengine. You can try it online through the experience link. To meet different deployment needs, DeerFlow also supports one-click deployment on Volcengine: use the deployment link to complete the deployment quickly.

DeerFlow has newly integrated InfoQuest, the intelligent search and crawling toolset independently developed by BytePlus (a free online experience is available).

Please visit our official website for more details.

Demo

Video

In this demo, we showcase how to use DeerFlow to:

Seamlessly integrate with MCP services
Conduct the Deep Research process and produce a comprehensive report with images
Create podcast audio based on the generated report

Quick Start

DeerFlow is developed in Python, and comes with a web UI written in Node.js. To ensure a smooth setup process, we recommend using the following tools:

Recommended Tools

uv: Simplifies Python environment and dependency management. uv automatically creates a virtual environment in the root directory and installs all required packages for you, so there is no need to install Python environments manually.
nvm: Manages multiple versions of the Node.js runtime effortlessly.
pnpm: Installs and manages the dependencies of the Node.js project.

Environment Requirements

Make sure your system meets the following minimum requirements:

Python: Version 3.12+
Node.js: Version 22+

Installation

# Clone the repository
git clone https://github.com/bytedance/deer-flow.git
cd deer-flow

# Install dependencies, uv will take care of the python interpreter and venv creation, and install the required packages
uv sync

# Configure .env with your API keys
# Tavily: https://app.tavily.com/home
# Brave_SEARCH: https://brave.com/search/api/
# volcengine TTS: Add your TTS credentials if you have them
cp .env.example .env

# See the 'Supported Search Engines' and 'Text-to-Speech Integration' sections below for all available options

# Configure conf.yaml for your LLM model and API keys
# Please refer to 'docs/configuration_guide.md' for more details
# For local development, you can use Ollama or other local models
cp conf.yaml.example conf.yaml

# Install marp for ppt generation
# https://github.com/marp-team/marp-cli?tab=readme-ov-file#use-package-manager
brew install marp-cli

Optionally, install web UI dependencies via pnpm:

cd deer-flow/web
pnpm install

Configurations

Please refer to the Configuration Guide for more details.

Note

Before you start the project, read the guide carefully, and update the configurations to match your specific settings and requirements.

Console UI

The quickest way to run the project is to use the console UI.

# Run the project in a bash-like shell
uv run main.py

Web UI

This project also includes a Web UI, offering a more dynamic and engaging interactive experience.

Note

You need to install the dependencies of web UI first.

# Run both the backend and frontend servers in development mode
# On macOS/Linux
./bootstrap.sh -d

# On Windows
bootstrap.bat -d

Note

By default, the backend server binds to 127.0.0.1 (localhost) for security reasons. If you need to allow external connections (e.g., when deploying on a Linux server), you can change the server host to 0.0.0.0 in the bootstrap script (uv run server.py --host 0.0.0.0). Please ensure your environment is properly secured before exposing the service to external networks.

Open your browser and visit http://localhost:3000 to explore the web UI.

Explore more details in the web directory.

Supported Search Engines

Web Search

DeerFlow supports multiple search engines that can be configured in your .env file using the SEARCH_API variable:

Tavily (default): A specialized search API for AI applications
- Requires TAVILY_API_KEY in your .env file
- Sign up at: https://app.tavily.com/home
InfoQuest (recommended): AI-optimized intelligent search and crawling toolset independently developed by BytePlus
- Requires INFOQUEST_API_KEY in your .env file
- Support for time range filtering and site filtering
- Provides high-quality search results and content extraction
- Sign up at: https://console.byteplus.com/infoquest/infoquests
- Visit https://docs.byteplus.com/en/docs/InfoQuest/What_is_Info_Quest to learn more
DuckDuckGo: Privacy-focused search engine
- No API key required
Brave Search: Privacy-focused search engine with advanced features
- Requires BRAVE_SEARCH_API_KEY in your .env file
- Sign up at: https://brave.com/search/api/
Arxiv: Scientific paper search for academic research
- No API key required
- Specialized for scientific and academic papers
Searx/SearxNG: Self-hosted metasearch engine
- Requires SEARX_HOST to be set in the .env file
- Supports connecting to either Searx or SearxNG

To configure your preferred search engine, set the SEARCH_API variable in your .env file:

# Choose one: tavily, infoquest, duckduckgo, brave_search, arxiv
SEARCH_API=tavily
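The pairing of engines and required variables listed above can be checked before starting a run. The following is a minimal sketch, not DeerFlow's own code: the `REQUIRED_VARS` table and the `check_search_config` helper are hypothetical names, and only the five `SEARCH_API` values named in the comment above are covered.

```python
import os

# Hypothetical lookup table built from the list above: engine name -> required variable.
REQUIRED_VARS = {
    "tavily": "TAVILY_API_KEY",
    "infoquest": "INFOQUEST_API_KEY",
    "duckduckgo": None,
    "brave_search": "BRAVE_SEARCH_API_KEY",
    "arxiv": None,
}


def check_search_config(env=None):
    """Return the selected engine, or raise ValueError if its settings are incomplete."""
    env = os.environ if env is None else env
    engine = env.get("SEARCH_API", "tavily")  # Tavily is the documented default
    if engine not in REQUIRED_VARS:
        raise ValueError(f"Unknown SEARCH_API value: {engine!r}")
    required = REQUIRED_VARS[engine]
    if required and not env.get(required):
        raise ValueError(f"SEARCH_API={engine} requires {required} to be set")
    return engine


if __name__ == "__main__":
    # Checks the variables of the current process environment.
    print(check_search_config())
```

Passing a plain dictionary instead of the process environment makes the check easy to try without touching a real .env file.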

Crawling Tools

DeerFlow supports multiple crawling tools that can be configured in your conf.yaml file:

Jina (default): Freely accessible web content crawling tool
InfoQuest (recommended): AI-optimized intelligent search and crawling toolset developed by BytePlus
- Requires INFOQUEST_API_KEY in your .env file
- Provides configurable crawling parameters
- Supports custom timeout settings
- Offers more powerful content extraction capabilities
- Visit https://docs.byteplus.com/en/docs/InfoQuest/What_is_Info_Quest to learn more

To configure your preferred crawling tool, set the following in your conf.yaml file:

CRAWLER_ENGINE:
  # Engine type: "jina" (default) or "infoquest"
  engine: infoquest

Private Knowledgebase

DeerFlow supports private knowledge bases such as RAGFlow, Qdrant, Milvus, and VikingDB, so that you can use your private documents to answer questions.

RAGFlow: open source RAG engine

# examples in .env.example
RAG_PROVIDER=ragflow
RAGFLOW_API_URL="http://localhost:9388"
RAGFLOW_API_KEY="ragflow-xxx"
RAGFLOW_RETRIEVAL_SIZE=10
RAGFLOW_CROSS_LANGUAGES=English,Chinese,Spanish,French,German,Japanese,Korean

Qdrant: open source vector database

# Using Qdrant Cloud or self-hosted
RAG_PROVIDER=qdrant
QDRANT_LOCATION=https://xyz-example.eu-central.aws.cloud.qdrant.io:6333
QDRANT_API_KEY=your_qdrant_api_key
QDRANT_COLLECTION=documents
QDRANT_EMBEDDING_PROVIDER=openai
QDRANT_EMBEDDING_MODEL=text-embedding-ada-002
QDRANT_EMBEDDING_API_KEY=your_openai_api_key
QDRANT_AUTO_LOAD_EXAMPLES=true

Features

Core Capabilities

🤖 LLM Integration
- Supports the integration of most models through litellm.
- Supports open source models like Qwen; read the configuration for more details.
- OpenAI-compatible API interface
- Multi-tier LLM system for different task complexities

Tools and MCP Integrations

🔍 Search and Retrieval
- Web search via Tavily, InfoQuest, Brave Search and more
- Crawling with Jina and InfoQuest
- Advanced content extraction
- Support for private knowledge bases
📃 RAG Integration
- Supports multiple vector databases: Qdrant, Milvus, RAGFlow, VikingDB, MOI, and Dify
- Supports mentioning files from RAG providers within the input box
- Easy switching between different vector databases through configuration
🔗 MCP Seamless Integration
- Expand capabilities for private domain access, knowledge graph, web browsing and more
- Facilitates integration of diverse research tools and methodologies

Human Collaboration

💬 Intelligent Clarification Feature
- Multi-turn dialogue to clarify vague research topics
- Improve research precision and report quality
- Reduce ineffective searches and token usage
- Configurable switch for flexible enable/disable control
- See Configuration Guide - Clarification for details
🧠 Human-in-the-loop
- Supports interactive modification of research plans using natural language
- Supports auto-acceptance of research plans
📝 Report Post-Editing
- Supports Notion-like block editing
- Allows AI refinements, including AI-assisted polishing, sentence shortening, and expansion
- Powered by tiptap

Content Creation

🎙️ Podcast and Presentation Generation
- AI-powered podcast script generation and audio synthesis
- Automated creation of simple PowerPoint presentations
- Customizable templates for tailored content

Architecture

DeerFlow implements a modular multi-agent system architecture designed for automated research and code analysis. The system is built on LangGraph, enabling a flexible state-based workflow where components communicate through a well-defined message passing system.

See it live at deerflow.tech

The system employs a streamlined workflow with the following components:

Coordinator: The entry point that manages the workflow lifecycle
- Initiates the research process based on user input
- Delegates tasks to the planner when appropriate
- Acts as the primary interface between the user and the system
Planner: Strategic component for task decomposition and planning
- Analyzes research objectives and creates structured execution plans
- Determines if enough context is available or if more research is needed
- Manages the research flow and decides when to generate the final report
Research Team: A collection of specialized agents that execute the plan:
- Researcher: Conducts web searches and information gathering using tools like web search engines, crawling and even MCP services.
- Coder: Handles code analysis, execution, and technical tasks using the Python REPL tool.
Each agent has access to specific tools optimized for its role and operates within the LangGraph framework.
Reporter: Final stage processor for research outputs
- Aggregates findings from the research team
- Processes and structures the collected information
- Generates comprehensive research reports
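The control flow between these components can be sketched as a small state machine. This is a plain-Python illustration only: it does not use the LangGraph API, and the `ResearchState` fields, the node functions, and the two placeholder plan steps are all hypothetical stand-ins for DeerFlow's real agents.

```python
from dataclasses import dataclass, field


@dataclass
class ResearchState:
    """Shared state passed between nodes (hypothetical fields)."""
    query: str
    plan: list = field(default_factory=list)
    findings: list = field(default_factory=list)
    report: str = ""


def coordinator(state):
    # Entry point: hand the user's request to the planner.
    return "planner"


def planner(state):
    # Create a plan once; when every step has a finding, move on to reporting.
    if not state.plan:
        state.plan = [f"search: {state.query}", f"analyze: {state.query}"]
    return "research_team" if len(state.findings) < len(state.plan) else "reporter"


def research_team(state):
    # Stand-in for the Researcher and Coder agents: complete the next pending step.
    step = state.plan[len(state.findings)]
    state.findings.append(f"result of '{step}'")
    return "planner"


def reporter(state):
    # Aggregate the findings into the final report and stop.
    state.report = "\n".join(state.findings)
    return None


NODES = {
    "coordinator": coordinator,
    "planner": planner,
    "research_team": research_team,
    "reporter": reporter,
}


def run(query):
    """Follow node-to-node transitions until a node returns None."""
    state, node = ResearchState(query), "coordinator"
    while node is not None:
        node = NODES[node](state)
    return state


if __name__ == "__main__":
    print(run("quantum computing").report)
```

Each node returns the name of the next node, which mirrors the idea of a state-based workflow where the planner decides whether more research is needed or the report can be generated.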

Text-to-Speech Integration

DeerFlow now includes a Text-to-Speech (TTS) feature that allows you to convert research reports to speech. This feature uses the volcengine TTS API to generate high-quality audio from text. Features like speed, volume, and pitch are also customizable.

Using the TTS API

You can access the TTS functionality through the /api/tts endpoint:

# Example API call using curl
curl --location 'http://localhost:8000/api/tts' \
--header 'Content-Type: application/json' \
--data '{
    "text": "This is a test of the text-to-speech functionality.",
    "speed_ratio": 1.0,
    "volume_ratio": 1.0,
    "pitch_ratio": 1.0
}' \
--output speech.mp3
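The same call can be made from Python using only the standard library. This is a sketch under the assumptions of the curl example above (server on localhost:8000, audio bytes returned in the response body); `build_tts_request` and `save_speech` are hypothetical helper names, not part of DeerFlow.

```python
import json
import urllib.request


def build_tts_request(text, base_url="http://localhost:8000",
                      speed_ratio=1.0, volume_ratio=1.0, pitch_ratio=1.0):
    """Build (but do not send) a POST request for the /api/tts endpoint."""
    payload = {
        "text": text,
        "speed_ratio": speed_ratio,
        "volume_ratio": volume_ratio,
        "pitch_ratio": pitch_ratio,
    }
    return urllib.request.Request(
        f"{base_url}/api/tts",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def save_speech(text, path="speech.mp3"):
    """Send the request and write the returned audio bytes to `path`."""
    with urllib.request.urlopen(build_tts_request(text)) as response:
        audio = response.read()
    with open(path, "wb") as f:
        f.write(audio)
    return path


if __name__ == "__main__":
    # Requires a running DeerFlow backend with TTS credentials configured.
    save_speech("This is a test of the text-to-speech functionality.")
```

Separating request construction from sending keeps the payload easy to inspect without a running server.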

Development

Testing

Install development dependencies:

uv pip install -e ".[test]"

Run the test suite:

# Run all tests
make test

# Run specific test file
pytest tests/integration/test_workflow.py

# Run with coverage
make coverage

Code Quality

# Run linting
make lint

# Format code
make format

Debugging with LangGraph Studio

DeerFlow uses LangGraph for its workflow architecture. You can use LangGraph Studio to debug and visualize the workflow in real-time.

Running LangGraph Studio Locally

DeerFlow includes a langgraph.json configuration file that defines the graph structure and dependencies for LangGraph Studio. This file points to the workflow graphs defined in the project and automatically loads environment variables from the .env file.

Mac

# Install uv package manager if you don't have it
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install dependencies and start the LangGraph server
uvx --refresh --from "langgraph-cli[inmem]" --with-editable . --python 3.12 langgraph dev --allow-blocking

Windows / Linux

# Install dependencies
pip install -e .
pip install -U "langgraph-cli[inmem]"

# Start the LangGraph server
langgraph dev

After starting the LangGraph server, you'll see several URLs in the terminal:

API: http://127.0.0.1:2024
Studio UI: https://smith.langchain.com/studio/?baseUrl=http://127.0.0.1:2024
API Docs: http://127.0.0.1:2024/docs

Open the Studio UI link in your browser to access the debugging interface.

Using LangGraph Studio

In the Studio UI, you can:

Visualize the workflow graph and see how components connect
Trace execution in real-time to see how data flows through the system
Inspect the state at each step of the workflow
Debug issues by examining inputs and outputs of each component
Provide feedback during the planning phase to refine research plans

When you submit a research topic in the Studio UI, you'll be able to see the entire workflow execution, including:

The planning phase where the research plan is created
The feedback loop where you can modify the plan
The research and writing phases for each section
The final report generation

Enabling LangSmith Tracing

DeerFlow supports LangSmith tracing to help you debug and monitor your workflows. To enable LangSmith tracing:

Make sure your .env file has the following configurations (see .env.example):

LANGSMITH_TRACING=true
LANGSMITH_ENDPOINT="https://api.smith.langchain.com"
LANGSMITH_API_KEY="xxx"
LANGSMITH_PROJECT="xxx"

Start tracing and visualize the graph locally with LangSmith by running:
```
langgraph dev
```

This will enable trace visualization in LangGraph Studio and send your traces to LangSmith for monitoring and analysis.

Checkpointing

Postgres and MongoDB implementations of the LangGraph checkpoint saver.
An in-memory store caches the streaming messages before they are persisted to the database. Persistence is triggered when finish_reason is "stop" or "interrupt".
Supports saving and loading checkpoints for workflow execution.
Supports saving chat stream events for replaying conversations.
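The buffering rule described above (cache in memory, persist on "stop" or "interrupt") can be sketched as follows. This is an illustration of the idea rather than DeerFlow's implementation: the `StreamBuffer` class and the `persist` callback are hypothetical, and a real saver would write to MongoDB or Postgres instead of calling a plain function.

```python
from collections import defaultdict


class StreamBuffer:
    """Buffer streamed chunks per thread and flush them when a run finishes."""

    FLUSH_REASONS = {"stop", "interrupt"}

    def __init__(self, persist):
        self._persist = persist           # callable(thread_id, chunks), e.g. a database write
        self._chunks = defaultdict(list)  # in-memory store keyed by thread id

    def add(self, thread_id, chunk, finish_reason=None):
        """Cache a chunk; return True if this call triggered persistence."""
        self._chunks[thread_id].append(chunk)
        if finish_reason in self.FLUSH_REASONS:
            # Hand over everything buffered for this thread and clear it from memory.
            self._persist(thread_id, self._chunks.pop(thread_id))
            return True
        return False


if __name__ == "__main__":
    buffer = StreamBuffer(lambda thread_id, chunks: print(thread_id, chunks))
    buffer.add("thread-1", "Hello")
    buffer.add("thread-1", " world", finish_reason="stop")
```

Keeping the flush decision in one place means intermediate chunks never reach the database, which is the behavior the list above describes.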

Note: About langgraph issue #5557
The latest langgraph-checkpoint-postgres-2.0.23 has a checkpointing issue; see the open issue "TypeError: Object of type HumanMessage is not JSON serializable" [langchain-ai/langgraph#5557].

To use the Postgres checkpoint saver, install langgraph-checkpoint-postgres-2.0.21.

Note: About psycopg dependencies
Please read the following document before using Postgres: https://www.psycopg.org/psycopg3/docs/basic/install.html

By default, psycopg needs libpq to be installed on your system. If you don't have libpq installed, you can manually install psycopg with the binary extra to include a statically linked version of libpq:

pip install psycopg[binary]

This installs a self-contained package with all the libraries needed, but the binary package is not available for every platform. Check the supported platforms: https://pypi.org/project/psycopg-binary/#files

If your platform is not supported, use a local installation: https://www.psycopg.org/psycopg3/docs/basic/install.html#local-installation

The default database and collections are created automatically if they do not exist.
Default database: checkpoing_db
Default collection: checkpoint_writes_aio (langgraph checkpoint writes)
Default collection: checkpoints_aio (langgraph checkpoints)
Default collection: chat_streams (chat stream events for replaying conversations)

You need to set the following environment variables in your .env file:

# Enable LangGraph checkpoint saver, supports MongoDB, Postgres
LANGGRAPH_CHECKPOINT_SAVER=true
# Set the database URL for saving checkpoints
LANGGRAPH_CHECKPOINT_DB_URL="mongodb://localhost:27017/"
#LANGGRAPH_CHECKPOINT_DB_URL=postgresql://localhost:5432/postgres
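Since one variable carries either a MongoDB or a Postgres URL, the backend can be inferred from the URL scheme. The sketch below shows one way to do that; `checkpoint_backend` is a hypothetical helper, and whether DeerFlow selects its saver exactly this way is an assumption.

```python
from urllib.parse import urlparse


def checkpoint_backend(db_url):
    """Map a checkpoint database URL to a backend name based on its scheme."""
    scheme = urlparse(db_url).scheme.lower()
    if scheme in ("mongodb", "mongodb+srv"):
        return "mongodb"
    if scheme in ("postgresql", "postgres"):
        return "postgres"
    raise ValueError(f"Unsupported checkpoint database URL scheme: {scheme!r}")


if __name__ == "__main__":
    print(checkpoint_backend("mongodb://localhost:27017/"))
    print(checkpoint_backend("postgresql://localhost:5432/postgres"))
```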

Docker

You can also run this project with Docker.

First, you need to read the configuration below. Make sure the .env and conf.yaml files are ready.

Second, build a Docker image of your own web server:

docker build -t deer-flow-api .

Finally, start a Docker container running the web server:

# Replace deer-flow-api-app with your preferred container name
# Start the server then bind to localhost:8000
docker run -d -t -p 127.0.0.1:8000:8000 --env-file .env --name deer-flow-api-app deer-flow-api

# stop the server
docker stop deer-flow-api-app

Docker Compose (include both backend and frontend)

DeerFlow provides a docker-compose setup to easily run both the backend and frontend together:

# building docker image
docker compose build

# start the server
docker compose up

Warning

If you want to deploy DeerFlow into production environments, please add authentication to the website and evaluate the security of the MCP server and the Python REPL.

Examples

The following examples demonstrate the capabilities of DeerFlow:

Research Reports

OpenAI Sora Report - Analysis of OpenAI's Sora AI tool
- Discusses features, access, prompt engineering, limitations, and ethical considerations
- View full report
Google's Agent to Agent Protocol Report - Overview of Google's Agent to Agent (A2A) protocol
- Discusses its role in AI agent communication and its relationship with Anthropic's Model Context Protocol (MCP)
- View full report
What is MCP? - A comprehensive analysis of the term "MCP" across multiple contexts
- Explores Model Context Protocol in AI, Monocalcium Phosphate in chemistry, and Micro-channel Plate in electronics
- View full report
Bitcoin Price Fluctuations - Analysis of recent Bitcoin price movements
- Examines market trends, regulatory influences, and technical indicators
- Provides recommendations based on historical data
- View full report
What is LLM? - An in-depth exploration of Large Language Models
- Discusses architecture, training, applications, and ethical considerations
- View full report
How to Use Claude for Deep Research? - Best practices and workflows for using Claude in deep research
- Covers prompt engineering, data analysis, and integration with other tools
- View full report
AI Adoption in Healthcare: Influencing Factors - Analysis of factors driving AI adoption in healthcare
- Discusses AI technologies, data quality, ethical considerations, economic evaluations, organizational readiness, and digital infrastructure
- View full report
Quantum Computing Impact on Cryptography - Analysis of quantum computing's impact on cryptography
- Discusses vulnerabilities of classical cryptography, post-quantum cryptography, and quantum-resistant cryptographic solutions
- View full report
Cristiano Ronaldo's Performance Highlights - Analysis of Cristiano Ronaldo's performance highlights
- Discusses his career achievements, international goals, and performance in various matches
- View full report

To run these examples or create your own research reports, you can use the following commands:

# Run with a specific query
uv run main.py "What factors are influencing AI adoption in healthcare?"

# Run with custom planning parameters
uv run main.py --max_plan_iterations 3 "How does quantum computing impact cryptography?"

# Run in interactive mode with built-in questions
uv run main.py --interactive

# Or run with basic interactive prompt
uv run main.py

# View all available options
uv run main.py --help

Interactive Mode

The application now supports an interactive mode with built-in questions in both English and Chinese:

Launch the interactive mode:
```
uv run main.py --interactive
```
Select your preferred language (English or 中文)
Choose from a list of built-in questions or select the option to ask your own question
The system will process your question and generate a comprehensive research report

Human in the Loop

DeerFlow includes a human in the loop mechanism that allows you to review, edit, and approve research plans before they are executed:

Plan Review: When human in the loop is enabled, the system will present the generated research plan for your review before execution
Providing Feedback: You can:
- Accept the plan by responding with [ACCEPTED]
- Edit the plan by providing feedback (e.g., [EDIT PLAN] Add more steps about technical implementation)
- The system will incorporate your feedback and generate a revised plan
Auto-acceptance: You can enable auto-acceptance to skip the review process:
- Via API: Set auto_accepted_plan: true in your request

API Integration: When using the API, you can provide feedback through the feedback parameter:

{
  "messages": [{ "role": "user", "content": "What is quantum computing?" }],
  "thread_id": "my_thread_id",
  "auto_accepted_plan": false,
  "feedback": "[EDIT PLAN] Include more about quantum algorithms"
}
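The request body above, and the two feedback prefixes, can be handled with a couple of small helpers. This is a sketch only: `build_request` and `classify_feedback` are hypothetical names, and the strict, case-sensitive prefix matching is an assumption about how the markers are interpreted rather than DeerFlow's documented behavior.

```python
import json


def classify_feedback(feedback):
    """Return ("accept", "") or ("edit", instructions) for a feedback string."""
    text = feedback.strip()
    if text.startswith("[ACCEPTED]"):
        return "accept", ""
    if text.startswith("[EDIT PLAN]"):
        return "edit", text[len("[EDIT PLAN]"):].strip()
    raise ValueError(f"Unrecognized feedback: {feedback!r}")


def build_request(question, thread_id, feedback=None, auto_accept=False):
    """Serialize a request body with the fields shown in the JSON example."""
    body = {
        "messages": [{"role": "user", "content": question}],
        "thread_id": thread_id,
        "auto_accepted_plan": auto_accept,
    }
    if feedback is not None:
        body["feedback"] = feedback
    return json.dumps(body)


if __name__ == "__main__":
    print(build_request(
        "What is quantum computing?",
        "my_thread_id",
        feedback="[EDIT PLAN] Include more about quantum algorithms",
    ))
```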

Command Line Arguments

The application supports several command-line arguments to customize its behavior:

query: The research query to process (can be multiple words)
--interactive: Run in interactive mode with built-in questions
--max_plan_iterations: Maximum number of planning cycles (default: 1)
--max_step_num: Maximum number of steps in a research plan (default: 3)
--debug: Enable detailed debug logging
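For readers who want to see how such an interface is typically declared, here is a minimal argparse sketch that mirrors the arguments and defaults listed above. It is not a copy of main.py: the `build_parser` function and the help strings are assumptions for illustration.

```python
import argparse


def build_parser():
    """Declare the arguments listed above with their documented defaults."""
    parser = argparse.ArgumentParser(description="Run a research query")
    # nargs="*" lets the query span multiple words and also be omitted entirely.
    parser.add_argument("query", nargs="*", help="The research query to process")
    parser.add_argument("--interactive", action="store_true",
                        help="Run in interactive mode with built-in questions")
    parser.add_argument("--max_plan_iterations", type=int, default=1,
                        help="Maximum number of planning cycles")
    parser.add_argument("--max_step_num", type=int, default=3,
                        help="Maximum number of steps in a research plan")
    parser.add_argument("--debug", action="store_true",
                        help="Enable detailed debug logging")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(" ".join(args.query), args)
```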

FAQ

Please refer to the FAQ.md for more details.

License

This project is open source and available under the MIT License.

Acknowledgments

DeerFlow is built upon the incredible work of the open-source community. We are deeply grateful to all the projects and contributors whose efforts have made DeerFlow possible. Truly, we stand on the shoulders of giants.

We would like to extend our sincere appreciation to the following projects for their invaluable contributions:

LangChain: Their exceptional framework powers our LLM interactions and chains, enabling seamless integration and functionality.
LangGraph: Their innovative approach to multi-agent orchestration has been instrumental in enabling DeerFlow's sophisticated workflows.
Novel: Their Notion-style WYSIWYG editor supports our report editing and AI-assisted rewriting.
RAGFlow: We have achieved support for research on users' private knowledge bases through integration with RAGFlow.

These projects exemplify the transformative power of open-source collaboration, and we are proud to build upon their foundations.

Key Contributors

A heartfelt thank you goes out to the core authors of DeerFlow, whose vision, passion, and dedication have brought this project to life:

Daniel Walnut
Henry Li

Your unwavering commitment and expertise have been the driving force behind DeerFlow's success. We are honored to have you at the helm of this journey.
