DeepTutor: AI-Powered Personalized Learning Assistant

Quick Start · Core Modules · FAQ
🇨🇳 中文 · 🇯🇵 日本語 · 🇪🇸 Español · 🇫🇷 Français · 🇸🇦 العربية · 🇷🇺 Русский · 🇮🇳 हिन्दी · 🇵🇹 Português
📚 Massive Document Knowledge Q&A • 🎨 Interactive Learning Visualization
🎯 Knowledge Reinforcement • 🔍 Deep Research & Idea Generation
[2026.1.3] Released DeepTutor v0.2.0 - thanks to all the contributors! ❤️
[2026.1.1] Happy New Year! Join our GitHub Discussions - shape the future of DeepTutor! 💬
[2025.12.30] Visit our Official Website for more details!
[2025.12.29] DeepTutor v0.1 is now live! ✨
- Smart Knowledge Base: Upload textbooks, research papers, technical manuals, and domain-specific documents. Build a comprehensive AI-powered knowledge repository for instant access.
- Multi-Agent Problem Solving: Dual-loop reasoning architecture with RAG, web search, and code execution -- delivering step-by-step solutions with precise citations.
- Knowledge Simplification & Explanations: Transform complex concepts and algorithms into easy-to-understand visual aids, detailed step-by-step breakdowns, and engaging interactive demonstrations.
- Personalized Q&A: Context-aware conversations that adapt to your learning progress, with interactive pages and session-based knowledge tracking.
- Intelligent Exercise Creation: Generate targeted quizzes, practice problems, and customized assessments tailored to your current knowledge level and specific learning objectives.
- Authentic Exam Simulation: Upload reference exams to generate practice questions that match the original style, format, and difficulty, giving you realistic preparation for the actual test.
- Comprehensive Research & Literature Review: Conduct in-depth topic exploration with systematic analysis. Identify patterns, connect related concepts across disciplines, and synthesize existing research findings.
- Novel Insight Discovery: Generate structured learning materials and uncover knowledge gaps. Identify promising new research directions through intelligent cross-domain knowledge synthesis.
Demo screenshots: Multi-agent Problem Solving with Exact Citations · Step-by-step Visual Explanations with Personal Q&As · Custom Questions · Mimic Questions · Personal Knowledge Base · Personal Notebook
🌙 Use DeepTutor in Dark Mode!
- Intuitive Interaction: Simple bidirectional query-response flow.
- Structured Output: Response generation that organizes complex information into actionable outputs.
- Problem Solving & Assessment: Step-by-step problem solving and custom assessment generation.
- Research & Learning: Deep Research for topic exploration and Guided Learning with visualization.
- Idea Generation: Automated and interactive concept development with multi-source insights.
- Information Retrieval: RAG hybrid retrieval, real-time web search, and academic paper databases.
- Processing & Analysis: Python code execution, query item lookup, and PDF parsing for document analysis.
- Knowledge Graph: Entity-relation mapping for semantic connections and knowledge discovery.
- Vector Store: Embedding-based semantic search for intelligent content retrieval.
- Memory System: Session state management and citation tracking for contextual continuity.
🌟 Star to follow our future updates!
- Support Local LLM Services (e.g., ollama)
- Refactor RAG Module (see Discussions)
- Deep-coding from idea generation
- Personalized Interaction with Notebook
① Clone Repository
```bash
git clone https://github.com/HKUDS/DeepTutor.git
cd DeepTutor
```
② Set Up Environment Variables
```bash
cp .env.example .env
# Edit .env file with your API keys
```
📋 Environment Variables Reference
| Variable | Required | Description |
|---|---|---|
| LLM_MODEL | Yes | Model name (e.g., gpt-4o) |
| LLM_BINDING_API_KEY | Yes | Your LLM API key |
| LLM_BINDING_HOST | Yes | API endpoint URL |
| EMBEDDING_MODEL | Yes | Embedding model name |
| EMBEDDING_BINDING_API_KEY | Yes | Embedding API key |
| EMBEDDING_BINDING_HOST | Yes | Embedding API endpoint |
| BACKEND_PORT | No | Backend port (default: 8001) |
| FRONTEND_PORT | No | Frontend port (default: 3782) |
| TTS_* | No | Text-to-Speech settings |
| PERPLEXITY_API_KEY | No | For web search |
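For orientation, a minimal .env might look like the sketch below. Treat the values as placeholders (the endpoints and model names are examples, not defaults); copy the full variable set from .env.example.

```env
# Sketch only -- start from .env.example and substitute your own values
LLM_MODEL=gpt-4o
LLM_BINDING_API_KEY=your-llm-api-key
LLM_BINDING_HOST=https://api.openai.com/v1
EMBEDDING_MODEL=text-embedding-3-large
EMBEDDING_BINDING_API_KEY=your-embedding-api-key
EMBEDDING_BINDING_HOST=https://api.openai.com/v1
# Optional
BACKEND_PORT=8001
FRONTEND_PORT=3782
PERPLEXITY_API_KEY=your-perplexity-key
```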
③ Configure Ports & LLM (Optional)
- Ports: Edit config/main.yaml → server.backend_port / server.frontend_port
- LLM: Edit config/agents.yaml → temperature / max_tokens per module
- See Configuration Docs for details; a minimal sketch of both edits follows this list
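A minimal sketch of those two edits, assuming the key names referenced above and the default ports from the variables table; consult the Configuration Docs for the full schema:

```yaml
# config/main.yaml -- ports (sketch; other settings omitted)
server:
  backend_port: 8001
  frontend_port: 3782

# config/agents.yaml -- per-module LLM parameters (sketch; values are examples)
research:
  temperature: 0.5
  max_tokens: 12000
```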
④ Try Demo Knowledge Bases (Optional)
📚 Available Demos
- Research Papers — 5 papers from our lab (AI-Researcher, LightRAG, etc.)
- Data Science Textbook — 8 chapters, 296 pages (Book Link)
- Download from Google Drive
- Extract into the data/ directory

Demo KBs use text-embedding-3-large with dimensions = 3072.
⑤ Create Your Own Knowledge Base (After Launch)
- Go to http://localhost:3782/knowledge
- Click "New Knowledge Base" → Enter name → Upload PDF/TXT/MD files
- Monitor progress in terminal
Docker (Recommended — No Python/Node.js setup)

Prerequisites: Docker & Docker Compose

Quick Start:
```bash
# Build and start (~5-10 min first run)
docker compose up --build -d
# View logs
docker compose logs -f
```

Commands:
```bash
docker compose up -d        # Start
docker compose logs -f      # Logs
docker compose down         # Stop
docker compose up --build   # Rebuild
```

Advanced:
```bash
# Build custom image
docker build -t deeptutor:latest .
# Run standalone
docker run -p 8001:8001 -p 3782:3782 \
  --env-file .env deeptutor:latest
```

Manual Setup (For development or non-Docker environments)

Prerequisites: Python 3.10+, Node.js 18+

Set Up Environment:
```bash
# Using conda (Recommended)
conda create -n deeptutor python=3.10
conda activate deeptutor
# Or using venv
python -m venv venv
source venv/bin/activate
```

Install Dependencies:
```bash
bash scripts/install_all.sh
# Or manually:
pip install -r requirements.txt
npm install --prefix web
```

Launch:
```bash
# Start web interface
python scripts/start_web.py
# Or CLI only
python scripts/start.py
# Stop: Ctrl+C
```
| Service | URL | Description |
|---|---|---|
| Frontend | http://localhost:3782 | Main web interface |
| API Docs | http://localhost:8001/docs | Interactive API documentation |
All user content and system data are stored in the data/ directory:
```
data/
├── knowledge_bases/        # Knowledge base storage
└── user/                   # User activity data
    ├── solve/              # Problem solving results and artifacts
    ├── question/           # Generated questions
    ├── research/           # Research reports and cache
    ├── co-writer/          # Interactive IdeaGen documents and audio files
    ├── notebook/           # Notebook records and metadata
    ├── guide/              # Guided learning sessions
    ├── logs/               # System logs
    └── run_code_workspace/ # Code execution workspace
```
Results are automatically saved during all activities. Directories are created automatically as needed.
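For example, a small helper like the sketch below (hypothetical, not part of the codebase) can locate the most recent timestamped run folder for a module:

```python
from pathlib import Path

def latest_run(module: str, base: str = "data/user") -> Path | None:
    """Return the most recent timestamped output folder for a module, e.g. 'solve' or 'question'."""
    runs = sorted(p for p in Path(base, module).glob("*_*") if p.is_dir())
    return runs[-1] if runs else None  # timestamp suffixes sort chronologically

print(latest_run("solve"))  # e.g. data/user/solve/solve_YYYYMMDD_HHMMSS
```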
🧠 Smart Solver
Intelligent problem-solving system based on an Analysis Loop + Solve Loop dual-loop architecture, supporting multi-mode reasoning and dynamic knowledge retrieval.
Core Features
| Feature | Description |
|---|---|
| Dual-Loop Architecture | Analysis Loop: InvestigateAgent → NoteAgent Solve Loop: PlanAgent → ManagerAgent → SolveAgent → CheckAgent → Format |
| Multi-Agent Collaboration | Specialized agents: InvestigateAgent, NoteAgent, PlanAgent, ManagerAgent, SolveAgent, CheckAgent |
| Real-time Streaming | WebSocket transmission with live reasoning process display |
| Tool Integration | RAG (naive/hybrid), Web Search, Query Item, Code Execution |
| Persistent Memory | JSON-based memory files for context preservation |
| Citation Management | Structured citations with reference tracking |
Usage
- Visit http://localhost:{frontend_port}/solver
- Select a knowledge base
- Enter your question, click "Solve"
- Watch the real-time reasoning process and final answer
Python API
```python
import asyncio
from src.agents.solve import MainSolver

async def main():
    solver = MainSolver(kb_name="ai_textbook")
    result = await solver.solve(
        question="Calculate the linear convolution of x=[1,2,3] and h=[4,5]",
        mode="auto"
    )
    print(result['formatted_solution'])

asyncio.run(main())
```
Output Location
```
data/user/solve/solve_YYYYMMDD_HHMMSS/
├── investigate_memory.json   # Analysis Loop memory
├── solve_chain.json          # Solve Loop steps & tool records
├── citation_memory.json      # Citation management
├── final_answer.md           # Final solution (Markdown)
├── performance_report.json   # Performance monitoring
└── artifacts/                # Code execution outputs
```
📝 Question Generator
Dual-mode question generation system supporting custom knowledge-based generation and reference exam paper mimicking with automatic validation.
Core Features
| Feature | Description |
|---|---|
| Custom Mode | Background Knowledge → Question Planning → Generation → Single-Pass Validation. Analyzes question relevance without rejection logic |
| Mimic Mode | PDF Upload → MinerU Parsing → Question Extraction → Style Mimicking. Generates questions based on reference exam structure |
| ReAct Engine | QuestionGenerationAgent with autonomous decision-making (think → act → observe) |
| Validation Analysis | Single-pass relevance analysis with kb_coverage and extension_points |
| Question Types | Multiple choice, fill-in-the-blank, calculation, written response, etc. |
| Batch Generation | Parallel processing with progress tracking |
| Complete Persistence | All intermediate files saved (background knowledge, plan, individual results) |
| Timestamped Output | Mimic mode creates batch folders: mimic_YYYYMMDD_HHMMSS_{pdf_name}/ |
Usage
Custom Mode:
- Visit http://localhost:{frontend_port}/question
- Fill in requirements (topic, difficulty, question type, count)
- Click "Generate Questions"
- View generated questions with validation reports
Mimic Mode:
- Visit http://localhost:{frontend_port}/question
- Switch to "Mimic Exam" tab
- Upload PDF or provide parsed exam directory
- Wait for parsing → extraction → generation
- View generated questions alongside original references
Python API
Custom Mode - Full Pipeline:
```python
import asyncio
from src.agents.question import AgentCoordinator

async def main():
    coordinator = AgentCoordinator(
        kb_name="ai_textbook",
        output_dir="data/user/question"
    )
    # Generate multiple questions from a text requirement
    result = await coordinator.generate_questions_custom(
        requirement_text="Generate 3 medium-difficulty questions about deep learning basics",
        difficulty="medium",
        question_type="choice",
        count=3
    )
    print(f"✅ Generated {result['completed']}/{result['requested']} questions")
    for q in result['results']:
        print(f"- Relevance: {q['validation']['relevance']}")

asyncio.run(main())
```
Mimic Mode - PDF Upload:
```python
import asyncio
from src.agents.question.tools.exam_mimic import mimic_exam_questions

async def main():
    result = await mimic_exam_questions(
        pdf_path="exams/midterm.pdf",
        kb_name="calculus",
        output_dir="data/user/question/mimic_papers",
        max_questions=5
    )
    print(f"✅ Generated {result['successful_generations']} questions")
    print(f"Output: {result['output_file']}")

asyncio.run(main())
```
Output Location
Custom Mode:
```
data/user/question/custom_YYYYMMDD_HHMMSS/
├── background_knowledge.json   # RAG retrieval results
├── question_plan.json          # Question planning
├── question_1_result.json      # Individual question results
├── question_2_result.json
└── ...
```
Mimic Mode:
```
data/user/question/mimic_papers/
└── mimic_YYYYMMDD_HHMMSS_{pdf_name}/
    ├── {pdf_name}.pdf                                       # Original PDF
    ├── auto/{pdf_name}.md                                   # MinerU parsed markdown
    ├── {pdf_name}_YYYYMMDD_HHMMSS_questions.json            # Extracted questions
    └── {pdf_name}_YYYYMMDD_HHMMSS_generated_questions.json  # Generated questions
```
🎓 Guided Learning
Personalized learning system based on notebook content, automatically generating progressive learning paths through interactive pages and smart Q&A.
Core Features
| Feature | Description |
|---|---|
| Multi-Agent Architecture | LocateAgent: Identifies 3-5 progressive knowledge points InteractiveAgent: Converts to visual HTML pages ChatAgent: Provides contextual Q&A SummaryAgent: Generates learning summaries |
| Smart Knowledge Location | Automatic analysis of notebook content |
| Interactive Pages | HTML page generation with bug fixing |
| Smart Q&A | Context-aware answers with explanations |
| Progress Tracking | Real-time status with session persistence |
| Cross-Notebook Support | Select records from multiple notebooks |
Usage Flow
- Select Notebook(s) — Choose one or multiple notebooks (cross-notebook selection supported)
- Generate Learning Plan — LocateAgent identifies 3-5 core knowledge points
- Start Learning — InteractiveAgent generates HTML visualization
- Learning Interaction — Ask questions, click "Next" to proceed
- Complete Learning — SummaryAgent generates learning summary
Output Location
```
data/user/guide/
└── session_{session_id}.json   # Complete session state, knowledge points, chat history
```
✏️ Interactive IdeaGen (Co-Writer)
Intelligent Markdown editor supporting AI-assisted writing, auto-annotation, and TTS narration.
Core Features
| Feature | Description |
|---|---|
| Rich Text Editing | Full Markdown syntax support with live preview |
| EditAgent | Rewrite: Custom instructions with optional RAG/web context Shorten: Compress while preserving key information Expand: Add details and context |
| Auto-Annotation | Automatic key content identification and marking |
| NarratorAgent | Script generation, TTS audio, multiple voices (Cherry, Stella, Annie, Cally, Eva, Bella) |
| Context Enhancement | Optional RAG or web search for additional context |
| Multi-Format Export | Markdown, PDF, etc. |
Usage
- Visit http://localhost:{frontend_port}/co_writer
- Enter or paste text in the editor
- Use AI features: Rewrite, Shorten, Expand, Auto Mark, Narrate
- Export to Markdown or PDF
Output Location
```
data/user/co-writer/
├── audio/                              # TTS audio files
│   └── {operation_id}.mp3
├── tool_calls/                         # Tool call history
│   └── {operation_id}_{tool_type}.json
└── history.json                        # Edit history
```
🔬 Deep Research
DR-in-KG (Deep Research in Knowledge Graph) — A systematic deep research system based on a Dynamic Topic Queue architecture, enabling multi-agent collaboration across three phases: Planning → Researching → Reporting.
Core Features
| Feature | Description |
|---|---|
| Three-Phase Architecture | Phase 1 (Planning): RephraseAgent (topic optimization) + DecomposeAgent (subtopic decomposition) Phase 2 (Researching): ManagerAgent (queue scheduling) + ResearchAgent (research decisions) + NoteAgent (info compression) Phase 3 (Reporting): Deduplication → Three-level outline generation → Report writing with citations |
| Dynamic Topic Queue | Core scheduling system with TopicBlock state management: PENDING → RESEARCHING → COMPLETED/FAILED. Supports dynamic topic discovery during research |
| Execution Modes | Series Mode: Sequential topic processing Parallel Mode: Concurrent multi-topic processing with AsyncCitationManagerWrapper for thread-safe operations |
| Multi-Tool Integration | RAG (hybrid/naive),Query Item (entity lookup),Paper Search,Web Search,Code Execution — dynamically selected by ResearchAgent |
| Unified Citation System | Centralized CitationManager as single source of truth for citation ID generation, ref_number mapping, and deduplication |
| Preset Configurations | quick: Fast research (1-2 subtopics, 1-2 iterations) medium/standard: Balanced depth (5 subtopics, 4 iterations) deep: Thorough research (8 subtopics, 7 iterations) auto: Agent autonomously decides depth |
Citation System Architecture
The citation system follows a centralized design with CitationManager as the single source of truth:
The CitationManager owns three responsibilities: ID generation (PLAN-XX and CIT-X-XX), the citation_id → ref_number map, and deduplication (papers only). DecomposeAgent, ResearchAgent, and NoteAgent request IDs from it; ReportingAgent uses the map for inline [N] citations; and the References section is rendered from the same map.
| Component | Description |
|---|---|
| ID Format | PLAN-XX (planning stage RAG queries) + CIT-X-XX (research stage, X = block number) |
| ref_number Mapping | Sequential 1-based numbers built from sorted citation IDs, with paper deduplication |
| Inline Citations | Simple [N] format in LLM output, post-processed to clickable [[N]](#ref-N) links |
| Citation Table | Clear reference table provided to the LLM: Cite as [1] → (RAG) query preview... |
| Post-processing | Automatic format conversion + validation to remove invalid citation references |
| Parallel Safety | Thread-safe async methods (get_next_citation_id_async, add_citation_async) for concurrent execution |
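As an illustration of the post-processing behavior described above (a sketch only, not the project's ReportingAgent code): convert bare [N] markers to clickable [[N]](#ref-N) links when N is a known reference, and strip markers that fail validation.

```python
import re

def postprocess_citations(text: str, valid_refs: set[int]) -> str:
    """Turn bare [N] markers into [[N]](#ref-N) links; drop markers for unknown refs.

    Illustrative sketch of the described behavior, not the research module's code.
    """
    def repl(match: re.Match) -> str:
        n = int(match.group(1))
        return f"[[{n}]](#ref-{n})" if n in valid_refs else ""
    # Only match bare [N] markers, not ones already wrapped as [[N]](#ref-N)
    return re.sub(r"(?<!\[)\[(\d+)\](?!\])", repl, text)

text = "Attention is all you need [1], see also [7]."
print(postprocess_citations(text, valid_refs={1, 2, 3}))
# [1] becomes a clickable link; [7] is not in the map and is removed
```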
Parallel Execution Architecture
When execution_mode: "parallel" is enabled, multiple topic blocks are researched concurrently:
In parallel mode, the DynamicTopicQueue holds PENDING topic blocks that are dispatched as concurrent ResearchAgent tasks, bounded by an asyncio semaphore (max = 5). Each task obtains citation IDs through the thread-safe AsyncCitationManagerWrapper (get_next_citation_id_async, add_citation_async), and queue state updates go through the AsyncManagerAgentWrapper.
| Component | Description |
|---|---|
| asyncio.Semaphore | Limits concurrent tasks to max_parallel_topics (default: 5) |
| AsyncCitationManagerWrapper | Wraps CitationManager with asyncio.Lock() for thread-safe ID generation |
| AsyncManagerAgentWrapper | Ensures queue state updates are atomic across parallel tasks |
| Real-time Progress | Live display of all active research tasks with status indicators |
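The concurrency pattern itself is standard asyncio; a self-contained sketch (class and function names here are illustrative, not DeepTutor's) showing semaphore-bounded tasks sharing a lock-protected ID counter:

```python
import asyncio

class AsyncIdCounter:
    """Toy stand-in for a lock-protected citation ID source."""
    def __init__(self) -> None:
        self._lock = asyncio.Lock()
        self._n = 0

    async def next_id(self, block: int) -> str:
        async with self._lock:  # serialize ID generation across tasks
            self._n += 1
            return f"CIT-{block}-{self._n:02d}"

async def research_topic(topic: str, block: int, sem: asyncio.Semaphore, ids: AsyncIdCounter) -> str:
    async with sem:  # at most max_parallel_topics tasks run at once
        cid = await ids.next_id(block)
        await asyncio.sleep(0.1)  # placeholder for RAG / web search / paper search calls
        return f"{topic}: researched with citation {cid}"

async def main() -> None:
    sem = asyncio.Semaphore(5)  # mirrors max_parallel_topics: 5
    ids = AsyncIdCounter()
    topics = [f"Topic {i}" for i in range(1, 6)]
    results = await asyncio.gather(
        *(research_topic(t, i, sem, ids) for i, t in enumerate(topics, 1))
    )
    print("\n".join(results))

asyncio.run(main())
```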
Agent Responsibilities
| Agent | Phase | Responsibility |
|---|---|---|
| RephraseAgent | Planning | Optimizes user input topic, supports multi-turn user interaction for refinement |
| DecomposeAgent | Planning | Decomposes topic into subtopics with RAG context, obtains citation IDs from CitationManager |
| ManagerAgent | Researching | Queue state management, task scheduling, dynamic topic addition |
| ResearchAgent | Researching | Knowledge sufficiency check, query planning, tool selection, requests citation IDs before each tool call |
| NoteAgent | Researching | Compresses raw tool outputs into summaries, creates ToolTraces with pre-assigned citation IDs |
| ReportingAgent | Reporting | Builds citation map, generates three-level outline, writes report sections with citation tables, post-processes citations |
Report Generation Pipeline
```
1. Build Citation Map   → CitationManager.build_ref_number_map()
2. Generate Outline     → Three-level headings (H1 → H2 → H3)
3. Write Sections       → LLM uses [N] citations with the provided citation table
4. Post-process         → Convert [N] → [[N]](#ref-N), validate references
5. Generate References  → Academic-style entries with collapsible source details
```
Usage
- Visit http://localhost:{frontend_port}/research
- Enter research topic
- Select research mode (quick/medium/deep/auto)
- Watch real-time progress with parallel/series execution
- View structured report with clickable inline citations
- Export as Markdown or PDF (with proper page splitting and Mermaid diagram support)
CLI
```bash
# Quick mode (fast research)
python src/agents/research/main.py --topic "Deep Learning Basics" --preset quick
# Medium mode (balanced)
python src/agents/research/main.py --topic "Transformer Architecture" --preset medium
# Deep mode (thorough research)
python src/agents/research/main.py --topic "Graph Neural Networks" --preset deep
# Auto mode (agent decides depth)
python src/agents/research/main.py --topic "Reinforcement Learning" --preset auto
```
Python API
```python
import asyncio
from src.agents.research import ResearchPipeline
from src.core.core import get_llm_config, load_config_with_main

async def main():
    # Load configuration (main.yaml merged with any module-specific overrides)
    config = load_config_with_main("research_config.yaml")
    llm_config = get_llm_config()

    # Create pipeline (agent parameters loaded from agents.yaml automatically)
    pipeline = ResearchPipeline(
        config=config,
        api_key=llm_config["api_key"],
        base_url=llm_config["base_url"],
        kb_name="ai_textbook"  # Optional: override knowledge base
    )

    # Run research
    result = await pipeline.run(topic="Attention Mechanisms in Deep Learning")
    print(f"Report saved to: {result['final_report_path']}")

asyncio.run(main())
```
Output Location
```
data/user/research/
├── reports/                           # Final research reports
│   ├── research_YYYYMMDD_HHMMSS.md    # Markdown report with clickable citations [[N]](#ref-N)
│   └── research_*_metadata.json       # Research metadata and statistics
└── cache/                             # Research process cache
    └── research_YYYYMMDD_HHMMSS/
        ├── queue.json                 # DynamicTopicQueue state (TopicBlocks + ToolTraces)
        ├── citations.json             # Citation registry with ID counters and ref_number mapping
        │                              #   - citations: {citation_id: citation_info}
        │                              #   - counters: {plan_counter, block_counters}
        ├── step1_planning.json        # Planning phase results (subtopics + PLAN-XX citations)
        ├── planning_progress.json     # Planning progress events
        ├── researching_progress.json  # Researching progress events
        ├── reporting_progress.json    # Reporting progress events
        ├── outline.json               # Three-level report outline structure
        └── token_cost_summary.json    # Token usage statistics
```
Citation File Structure (citations.json):
```json
{
  "research_id": "research_20241209_120000",
  "citations": {
    "PLAN-01": {
      "citation_id": "PLAN-01",
      "tool_type": "rag_hybrid",
      "query": "...",
      "summary": "..."
    },
    "CIT-1-01": {
      "citation_id": "CIT-1-01",
      "tool_type": "paper_search",
      "papers": [...],
      ...
    }
  },
  "counters": {
    "plan_counter": 2,
    "block_counters": { "1": 3, "2": 2 }
  }
}
```
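As a rough illustration of how ref_numbers can be derived from this file (sequential 1-based numbers over sorted citation IDs, as described in the citation table above; a sketch, not the CitationManager implementation, which also deduplicates papers):

```python
import json

def build_ref_number_map(citations_path: str) -> dict[str, int]:
    """Assign sequential 1-based ref_numbers to sorted citation IDs (illustrative sketch)."""
    with open(citations_path, encoding="utf-8") as f:
        registry = json.load(f)
    citation_ids = sorted(registry["citations"].keys())
    return {cid: n for n, cid in enumerate(citation_ids, start=1)}

# Usage (path follows the cache layout shown above):
# ref_map = build_ref_number_map("data/user/research/cache/research_YYYYMMDD_HHMMSS/citations.json")
```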
Configuration Options
Key configuration in config/main.yaml (research section) and config/agents.yaml:
```yaml
# config/agents.yaml - Agent LLM parameters
research:
  temperature: 0.5
  max_tokens: 12000

# config/main.yaml - Research settings
research:
  # Execution Mode
  researching:
    execution_mode: "parallel"     # "series" or "parallel"
    max_parallel_topics: 5         # Max concurrent topics
    max_iterations: 5              # Max iterations per topic
    # Tool Switches
    enable_rag_hybrid: true        # Hybrid RAG retrieval
    enable_rag_naive: true         # Basic RAG retrieval
    enable_paper_search: true      # Academic paper search
    enable_web_search: true        # Web search (also controlled by tools.web_search.enabled)
    enable_run_code: true          # Code execution
  # Queue Limits
  queue:
    max_length: 5                  # Maximum topics in queue
  # Reporting
  reporting:
    enable_inline_citations: true  # Enable clickable [N] citations in report
  # Presets: quick, medium, deep, auto

# Global tool switches in tools section
tools:
  web_search:
    enabled: true                  # Global web search switch (higher priority)
```
💡 Automated IdeaGen
Research idea generation system that extracts knowledge points from notebook records and generates research ideas through multi-stage filtering.
Core Features
| Feature | Description |
|---|---|
| MaterialOrganizerAgent | Extracts knowledge points from notebook records |
| Multi-Stage Filtering | Loose Filter → Explore Ideas (5+ per point) → Strict Filter → Generate Markdown |
| Idea Exploration | Innovative thinking from multiple dimensions |
| Structured Output | Organized markdown with knowledge points and ideas |
| Progress Callbacks | Real-time updates for each stage |
Usage
- Visit http://localhost:{frontend_port}/ideagen
- Select a notebook with records
- Optionally provide user thoughts/preferences
- Click "Generate Ideas"
- View generated research ideas organized by knowledge points
Python API
```python
import asyncio
from src.agents.ideagen import IdeaGenerationWorkflow, MaterialOrganizerAgent
from src.core.core import get_llm_config

async def main():
    llm_config = get_llm_config()

    # Step 1: Extract knowledge points from materials
    organizer = MaterialOrganizerAgent(
        api_key=llm_config["api_key"],
        base_url=llm_config["base_url"]
    )
    knowledge_points = await organizer.extract_knowledge_points(
        "Your learning materials or notebook content here"
    )

    # Step 2: Generate research ideas
    workflow = IdeaGenerationWorkflow(
        api_key=llm_config["api_key"],
        base_url=llm_config["base_url"]
    )
    result = await workflow.process(knowledge_points)
    print(result)  # Markdown formatted research ideas

asyncio.run(main())
```
📊 Dashboard + Knowledge Base Management
Unified system entry providing activity tracking, knowledge base management, and system status monitoring.
Key Features
| Feature | Description |
|---|---|
| Activity Statistics | Recent solving/generation/research records |
| Knowledge Base Overview | KB list, statistics, incremental updates |
| Notebook Statistics | Notebook counts, record distribution |
| Quick Actions | One-click access to all modules |
Usage
- Web Interface: Visit http://localhost:{frontend_port} to view the system overview
- Create KB: Click "New Knowledge Base", upload PDF/Markdown documents
- View Activity: Check recent learning activities on Dashboard
📓 Notebook
Unified learning record management, connecting outputs from all modules to create a personalized learning knowledge base.
Core Features
| Feature | Description |
|---|---|
| Multi-Notebook Management | Create, edit, delete notebooks |
| Unified Record Storage | Integrate solving/generation/research/Interactive IdeaGen records |
| Categorization Tags | Auto-categorize by type, knowledge base |
| Custom Appearance | Color, icon personalization |
Usage
- Visit http://localhost:{frontend_port}/notebook
- Create new notebook (set name, description, color, icon)
- After completing tasks in other modules, click "Add to Notebook"
- View and manage all records on the notebook page
| Configuration | Data Directory | API Backend | Core Utilities |
|---|---|---|---|
| Knowledge Base | Tools | Web Frontend | Solve Module |
| Question Module | Research Module | Interactive IdeaGen Module | Guide Module |
| Automated IdeaGen Module | | | |
Backend fails to start?
Checklist
- Confirm Python version >= 3.10
- Confirm all dependencies installed: pip install -r requirements.txt
- Check if port 8001 is in use (configurable in config/main.yaml)
- Check the .env file configuration
Solutions
- Change port: Edit server.backend_port in config/main.yaml
- Check logs: Review terminal error messages
Port occupied after Ctrl+C?
Problem
After pressing Ctrl+C during a running task (e.g., deep research), restarting shows "port already in use" error.
Cause
Ctrl+C sometimes only terminates the frontend process while the backend continues running in the background.
Solution
```bash
# macOS/Linux: Find and kill the process
lsof -i :8001
kill -9 <PID>

# Windows: Find and kill the process
netstat -ano | findstr :8001
taskkill /PID <PID> /F
```
Then restart the service with python scripts/start_web.py.
npm: command not found error?
Problem
Running scripts/start_web.py shows npm: command not found or exit status 127.
Checklist
- Check if npm is installed: npm --version
- Check if Node.js is installed: node --version
- Confirm conda environment is activated (if using conda)
Solutions
```bash
# Option A: Using Conda (Recommended)
conda install -c conda-forge nodejs

# Option B: Using the official installer
# Download from https://nodejs.org/

# Option C: Using nvm
nvm install 18
nvm use 18
```
Verify Installation
```bash
node --version   # Should show v18.x.x or higher
npm --version    # Should show a version number
```
Frontend cannot connect to backend?
Checklist
- Confirm backend is running (visit http://localhost:8001/docs)
- Check browser console for error messages
Solution
Create .env.local in the web directory:
```
NEXT_PUBLIC_API_BASE=http://localhost:8001
```
WebSocket connection fails?
Checklist
- Confirm backend is running
- Check firewall settings
- Confirm WebSocket URL is correct
Solution
- Check backend logs
- Confirm URL format: ws://localhost:8001/api/v1/...
Where are module outputs stored?
| Module | Output Path |
|---|---|
| Solve | data/user/solve/solve_YYYYMMDD_HHMMSS/ |
| Question | data/user/question/question_YYYYMMDD_HHMMSS/ |
| Research | data/user/research/reports/ |
| Interactive IdeaGen | data/user/co-writer/ |
| Notebook | data/user/notebook/ |
| Guide | data/user/guide/session_{session_id}.json |
| Logs | data/user/logs/ |
How to add a new knowledge base?
Web Interface
- Visit http://localhost:{frontend_port}/knowledge
- Click "New Knowledge Base"
- Enter knowledge base name
- Upload PDF/TXT/MD documents
- System will process documents in background
CLI
```bash
python -m src.knowledge.start_kb init <kb_name> --docs <pdf_path>
```
How to incrementally add documents to existing KB?
CLI (Recommended)
```bash
python -m src.knowledge.add_documents <kb_name> --docs <new_document.pdf>
```
Benefits
- Only processes new documents, saves time and API costs
- Automatically merges with existing knowledge graph
- Preserves all existing data
Numbered items extraction failed with uvloop.Loop error?
Problem
When initializing a knowledge base, you may encounter this error:
```
ValueError: Can't patch loop of type <class 'uvloop.Loop'>
```
This occurs because Uvicorn uses the uvloop event loop by default, which is incompatible with nest_asyncio.
Solution
Use one of the following methods to extract numbered items:
```bash
# Option 1: Using the shell script (recommended)
./scripts/extract_numbered_items.sh <kb_name>

# Option 2: Direct Python command
python src/knowledge/extract_numbered_items.py --kb <kb_name> --base-dir ./data/knowledge_bases
```
This will extract numbered items (Definitions, Theorems, Equations, etc.) from your knowledge base without reinitializing it.
This project is licensed under the AGPL-3.0 License.
We welcome contributions from the community! To ensure code quality and consistency, please follow the guidelines below.
Development Setup
This project uses pre-commit hooks to automatically format code and check for issues before commits.
Step 1: Install pre-commit
```bash
# Using pip
pip install pre-commit

# Or using conda
conda install -c conda-forge pre-commit
```
Step 2: Install Git hooks
```bash
cd DeepTutor
pre-commit install
```
Step 3: (Optional) Run checks on all files
pre-commit run --all-files
Every time you run git commit, pre-commit hooks will automatically:
- Format Python code with Ruff
- Format frontend code with Prettier
- Check for syntax errors
- Validate YAML/JSON files
- Detect potential security issues
| Tool | Purpose | Configuration |
|---|---|---|
| Ruff | Python linting & formatting | pyproject.toml |
| Prettier | Frontend code formatting | web/.prettierrc.json |
| detect-secrets | Security check | .secrets.baseline |
Note: The project uses Ruff format instead of Black to avoid formatting conflicts.
```bash
# Normal commit (hooks run automatically)
git commit -m "Your commit message"

# Manually check all files
pre-commit run --all-files

# Update hooks to latest versions
pre-commit autoupdate

# Skip hooks (not recommended, only for emergencies)
git commit --no-verify -m "Emergency fix"
```
- Fork and Clone: Fork the repository and clone your fork
- Create Branch: Create a feature branch from main
- Install Pre-commit: Follow the setup steps above
- Make Changes: Write your code following the project's style
- Test: Ensure your changes work correctly
- Commit: Pre-commit hooks will automatically format your code
- Push and PR: Push to your fork and create a Pull Request
- Use GitHub Issues to report bugs or suggest features
- Provide detailed information about the issue
- Include steps to reproduce if it's a bug
❤️ We thank all our contributors for their valuable contributions.
| ⚡ LightRAG | 🎨 RAG-Anything | 💻 DeepCode | 🔬 AI-Researcher |
|---|---|---|---|
| Simple and Fast RAG | Multimodal RAG | AI Code Assistant | Research Automation |
⭐ Star us · 🐛 Report a bug · 💬 Discussions
✨ Thanks for visiting DeepTutor!