HuixiangDou: Overcoming Group Chat Scenarios with LLM-based Technical Assistance



HuixiangDou2 (ACL 2025) is a GraphRAG solution whose effectiveness has been demonstrated in the plant-science domain and which contributed to the cover paper in Molecular Plant (Cell Press). If you work outside computer science, give the new release a try.


English | 简体中文

Wechat | Readthedocs | YouTube | BiliBili | Discord | Arxiv

HuixiangDou1 is a professional knowledge assistant based on LLMs.

Advantages:

  1. A three-stage pipeline of preprocessing, rejection, and response (see the sketch after this list)
  2. No training required; works with CPU-only, 2 GB, and 10 GB GPU configurations
  3. Offers a complete suite of Web, Android, and pipeline source code; industrial-grade and commercially viable
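To make the first advantage concrete, here is a minimal conceptual sketch of the preprocess → rejection → response flow. Every name in it is illustrative, not the project's actual API; the threshold plays the role of the reject_throttle setting discussed in the FAQ:

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class KnowledgeBase:
    """Illustrative stand-in for the project's retriever."""
    docs: Dict[str, str]  # {doc_name: document text}

    def similarity(self, text: str) -> float:
        # Toy score: fraction of query words that appear in any document.
        words = text.lower().split()
        hits = sum(any(w in doc.lower() for doc in self.docs.values()) for w in words)
        return hits / max(len(words), 1)

    def retrieve(self, text: str) -> List[str]:
        words = text.lower().split()
        return [name for name, doc in self.docs.items()
                if any(w in doc.lower() for w in words)]

def handle_group_message(query: str, kb: KnowledgeBase,
                         reject_throttle: float = 0.5) -> Optional[str]:
    # 1. Preprocess: normalize the raw chat message.
    text = query.strip()
    if not text:
        return None
    # 2. Rejection: stay silent on chit-chat scoring below the threshold.
    if kb.similarity(text) < reject_throttle:
        return None
    # 3. Response: retrieve references and answer (the LLM call is stubbed).
    refs = kb.retrieve(text)
    return f"(LLM answer grounded in {refs})"

if __name__ == "__main__":
    kb = KnowledgeBase(docs={
        "installation.md": "What is in the Hundred-Plant Garden? It has a rich "
                           "variety of natural landscapes and life."})
    print(handle_group_message("What is in the Hundred-Plant Garden?", kb))  # answered
    print(handle_group_message("How is the weather today?", kb))             # None (rejected)
```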

Check out the scenes in which HuixiangDou is running and the current public service status:

  • readthedocs ChatWithAI (cpu-only) is available
  • OpenXLab uses a GPU and is under continuous maintenance
  • The WeChat bot has a cost associated with WeChat integration; all code has been verified functional for one year. Please deploy it on your own, with either the free or the commercial version.

If this helps you, please give it a star ⭐

🔆 New Features

Our Web version has been released to OpenXLab, where you can create knowledge bases, update positive and negative examples, turn on web search, test chat, and integrate into Feishu/WeChat groups. See BiliBili and YouTube!

The Web version's API for Android also supports other devices. See the Python sample code.

📖 Support Status

Support covers five categories: LLM, file format, retrieval method, integration, and preprocessing. Supported file formats:

  • excel
  • html
  • markdown
  • pdf
  • ppt
  • txt
  • word

📦 Hardware Requirements

The following are the GPU memory requirements for the different features; the configurations differ only in which options are turned on.

| Configuration Example | GPU Memory Requirement | Description | Verified on Linux |
| :-- | :-- | :-- | :-- |
| config-cpu.ini | - | Use siliconcloud API for text only | |
| config.ini (Standard Edition) | 2 GB | Use openai API (such as kimi, deepseek, and stepfun) for text only | |
| config-multimodal.ini | 10 GB | Use openai API for LLM; image and text retrieval | |

🔥 Running the Standard Edition

We use the Standard Edition (locally running LLM, text retrieval) as the introductory example. The other versions differ only in configuration options.

I. Download and install dependencies

Click to agree to the BCE model agreement, then log in to Hugging Face:

```bash
huggingface-cli login
```

Install dependencies

```bash
# requirements for parsing `word` format
apt update
apt install python-dev libxml2-dev libxslt1-dev antiword unrtf poppler-utils pstotext tesseract-ocr flac ffmpeg lame libmad0 libsox-fmt-mp3 sox libjpeg-dev swig libpulse-dev
# python requirements
pip install -r requirements.txt
# For python3.8, install faiss-gpu instead of faiss
```

II. Create knowledge base

We use some novels to build the knowledge base and filter questions. If you have your own documents, just put them under repodir.

Copy and execute all the following commands (including the '#' symbol).

```bash
# Download the knowledge base. We only take some documents as an example;
# you can put any of your own documents under `repodir`
cd HuixiangDou
mkdir repodir
cp -rf resource/data* repodir/
# Build the knowledge base. This saves the features of repodir to workdir
# and updates the positive and negative example thresholds in `config.ini`
mkdir workdir
python3 -m huixiangdou.services.store
# You can also build the knowledge base from QA pairs (CSV or JSON format)
# CSV: first column is the key (question), second column is the value (answer)
# JSON: {"question1": "answer1", "question2": "answer2", ...}
# python3 -m huixiangdou.services.store --qa-pair resource/data/qa_pair.csv
```
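For reference, here is a minimal sketch that writes a QA-pair file in the JSON shape described above (the file name and QA contents are made-up placeholders):

```python
import json

# Hypothetical QA pairs; replace them with your own domain knowledge.
qa_pairs = {
    "How do I build the knowledge base?": "Run `python3 -m huixiangdou.services.store`.",
    "Which file formats are supported?": "excel, html, markdown, pdf, ppt, txt and word.",
}

# JSON format expected by `--qa-pair`: {"question1": "answer1", ...}
with open("resource/data/qa_pair.json", "w", encoding="utf-8") as f:
    json.dump(qa_pairs, f, ensure_ascii=False, indent=2)
```

Then pass it to the store module with `python3 -m huixiangdou.services.store --qa-pair resource/data/qa_pair.json`.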

III. Setup LLM API and test

Set the model and api-key in config.ini. If running the LLM locally, we recommend using vllm.

```bash
vllm serve /path/to/Qwen-2.5-7B-Instruct --enable-prefix-caching --served-model-name Qwen-2.5-7B-Instruct
```

Here is an example of a configured config.ini:

```ini
[llm.server]
remote_type = "kimi"
remote_api_key = "sk-dp3GriuhhLXnYo0KUuWbFUWWKOXXXXXXXXXX"
remote_llm_model = "auto"

# remote_type = "step"
# remote_api_key = "5CpPyYNPhQMkIzs5SYfcdbTHXq3a72H5XXXXXXXXXXXXX"
# remote_llm_model = "auto"

# remote_type = "deepseek"
# remote_api_key = "sk-86db9a205aa9422XXXXXXXXXXXXXX"
# remote_llm_model = "deepseek-chat"

# remote_type = "vllm"
# remote_api_key = "EMPTY"
# remote_llm_model = "Qwen2.5-7B-Instruct"

# remote_type = "siliconcloud"
# remote_api_key = "sk-xxxxxxxxxxxxx"
# remote_llm_model = "alibaba/Qwen1.5-110B-Chat"

# remote_type = "ppio"
# remote_api_key = "sk-xxxxxxxxxxxxx"
# remote_llm_model = "thudm/glm-4-9b-chat"
```

Then run the test:

```bash
# Respond to questions related to the Hundred-Plant Garden (covered by the
# knowledge base), but do not respond to weather questions.
python3 -m huixiangdou.main
```

```text
+--------------------------------------+------------+---------------------------------------------+-----------------+
| Query                                | State      | Reply                                       | References      |
+======================================+============+=============================================+=================+
| What is in the Hundred-Plant Garden? | success    | The Hundred-Plant Garden has a rich variety | installation.md |
|                                      |            | of natural landscapes and life...           |                 |
+--------------------------------------+------------+---------------------------------------------+-----------------+
| How is the weather today?            | Init state | ..                                          |                 |
+--------------------------------------+------------+---------------------------------------------+-----------------+

🔆 Input your question here, type `bye` for exit: ..
```

💡 You can also run a simple Web UI with gradio:

```bash
python3 -m huixiangdou.gradio_ui
```

(Demo video: output.mp4)

Or run a server listening on port 23333; the default pipeline is chat_with_repo:

```bash
python3 -m huixiangdou.api_server

# test async API
curl -X POST http://127.0.0.1:23333/huixiangdou_stream -H "Content-Type: application/json" -d '{"text": "how to install mmpose", "image": ""}'
# cURL sync API
curl -X POST http://127.0.0.1:23333/huixiangdou_inference -H "Content-Type: application/json" -d '{"text": "how to install mmpose", "image": ""}'
```
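The same two endpoints can be called from Python. Here is a minimal sketch mirroring the curl commands above (the third-party `requests` package is assumed to be installed):

```python
import requests

# Payload schema taken from the curl examples above.
payload = {"text": "how to install mmpose", "image": ""}

# Sync API: one JSON response per request.
resp = requests.post("http://127.0.0.1:23333/huixiangdou_inference", json=payload)
print(resp.json())

# Async/streaming API: consume the response incrementally.
with requests.post("http://127.0.0.1:23333/huixiangdou_stream",
                   json=payload, stream=True) as stream:
    for line in stream.iter_lines():
        if line:
            print(line.decode("utf-8"))
```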

Please update the repodir documents, good_questions, and bad_questions, and try your own domain knowledge (medical, financial, power, etc.).

IV. Integration

To Feishu, WeChat group

To web front and backend

We provide TypeScript front-end and Python back-end source code:

  • Multi-tenant management supported
  • Zero programming access to Feishu and WeChat
  • k8s friendly

This is the same as the OpenXLab app; please read the web deployment document.

To readthedocs.io

Try the bottom-right button on the page and in the document.

🍴 Other Configurations

CPU-only Edition

If there is no GPU available, model inference can be completed using the siliconcloud API.

Taking the docker miniconda + Python 3.11 image as an example, install the CPU dependencies and run:

```bash
# Start container
docker run -v /path/to/huixiangdou:/huixiangdou -p 7860:7860 -p 23333:23333 -it continuumio/miniconda3 /bin/bash
# Install dependencies
apt update
apt install python-dev libxml2-dev libxslt1-dev antiword unrtf poppler-utils pstotext tesseract-ocr flac ffmpeg lame libmad0 libsox-fmt-mp3 sox libjpeg-dev swig libpulse-dev
python3 -m pip install -r requirements-cpu.txt
# Establish knowledge base
python3 -m huixiangdou.services.store --config_path config-cpu.ini
# Q&A test
python3 -m huixiangdou.main --config_path config-cpu.ini
# gradio UI
python3 -m huixiangdou.gradio_ui --config_path config-cpu.ini
```

If you find the installation too slow, a pre-installed image is provided on Docker Hub; simply substitute it when starting the container.

10G Multimodal Edition

If you have 10 GB of GPU memory, you can additionally support image-and-text retrieval; just modify the models used in config.ini.

```ini
# config-multimodal.ini
# !!! Download `https://huggingface.co/BAAI/bge-visualized/blob/main/Visualized_m3.pth`
#     to the `bge-m3` folder !!!
embedding_model_path = "BAAI/bge-m3"
reranker_model_path = "BAAI/bge-reranker-v2-minicpm-layerwise"
```

Note: run gradio to test, and see the image-and-text retrieval results here.

```bash
python3 tests/test_query_gradio.py
```

Furthermore

Please read the following topics:

🛠️ FAQ

  1. What if the robot is too cold/too chatty?

    • Fill the questions that should be answered in the real scenario into resource/good_questions.json, and fill the ones that should be rejected into resource/bad_questions.json.
    • Adjust the theme content in repodir to ensure that the markdown documents in the main library contain no irrelevant content.

    Re-run feature_store to update the thresholds and feature libraries.

    ⚠️ You can also directly modify reject_throttle in config.ini. Generally speaking, 0.5 is a high value and 0.2 is too low; a minimal sketch of adjusting it programmatically follows below.
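    The threshold can also be tweaked without an editor. This is a minimal stdlib sketch, assuming reject_throttle lives in a `[feature_store]` section (a guess; check your own config.ini), and note that configparser re-serializes the file and drops its comments:

    ```python
    import configparser

    config = configparser.ConfigParser()
    config.read("config.ini")

    # Hypothetical section name; locate reject_throttle in your config.ini.
    section = "feature_store"
    if config.has_option(section, "reject_throttle"):
        old = config.getfloat(section, "reject_throttle")
        # Pick a value between "too low" (0.2) and "high" (0.5).
        config.set(section, "reject_throttle", "0.35")
        # Caution: configparser rewrites the file and discards comments.
        with open("config.ini", "w") as f:
            config.write(f)
        print(f"reject_throttle: {old} -> 0.35")
    ```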

  2. Launch is normal, but out of memory during runtime?

    Long-text inference with transformers-based LLMs requires more memory. In this case, apply kv cache quantization to the model (see the lmdeploy quantization description), then use docker to independently deploy the Hybrid LLM Service.

  3. No module named 'faiss.swigfaiss_avx2'

    Locate the installed faiss package:

    ```python
    import faiss
    print(faiss.__file__)
    # /root/.conda/envs/InternLM2_Huixiangdou/lib/python3.10/site-packages/faiss/__init__.py
    ```

    Add a soft link:

    ```bash
    # cd your_python_path/site-packages/faiss
    cd /root/.conda/envs/InternLM2_Huixiangdou/lib/python3.10/site-packages/faiss/
    ln -s swigfaiss.py swigfaiss_avx2.py
    ```

🍀 Acknowledgements

📝 Citation

```bibtex
@misc{kong2024huixiangdou,
      title={HuiXiangDou: Overcoming Group Chat Scenarios with LLM-based Technical Assistance},
      author={Huanjun Kong and Songyang Zhang and Jiaying Li and Min Xiao and Jun Xu and Kai Chen},
      year={2024},
      eprint={2401.08772},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

@misc{kong2024labelingsupervisedfinetuningdata,
      title={Labeling supervised fine-tuning data with the scaling law},
      author={Huanjun Kong},
      year={2024},
      eprint={2405.02817},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2405.02817},
}

@misc{kong2025huixiangdou2robustlyoptimizedgraphrag,
      title={HuixiangDou2: A Robustly Optimized GraphRAG Approach},
      author={Huanjun Kong and Zhefan Wang and Chenyang Wang and Zhe Ma and Nanqing Dong},
      year={2025},
      eprint={2503.06474},
      archivePrefix={arXiv},
      primaryClass={cs.IR},
      url={https://arxiv.org/abs/2503.06474},
}
```
