HKUDS/LightRAG

v1.4.9.11

15 Jan 18:27

411b1ce

Hot Fixed

Fix OpenAI LLM binding options not loaded from environment variables by @danielaskdd in #2585

What's New

feat(gemini): Add Vertex AI support for Gemini LLM binding by @danielaskdd in #2529
refact(gemini): Migrate Gemini LLM to native async Google GenAI client by @danielaskdd in #2531
Refact: Change DOCX extraction to use HTML tags for whitespace by @danielaskdd in #2550
feat: add Korean localization by @jhchoi1182 in #2571
Add support for mdx file type by @coldfire-x in #2566
Add i18n support for German, Ukrainian, Russian, and Japanese languages by @mlimarenko in #2547

What's Fixed

docs: fix the simple program rag init function return value in README.md by @Peefy in #2532
docs: fix the simple program rag init function return value in README-zh.md by @Peefy in #2534
feat: Implement WebUI Token Auto-Renewal (Sliding Window Expiration) by @danielaskdd in #2543
Fixes the Gemini integration example in the README by @vishvaRam in #2537
Add Gemini demo for LightRAG by @vishvaRam in #2538
Add LightRAG demo with PostgreSQL and Gemini integration by @vishvaRam in #2556
Update PostgreSQL demo script reference in README.md by @vishvaRam in #2557
Fix: Enhance PostgreSQL Reconnection Tolerance for HA Deployments by @danielaskdd in #2562
Add NEO4J_DATABASE variable to README by @vishvaRam in #2578
Bump the frontend-minor-patch group in /lightrag_webui with 2 updates by @dependabot[bot] in #2577
Add LightRAG demo script with vLLM integration by @vishvaRam in #2582

New Contributors

@Peefy made their first contribution in #2532
@vishvaRam made their first contribution in #2537
@mlimarenko made their first contribution in #2547
@jhchoi1182 made their first contribution in #2571
@coldfire-x made their first contribution in #2566

Full Changelog: v1.4.9.10...v1.4.9.11

Contributors

coldfire-x, Peefy, and 5 other contributors

v1.4.9.10

23 Dec 01:10

danielaskdd

8c8186a

What's Changed

Hot Fix AttributeError in Neo4JStorage and MemgraphStorage when using storage specified workspace env var by @danielaskdd in #2526

Full Changelog: v1.4.9.9...v1.4.9.10

Contributors

danielaskdd

v1.4.9.9

22 Dec 17:49

danielaskdd

dccf1ef

Release Note V1.4.9.9

Important Notes

Add Workspace Isolation for Pipeline Status and In-memory Storage: Multiple LightRAG instances with distinct workspaces can be created simultaneously, a significant step toward seamless workspace switching within a single LightRAG server.
Add Workspace Vector Data Isolation by Model Name and Dimension for PostgreSQL and Qdrant: Previously, LightRAG used a single collection/table for different embedding models and dimensions, which caused dimension-mismatch crashes or data pollution in multi-workspace deployments.
Dimension Selection is Supported for OpenAI and Gemini Embedding models, with a new env var introduced: EMBEDDING_SEND_DIM
Add LLM Cache Migration and LLM Query Cache Cleanup Tools Between Different KV Storage: These enable users to switch storage backends without losing cached extraction and summary data.
Enhanced DOCX Extraction with Table Content Support.
Enhanced XLSX extraction with proper handling of tab and newline characters within cells.
Fix Critical Security Vulnerability in React Server Components: #2494
Add Automatic Text Truncation Support for Embedding Functions: The OpenAI embedding function now respects the max_token_size value in EmbeddingFunc and automatically truncates input text to prevent API errors caused by texts exceeding model token limits.
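
The automatic truncation described in the last note can be pictured with a minimal sketch. This is not LightRAG's implementation: `WhitespaceTokenizer` and `truncate_texts` are hypothetical names, and a whitespace split stands in for a real model tokenizer.

```python
from dataclasses import dataclass


@dataclass
class WhitespaceTokenizer:
    """Toy tokenizer (assumption): one token per whitespace-separated word."""

    def encode(self, text: str) -> list[str]:
        return text.split()

    def decode(self, tokens: list[str]) -> str:
        return " ".join(tokens)


def truncate_texts(
    texts: list[str], tokenizer: WhitespaceTokenizer, max_token_size: int
) -> list[str]:
    """Cut every text down to at most max_token_size tokens."""
    if max_token_size <= 0:
        raise ValueError("max_token_size must be positive")
    truncated = []
    for text in texts:
        tokens = tokenizer.encode(text)
        if len(tokens) > max_token_size:
            # Only over-long inputs are rewritten; short ones pass through untouched.
            text = tokenizer.decode(tokens[:max_token_size])
        truncated.append(text)
    return truncated


if __name__ == "__main__":
    print(truncate_texts(["a b c d e", "short"], WhitespaceTokenizer(), 3))
```

Truncating before the API call trades silent loss of trailing text for avoiding a hard request failure, so a real integration would probably also log when truncation happens.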

What's Breaking (for LightRAG Core integration only)

Rename params of chunking function: If you incorporate the chunking function into LightRAG and pass parameters by name, corresponding code updates are required.

def chunking_by_token_size(
    tokenizer: Tokenizer,
    content: str,
    split_by_character: str | None = None,
    split_by_character_only: bool = False,
    chunk_overlap_token_size: int = 100,
    chunk_token_size: int = 1200,
) -> list[dict[str, Any]]:
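
To illustrate how the two renamed size parameters interact, here is a minimal sketch of token-window chunking with overlap. It is an illustration only, not the library's code: the name `sketch_chunking`, the whitespace "tokenizer", and the returned dictionary keys are assumptions, and the `split_by_character` options are left out.

```python
def sketch_chunking(
    content: str,
    chunk_overlap_token_size: int = 100,
    chunk_token_size: int = 1200,
) -> list[dict]:
    """Split content into windows of chunk_token_size tokens that overlap
    by chunk_overlap_token_size tokens (whitespace tokens, for illustration)."""
    if not 0 <= chunk_overlap_token_size < chunk_token_size:
        raise ValueError("overlap must be non-negative and smaller than the chunk size")
    tokens = content.split()
    # Each new window starts this many tokens after the previous one.
    step = chunk_token_size - chunk_overlap_token_size
    chunks = []
    for index, start in enumerate(range(0, len(tokens), step)):
        window = tokens[start:start + chunk_token_size]
        chunks.append(
            {
                "chunk_order_index": index,  # hypothetical key names
                "tokens": len(window),
                "content": " ".join(window),
            }
        )
    return chunks


if __name__ == "__main__":
    for chunk in sketch_chunking("a b c d e f g", chunk_overlap_token_size=1, chunk_token_size=4):
        print(chunk)
```

Callers that pass these arguments by keyword are the ones affected by the rename; positional callers are only affected if the parameter order changed as well.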

Inject an embedding_func with model_name to LightRAG using wrap_embedding_func_with_attrs:

@wrap_embedding_func_with_attrs(
    embedding_dim=1536, max_token_size=8192, model_name="text-embedding-3-small"
)
@retry(
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=4, max=60),
    retry=(
        retry_if_exception_type(RateLimitError)
        | retry_if_exception_type(APIConnectionError)
        | retry_if_exception_type(APITimeoutError)
    ),
)
async def embedding_func(texts: list[str]) -> np.ndarray:
    client = AzureOpenAI(
        api_key=AZURE_OPENAI_API_KEY,
        api_version=AZURE_EMBEDDING_API_VERSION,
        azure_endpoint=AZURE_OPENAI_ENDPOINT,
    )
    embedding = client.embeddings.create(model=AZURE_EMBEDDING_DEPLOYMENT, input=texts)
    embeddings = [item.embedding for item in embedding.data]
    return np.array(embeddings)

rag = LightRAG(
    working_dir=WORKING_DIR,
    llm_model_func=llm_model_func,
    embedding_func=embedding_func,
)

To ensure a seamless transition, legacy code that injects embedding_func without model_name will continue to use the original non-suffixed vector tables.

What's New

Feat: Add Chain of Thought Support for Gemini LLM by @danielaskdd in #2326
Feat: Add Optional Embedding Dimension Control with OpenAI API by @danielaskdd in #2328
Feat: Add Gemini Embedding Support to LightRAG by @danielaskdd in #2329
Feat: Add LLM Cache Migration Tool by @danielaskdd in #2330
Feat: Add LLM Query Cache Cleanup Tool by @danielaskdd in #2335
Support async chunking func to improve processing performance when a heavy chunking_func is passed in by user by @tongda in #2336
Add ollama cloud support by @LacombeLouis in #2348
Feat: Add Workspace Isolation for Pipeline Status and In-memory Storage by @danielaskdd in #2369
feat: add vchordrq vector index support for PostgreSQL by @wmsnp in #2378
Feat: Enhanced DOCX Extraction with Table Content Support by @danielaskdd in #2383
Feat: Enhance XLSX Extraction by Adding Separators and Escape Special Characters by @danielaskdd in #2386
Optimize for OpenAI Prompt Caching: Restructure entity extraction pro… by @Ghazi-raad in #2426
feat: Vector Storage Model Isolation with Automatic Migration by @BukeLy in #2391
feat: Implement Vector Database Model Isolation and Auto-Migration by @danielaskdd in #2513
feat: Add Automatic Text Truncation Support for Embedding Functions by @danielaskdd in #2523

What's Changed

Fix: Remove Duplicate Entity/Relation Tracking Deletion in adelete_by_doc_id by @danielaskdd in #2322
Fix spelling errors in the "使用PostgreSQL存储" section of README-zh.md by @huangbhan in #2327
Add dimensions parameter support to openai_embed() by @yrangana in #2323
Fix Gemini driver retry mechanism by @danielaskdd in #2331
HotFix: Restore OpenAI Streaming Response & Refactor keyword_extraction Parameter by @danielaskdd in #2334
Refactor: Migrate PDF processing dependency from pypdf2 to actively maintained pypdf by @danielaskdd in #2338
Fix: Prevent UnicodeEncodeError in JSON storage operations by @danielaskdd in #2344
Remove deprecated response_type parameter from query settings UI by @danielaskdd in #2345
Refactor: Optimize write_json for Memory Efficiency and Performance by @danielaskdd in #2346
Refact: Remove blocking dependency installation from document upload handlers by @danielaskdd in #2350
Refact: Implement Lazy Configuration Initialization for API Server by @danielaskdd in #2351
Refact: Enhance DOCLING integration with lazy loading and macOS safeguards by @danielaskdd in #2352
Fix: Robust error handling for async database operations in graph storage by @danielaskdd in #2356
Update the value corresponding to the extracted entity relationship keywords by @sleeepyin in #2358
Add macOS fork safety check for Gunicorn multi-worker mode by @danielaskdd in #2360
Refact: Add Embedding Token Limit Configuration and Improve Error Handling by @danielaskdd in #2359
Refact: Add Embedding Dimension Validation in EmbeddingFunc by @danielaskdd in #2368
test: Convert test_workspace_isolation.py to pytest style by @BukeLy in #2371
refactor(chunking): rename params and improve docstring for chunking by @EightyOliveira in #2379
Fix: Add chunk token limit validation with detailed error reporting by @danielaskdd in #2389
Fix: Remove redundant exception logging to eliminate pytest shutdown errors by @danielaskdd in #2390
issue-2394: use deployment variable instead of model for embeddings API call by @Amrit75 in #2395
Refactor: Centralize keyword_extraction parameter handling in OpenAI LLM implementations by @danielaskdd in #2401
Refact: Consolidate Azure OpenAI and OpenAI implementations by @danielaskdd in #2403
Update README.md by @chaohuang-ai in #2408
Update README.md by @chaohuang-ai in #2409
feat: create copilot-setup-steps.yml by @netbrah in #2410
Fix: Add Comprehensive Retry Mechanism for Neo4j Storage Operations by @danielaskdd in #2417
Refact: Allow API Server to Start Without Built WebUI Assets by @danielaskdd in #2418
fix: exception handling order error by @EightyOliveira in #2421
Doc: Update README examples to prevent double-wrapping of embedding functions by @danielaskdd in #2432
Fix: Add configurable model support for Jina embedding by @danielaskdd in #2433
Fix typos discovered by codespell by @cclauss in #2434
Update README.md by @chaohuang-ai in #2439
Fix KaTeX chemistry formula rendering (\ce command) not working by @danielaskdd in #2443
fix(postgres): Add CASCADE to AGE extension creation for automatic dependency resolution by @danielaskdd in #2446
Add Python 3.13 and 3.14 to the testing by @cclauss in #2436
Keep GitHub Actions up to date with GitHub's Dependabot by @cclauss in #2435
chore: optimize Dependabot configuration with dependency grouping and PR limits by @danielaskdd in #2447
...

Contributors

mccahill, tongda, and 15 other contributors

v1.4.9.8

06 Nov 13:51

danielaskdd

5bcd292

What's New

Feat: Add PDF Decryption Support for Password-Protected Files by @danielaskdd in #2296
Feat: Add optional Langfuse observability integration by @anouar-bm in #2298
Feat: Add RAGAS evaluation framework for RAG quality assessment by @anouar-bm in #2297
Feat: Add native gemini LLM support by @Humphryshikunzi in #2305

What's Changed

Refact: Auto-refresh of Popular Labels When Pipeline Completes by @danielaskdd in #2291
Fix empty context validation bug and improve naming consistency in query context building by @danielaskdd in #2295
Refact: Enhanced RAG Evaluation CLI with Two-Stage Pipeline and Improved UX by @danielaskdd in #2311
Refact: Separate Configuration of RAGAS for LLM and Embeddings by @danielaskdd in #2314
Refactor: Remove Deprecated Chunk-Based Query Methods and Improve Graph Unit Test by @danielaskdd in #2319
Fix node retrieval fail with special characters in IDs for Postgres AGE GraphStorage by @danielaskdd in #2320
Fix performance bottleneck in document deletion by @danielaskdd in #2321

New Contributors

@anouar-bm made their first contribution in #2298
@Humphryshikunzi made their first contribution in #2305

Full Changelog: v1.4.9.7...v1.4.9.8

Contributors

danielaskdd, Humphryshikunzi, and anouar-bm

v1.4.9.7

30 Oct 18:50

danielaskdd

94cdbe7

Important Notes

This update requires qdrant-client version 1.11.0 or later (due to the use of tenant indexing) when using Qdrant. Data migration may take a significant amount of time for large datasets.
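
A deployment script could verify this requirement before starting the migration. The sketch below is one hedged way to do so with the standard library only; the helper names are hypothetical, and the version parser deliberately ignores pre-release suffixes.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '1.11.0' into (1, 11, 0); stops at the first non-numeric piece."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = "1.11.0") -> bool:
    """Compare versions numerically so that 1.9.0 is older than 1.11.0."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))  # pad so '1.11' equals '1.11.0'
    b += (0,) * (width - len(b))
    return a >= b


def qdrant_client_is_recent_enough() -> bool:
    """Return False when qdrant-client is missing or older than 1.11.0."""
    try:
        return meets_minimum(version("qdrant-client"))
    except PackageNotFoundError:
        return False


if __name__ == "__main__":
    print(qdrant_client_is_recent_enough())
```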

What's Changed

Refactor: Qdrant Multi-tenancy with Payload-Based Partitioning by @Anush008 in #2247
Refact: Enhance Property editing UI for KG Nodes by @danielaskdd in #2287
Fix: Add PyCryptodome dependency for encrypted PDF processing by @danielaskdd in #2289
Fix: Clean Residual Edges from VDB During Entity Deletion by @danielaskdd in #2290

New Contributors

@Anush008 made their first contribution in #2247

Full Changelog: v1.4.9.6...v1.4.9.7

Contributors

danielaskdd and Anush008

v1.4.9.6 Hotfix

30 Oct 02:54

danielaskdd

8145201

What's Changed

Restore query generation example and fix README path reference by @danielaskdd in #2281
Refact: Graceful shutdown and signal handling in Gunicorn Mode by @danielaskdd in #2280
HotFix: Include swagger-docs static files in package distribution by @danielaskdd in #2284

Full Changelog: v1.4.9.5...v1.4.9.6

Contributors

danielaskdd

v1.4.9.5

29 Oct 01:20

danielaskdd

ec79727

Important Notes

🚀 Fixes the PostgreSQL migration performance problem for large datasets in v1.4.9.4
🛑 Introduces a graceful pipeline cancellation mechanism for document processing operations
✏️ Enables entity merging when renaming an entity to a target entity that already exists
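
As a rough picture of what "graceful" cancellation can mean, the sketch below checks a shared flag between documents so that in-flight work finishes and completed results are kept. It is a generic asyncio pattern, not LightRAG's pipeline code; `run_pipeline`, `handle`, and `demo` are hypothetical names.

```python
import asyncio
from collections.abc import Awaitable, Callable


async def run_pipeline(
    docs: list[str],
    handle: Callable[[str], Awaitable[None]],
    cancel_requested: asyncio.Event,
) -> list[str]:
    """Process documents in order, stopping cleanly once cancellation is requested."""
    processed = []
    for doc in docs:
        if cancel_requested.is_set():
            break  # keep what is already done, skip the remaining documents
        await handle(doc)
        processed.append(doc)
    return processed


async def demo() -> list[str]:
    cancel_requested = asyncio.Event()

    async def handle(doc: str) -> None:
        await asyncio.sleep(0)  # stand-in for real extraction work
        if doc == "doc-2":
            cancel_requested.set()  # simulates a user cancelling mid-run

    return await run_pipeline(["doc-1", "doc-2", "doc-3"], handle, cancel_requested)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Checking the flag only at document boundaries is what separates this from `Task.cancel()`, which can interrupt a document halfway through and leave partial state behind.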

What's New

Feat: Add Pipeline Cancellation Feature with Enhanced Reliability by @danielaskdd in #2258
Allow users to provide keywords with QueryRequest by @Mobious in #2253
Refact: Add offline Swagger UI support with custom static file serving by @danielaskdd in
Refactor: Enhanced Entity Merging with Chunk Tracking by @danielaskdd in #2266

What's Fixed

Fix: PostgreSQL Data Migration Performance Problem by @danielaskdd in #2259
Fix: Ensure Storage Consistency When Creating Implicit Nodes from Relationships by @danielaskdd in #2262
Refactor: Enhance KG Editing with Chunk Tracking by @danielaskdd in #2265
#2273
Update redis requirement from <7.0.0,>=5.0.0 to >=5.0.0,<8.0.0 by @dependabot[bot] in #2272
Fix Entity Source IDs Tracking Problem During Relationship Processing by @danielaskdd in #2279

New Contributors

@Mobious made their first contribution in #2253

Full Changelog: v1.4.9.4...v1.4.9.5

Contributors

Mobious, dependabot, and danielaskdd

v1.4.9.4

22 Oct 15:52

danielaskdd

fdf0fe0

Important Notes: Eliminate Bottlenecks in Processing Large-scale Datasets

In production deployments, entity and relation metadata can grow unbounded as documents are continuously ingested. The source_id (chunk IDs) and file_path fields in entities and relations can accumulate thousands of entries, leading to:

Performance degradation in vector database operations
Increased storage costs
Memory pressure during query operations
Slower merge operations when processing new documents

LightRAG implements a configurable metadata size control system with two key features:

Source ID limiting: Controls the maximum number of chunk IDs stored per entity/relation
File path limiting: Controls the maximum number of file paths displayed in metadata (display-only, doesn't affect query performance)

Both features support two strategies:

FIFO (First In First Out): Removes the oldest entries when the limit is reached. Best for evolving knowledge bases, since it keeps the most recent information.
KEEP: Keeps the oldest entries and skips new ones when the limit is reached. Best for stable knowledge bases, and faster because it performs fewer merge operations.

New environment variables with default values:

# Source ID limits (affects query performance)
MAX_SOURCE_IDS_PER_ENTITY=300
MAX_SOURCE_IDS_PER_RELATION=300
SOURCE_IDS_LIMIT_METHOD=FIFO

# File path limits (display only)
MAX_FILE_PATHS=100
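
The difference between the two strategies is easiest to see on a small list. The following is a minimal sketch under stated assumptions, not LightRAG's merge logic: `limit_source_ids` and `load_entity_limit_settings` are hypothetical helpers, and only the environment variable names and defaults come from the notes above.

```python
import os


def limit_source_ids(
    existing: list[str], incoming: list[str], limit: int, method: str = "FIFO"
) -> list[str]:
    """Merge chunk IDs (oldest first) and cap the result at `limit` entries."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    # De-duplicate while preserving first-seen order.
    merged = list(dict.fromkeys([*existing, *incoming]))
    if len(merged) <= limit:
        return merged
    method = method.upper()
    if method == "FIFO":
        return merged[-limit:]  # drop the oldest entries
    if method == "KEEP":
        return merged[:limit]  # keep the oldest entries, skip new ones
    raise ValueError(f"unknown limit method: {method}")


def load_entity_limit_settings() -> tuple[int, str]:
    """Read the entity limit and strategy, falling back to the documented defaults."""
    limit = int(os.getenv("MAX_SOURCE_IDS_PER_ENTITY", "300"))
    method = os.getenv("SOURCE_IDS_LIMIT_METHOD", "FIFO")
    return limit, method


if __name__ == "__main__":
    old, new = ["c1", "c2", "c3"], ["c4", "c5"]
    print(limit_source_ids(old, new, limit=3, method="FIFO"))
    print(limit_source_ids(old, new, limit=3, method="KEEP"))
```

With a limit of 3, FIFO ends up with the three newest IDs while KEEP leaves the original three untouched, which is why KEEP can skip the merge work once an entity is full.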

Auto Data Migration

Upgrading to this version requires data migration. If your current system contains a large number of entity relationships, the upgrade process may take an extended period of time.

What's New

Feat: Add offline Docker build support with embedded models and cache by @danielaskdd in #2222
Refact: Limit Vector Database Metadata Size to Support Large Scale Dataset by @danielaskdd in #2240
Feat: Add Optional LLM Cache Deletion for Document Deletion by @danielaskdd in #2244
Refact: Add Entity Identifier Length Truncation to Prevent Storage Failures by @danielaskdd in #2245
Refact: Add Multimodal Processing Status Support to DocProcessingStatus for RAG-Anything Compatibility by @danielaskdd in #2248

What's Changed

Refact: Improve query result with semantic null returns by @danielaskdd in #2218
remove deprecated dotenv package. by @wkpark in #2229
Refact: Frontend UI Fixes and Performance Improvements by @danielaskdd in #2234
Security: Fix SQL injection vulnerabilities in PostgreSQL storage by @lucky-verma in #2235
Update openai requirement from <2.0.0,>=1.0.0 to >=1.0.0,<3.0.0 by @dependabot[bot] in #2238
Update pandas requirement from <2.3.0,>=2.0.0 to >=2.0.0,<2.4.0 by @dependabot[bot] in #2239
Optimize PostgreSQL initialization performance by @yrangana in #2237
fix(docs): correct typo "acivate" → "activate" by @xiaojunxiang2023 in #2243

New Contributors

@wkpark made their first contribution in #2229
@lucky-verma made their first contribution in #2235
@dependabot[bot] made their first contribution in #2238
@xiaojunxiang2023 made their first contribution in #2243

Full Changelog: v1.4.9.3...v1.4.9.4

Contributors

wkpark, dependabot, and 4 other contributors

v1.4.9.3

14 Oct 07:06

danielaskdd

965d8b1

Important Notes

A temporary solution is implemented to ensure compatibility with the newly introduced document status in RAG-Anything.
Frontend build artifacts are now removed from the git repo; a manual build is required after cloning or pulling the repo.

What's Changed

Refactor: WebUI Optimization and Simplification by @danielaskdd in #2198
i18n: fix mustache brackets by @zl7261 in #2196
fix: advise excluding dev dependencies in prod build by @kevinnkansah in #2201
Exclude Frontend Build Artifacts from Git Repository by @danielaskdd in #2208
Add PREPROCESSED (multimodal_processed) status for multimodal document processing by @danielaskdd in #2211

Full Changelog: v1.4.9.2...v1.4.9.3

Contributors

zl7261, danielaskdd, and kevinnkansah

v1.4.9.2

11 Oct 05:52

danielaskdd

fbcc35b

What's New

feat: Add endpoint and UI to retry failed documents by @RooseveltAdvisors in #2168
Refactor(webui): Improve document tooltip display with track ID and better formatting by @danielaskdd in #2170
feat: add options for Postgres connection by @kevinnkansah in #2172
feat: Add token tracking support to openai_embed function by @yrangana in #2181
Add knowledge graph manipulation endpoints by @NeelM0906 in #2183
Feat: Add Comprehensive Offline Deployment Solution by @danielaskdd in #2194

What's Fixed

Fix: Add file_path field to full_docs storage by @danielaskdd in #2171
Fixed typo in log message when creating new graph file by @aleksvujic in #2178
Fixed: Add PostgreSQL Connection Retry Mechanism with Network Robustness by @danielaskdd in #2192
Adding support for imagePullSecrets, envFrom, and deployment strategy in Helm chart by @tcyran in #2175
Hotfix: Preserve ordering in get_by_ids methods across all storage implementations by @danielaskdd in #2195
Update Web Dependencies by @kevinnkansah in #2193

New Contributors

@RooseveltAdvisors made their first contribution in #2168
@aleksvujic made their first contribution in #2178
@kevinnkansah made their first contribution in #2172
@yrangana made their first contribution in #2181
@NeelM0906 made their first contribution in #2183
@tcyran made their first contribution in #2175

Full Changelog: v1.4.9.1...v1.4.9.2

Contributors

RooseveltAdvisors, aleksvujic, and 5 other contributors
