Feat: Add PDF Decryption Support for Password-Protected Files #2296

Merged
danielaskdd merged 1 commit into HKUDS:main from danielaskdd:pdf-decryption
Nov 1, 2025

@danielaskdd
Collaborator

Feat: Add PDF Decryption Support for Password-Protected Files

Summary

This PR adds support for processing password-protected PDF files in the document processing pipeline. Users can now decrypt and process encrypted PDFs by setting a password in the environment configuration.

Motivation

Previously, the system failed to process encrypted PDF files without any clear indication of what went wrong. This enhancement lets users work with password-protected documents, which are common in enterprise and academic environments where sensitive documents are often encrypted.

Changes Made

1. Configuration Management (lightrag/api/config.py)

  • Added pdf_decrypt_password parameter to global_args
  • Reads from PDF_DECRYPT_PASSWORD environment variable
  • Defaults to None if not set
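The configuration step described above can be sketched roughly as follows. This is only an illustration, not the code in lightrag/api/config.py: the helper name load_pdf_settings and the use of SimpleNamespace in place of the project's real global_args object are assumptions.

```python
import os
from types import SimpleNamespace


def load_pdf_settings(env=None):
    """Return a namespace holding the optional PDF password (hypothetical helper)."""
    env = os.environ if env is None else env
    # Mapping.get returns None when the variable is unset, which matches
    # the "defaults to None" behavior described in the PR.
    return SimpleNamespace(pdf_decrypt_password=env.get("PDF_DECRYPT_PASSWORD"))


# Stand-in for the project's global_args object (assumed shape).
global_args = load_pdf_settings()
```

Passing the environment as a parameter keeps the helper easy to exercise in tests without mutating os.environ.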

2. Document Processing (lightrag/api/routers/document_routes.py)

  • Modified pipeline_enqueue_file function to detect encrypted PDFs
  • Implemented decryption logic using PyPDF2's decrypt() method
  • Added comprehensive error handling for three scenarios:
    • No password provided: Clear error message directing users to set PDF_DECRYPT_PASSWORD
    • Incorrect password: Friendly error indicating the password is wrong
    • Decryption failure: Detailed error with exception information
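The three error scenarios could be handled along the lines below. This is a minimal sketch under stated assumptions, not the code merged into pipeline_enqueue_file: EncryptedPdfError, ReaderLike, and ensure_decrypted are hypothetical names, and treating a falsy decrypt() result as a rejected password is an assumption about the reader's return value. The function accepts any object exposing is_encrypted and decrypt(), which are the two members of PyPDF2's PdfReader that the PR relies on.

```python
from typing import Optional, Protocol


class EncryptedPdfError(Exception):
    """Hypothetical error type: a short reason plus a longer hint."""

    def __init__(self, reason: str, hint: str):
        super().__init__(f"{reason} - {hint}")
        self.reason = reason


class ReaderLike(Protocol):
    """Minimal reader interface assumed by this sketch."""

    is_encrypted: bool

    def decrypt(self, password: str): ...


def ensure_decrypted(reader: ReaderLike, password: Optional[str]) -> None:
    """Decrypt the reader in place, or raise with one of three messages."""
    if not reader.is_encrypted:
        return  # Unencrypted PDFs pass through unchanged.
    if not password:
        raise EncryptedPdfError(
            "PDF is encrypted but no password provided",
            "Please set PDF_DECRYPT_PASSWORD environment variable",
        )
    try:
        result = reader.decrypt(password)
    except Exception as exc:
        raise EncryptedPdfError(
            "PDF decryption failed", f"Error during PDF decryption: {exc}"
        ) from exc
    # Assumption: a falsy result means the password was not accepted.
    if not result:
        raise EncryptedPdfError(
            "Failed to decrypt PDF - incorrect password",
            "The provided PDF_DECRYPT_PASSWORD is incorrect for this file",
        )
```

Keeping the check separate from text extraction means the caller can run it once right after constructing the reader and before reading any pages.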

3. Documentation (env.example)

  • Added PDF_DECRYPT_PASSWORD configuration example
  • Included clear comments explaining the feature

Usage

Configuration

Add to your .env file:

# PDF decryption password for protected PDF files
PDF_DECRYPT_PASSWORD=your_password_here

Behavior

  • Unencrypted PDFs: Process normally (no change in behavior)
  • Encrypted PDFs with password set: Automatically decrypt and process
  • Encrypted PDFs without password: Fail gracefully with helpful error message
  • Encrypted PDFs with wrong password: Fail with clear indication of incorrect password

Error Messages

All error messages are user-friendly and appear in English:

  • "PDF is encrypted but no password provided - Please set PDF_DECRYPT_PASSWORD environment variable"
  • "Failed to decrypt PDF - incorrect password - The provided PDF_DECRYPT_PASSWORD is incorrect for this file"
  • "PDF decryption failed - Error during PDF decryption: [details]"

Technical Notes

  • Only affects PyPDF2 processing engine (DEFAULT mode)
  • DOCLING mode is unchanged
  • Password is accessed via global_args.pdf_decrypt_password for consistency with other configuration
  • Backward compatible - no breaking changes for existing deployments

Testing Recommendations

  1. Test with unencrypted PDF - should work as before
  2. Test with encrypted PDF without password set - should show friendly error
  3. Test with encrypted PDF and correct password - should decrypt and process successfully
  4. Test with encrypted PDF and incorrect password - should show password error
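The four recommended cases can be written as a small table-driven check. The sketch below is one possible way to do it, not part of the PR: FakeReader, CASES, and run_cases are hypothetical names, and handler stands for whatever function wraps the decryption step, assumed here to take a reader and a password and to raise on failure. Expected outcomes are matched against substrings of the error messages listed in the PR.

```python
from typing import Callable, List, Optional


class FakeReader:
    """Test double exposing the two members the decryption step relies on."""

    def __init__(self, encrypted: bool, real_password: Optional[str] = None):
        self.is_encrypted = encrypted
        self._real_password = real_password

    def decrypt(self, password: str) -> int:
        # Return 0 when the password is rejected, 1 when it is accepted.
        return 1 if password == self._real_password else 0


# (label, reader, configured password, substring expected in the error or None)
CASES = [
    ("unencrypted", FakeReader(False), None, None),
    ("encrypted, no password", FakeReader(True, "s3cret"), None,
     "no password provided"),
    ("encrypted, correct password", FakeReader(True, "s3cret"), "s3cret", None),
    ("encrypted, wrong password", FakeReader(True, "s3cret"), "nope",
     "incorrect password"),
]


def run_cases(handler: Callable[[FakeReader, Optional[str]], None]) -> List[str]:
    """Return the labels of cases whose outcome differs from the expectation."""
    failures = []
    for label, reader, password, expected in CASES:
        try:
            handler(reader, password)
            error = None
        except Exception as exc:
            error = str(exc)
        if expected is None:
            ok = error is None
        else:
            ok = error is not None and expected in error
        if not ok:
            failures.append(label)
    return failures
```

An empty list from run_cases means all four scenarios behaved as described. A check against real files would additionally need an encrypted PDF fixture, which this sketch does not cover.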

Checklist

  • Code follows project style guidelines
  • Comments are in English
  • Environment variable documented in env.example
  • Error handling is comprehensive and user-friendly
  • Backward compatible with existing functionality
  • Configuration managed through global_args pattern

  • Add PDF_DECRYPT_PASSWORD env variable
  • Check encryption status before reading
  • Handle decrypt errors gracefully
  • Log detailed error messages
  • Support both encrypted/plain PDFs

danielaskdd merged commit ece0398 into HKUDS:main on Nov 1, 2025
1 check passed
danielaskdd deleted the pdf-decryption branch on November 1, 2025 07:24