arkeodev/image-text-extractor
An OCR application that extracts text from images.
Features:
- Extract text from uploaded images
- Process multiple image formats (PNG, JPG, JPEG, GIF, WEBP)
- User-friendly Streamlit interface
- RESTful API endpoints
- Integration with Langchain for advanced text processing
- Together AI Vision model integration
Requirements:
- Python 3.12 or higher
- Poetry package manager
- Together AI API key
Installation:
1. Clone the repository:
git clone https://github.com/yourusername/ImageTextExtractor.git
cd ImageTextExtractor
2. Install dependencies using Poetry:
poetry install
Usage:
1. Start the FastAPI backend:
poetry run python main.py
2. In a new terminal, launch the Streamlit interface:
poetry run streamlit run ui.py
3. Open your browser and navigate to http://localhost:8501
4. Enter your Together AI API key
5. Upload an image and wait for the results
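The steps above start two services, so it can help to confirm both are listening before opening the browser. The following is a minimal sketch, not part of the project: the helper names `port_open` and `wait_for_port` are hypothetical, and the ports (8000 for the API, 8501 for Streamlit) are taken from the URLs in this README.

```python
import socket
import time


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


def wait_for_port(host: str, port: int, attempts: int = 30, delay: float = 1.0) -> bool:
    """Poll until the port accepts connections or the attempts run out."""
    for _ in range(attempts):
        if port_open(host, port):
            return True
        time.sleep(delay)
    return False
```

For example, `wait_for_port("localhost", 8000)` could be called after starting the backend, and `wait_for_port("localhost", 8501)` after launching the Streamlit interface. A successful TCP connection only shows that something is listening on the port, not that the application is fully initialized.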
The application exposes a REST API endpoint for OCR processing.
Request:
- URL: http://localhost:8000/ocr
- Method: POST
- Content-Type: multipart/form-data
Parameters:
- file: Image file (supported formats: PNG, JPG, JPEG, GIF, WEBP)
- api_key: Together AI API key
- system_prompt: (Optional) Custom prompt for the vision model
Example using curl:
curl -X POST http://localhost:8000/ocr \
  -F "file=@/path/to/your/image.jpg" \
  -F "api_key=your_together_ai_api_key" \
  -F "system_prompt=Convert the provided image into text"
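The same request can be sketched in Python using only the standard library. This is an illustrative client, not code from the repository: the function names `encode_multipart`, `build_ocr_request`, and `extract_text` are hypothetical, while the URL and the form field names (`file`, `api_key`, `system_prompt`) come from the request description above. The response format is not specified in this README, so the sketch returns the raw body as text.

```python
import io
import mimetypes
import urllib.request
import uuid
from pathlib import Path
from typing import Optional

OCR_URL = "http://localhost:8000/ocr"  # endpoint documented in this README


def encode_multipart(fields: dict, file_field: str,
                     filename: str, file_bytes: bytes) -> tuple:
    """Encode text fields plus one file as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    buf = io.BytesIO()
    for name, value in fields.items():
        buf.write(f"--{boundary}\r\n".encode())
        buf.write(f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode())
        buf.write(value.encode("utf-8") + b"\r\n")
    # Guess the MIME type from the file name; fall back to a generic type.
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    buf.write(f"--{boundary}\r\n".encode())
    buf.write(
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\n'.encode()
    )
    buf.write(f"Content-Type: {mime}\r\n\r\n".encode())
    buf.write(file_bytes + b"\r\n")
    buf.write(f"--{boundary}--\r\n".encode())
    return buf.getvalue(), f"multipart/form-data; boundary={boundary}"


def build_ocr_request(image_path: str, api_key: str,
                      system_prompt: Optional[str] = None,
                      url: str = OCR_URL) -> urllib.request.Request:
    """Build (but do not send) the POST request for the /ocr endpoint."""
    fields = {"api_key": api_key}
    if system_prompt is not None:
        fields["system_prompt"] = system_prompt  # optional parameter
    path = Path(image_path)
    body, content_type = encode_multipart(fields, "file", path.name, path.read_bytes())
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )


def extract_text(image_path: str, api_key: str,
                 system_prompt: Optional[str] = None) -> str:
    """Send the request and return the raw response body as text."""
    req = build_ocr_request(image_path, api_key, system_prompt)
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")
```

With the backend running, `extract_text("/path/to/your/image.jpg", "your_together_ai_api_key")` would send the request. Building the request separately from sending it keeps the encoding step easy to inspect without a live server.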
Response: the text extracted from the image.
To run the tests:
poetry run pytest
The application uses the following configuration values (defined in config.py):
- LOGGING_LEVEL: Default is "INFO"
- SUPPORTED_IMAGE_TYPES: [".png", ".jpg", ".jpeg", ".gif", ".webp"]
- TOGETHER_MODEL_NAME: "meta-llama/Llama-3.2-11B-Vision-Instruct-Turbo"
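As a rough illustration of how these values might be laid out and used, here is a minimal sketch. The three constants mirror the values listed above; the actual config.py may define them differently, and the helper `is_supported_image` is a hypothetical addition that does not come from the repository.

```python
from pathlib import Path

# Values as listed in this README; the real config.py may differ.
LOGGING_LEVEL = "INFO"
SUPPORTED_IMAGE_TYPES = [".png", ".jpg", ".jpeg", ".gif", ".webp"]
TOGETHER_MODEL_NAME = "meta-llama/Llama-3.2-11B-Vision-Instruct-Turbo"


def is_supported_image(filename: str) -> bool:
    """Hypothetical helper: case-insensitive check of the file extension."""
    return Path(filename).suffix.lower() in SUPPORTED_IMAGE_TYPES
```

A check like `is_supported_image("scan.JPG")` could be used to reject unsupported uploads before they reach the vision model. It inspects only the file name, not the file contents.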
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments:
- Together AI for providing the vision model
- Langchain for the AI integration framework
- Streamlit for the user interface
- FastAPI for the REST API implementation