florence-2
Here are 29 public repositories matching this topic...
Streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL.
- Updated
Mar 17, 2025 - Python
Tag manager and captioner for image datasets
- Updated
Feb 22, 2025 - Python
AI-Powered Watermark Remover using Florence-2 and LaMA Models: A Python application leveraging state-of-the-art deep learning models to effectively remove watermarks from images with a user-friendly PyQt6 interface.
- Updated
Jan 15, 2025 - Python
Use Segment Anything 2, grounded with Florence-2, to auto-label data for use in training vision models.
- Updated
Aug 7, 2024 - Python
VLM-driven tool that processes surveillance videos, extracts frames, and generates annotations using a fine-tuned Florence-2 Vision-Language Model. Includes a Gradio-based interface for querying and analyzing video footage.
- Updated
Sep 17, 2024 - Python
Rem-WM, a powerful watermark remover tool that leverages the capabilities of Microsoft Florence and Lama Cleaner models.
- Updated
Jan 28, 2025 - Python
Use Florence 2 to auto-label data for use in training fine-tuned object detection models.
- Updated
Aug 15, 2024 - Python
A Python-based CLI tool for captioning images with WD series, Joy-caption-pre-alpha, Meta Llama 3.2 Vision Instruct, and Qwen2 VL Instruct models.
- Updated
Mar 18, 2025 - Python
Florence-2 is a novel vision foundation model with a unified, prompt-based representation for a variety of computer vision and vision-language tasks.
- Updated
Jul 3, 2024 - Jupyter Notebook
Vision-language model fine-tuning notebooks and use cases (PaliGemma, Florence, ...)
- Updated
Sep 26, 2024 - Jupyter Notebook
Local LLM Discord Bot
- Updated
Mar 16, 2025 - Python
Run SOTA Vision-Language Model Florence-2 on your data!
- Updated
Jun 29, 2024 - Python
Simple Video Summarization using Text-to-Segment Anything (Florence2 + SAM2). This project provides a video processing tool that uses Florence2 and SAM2 to detect and segment specific objects or activities in a video based on textual descriptions.
- Updated
Feb 20, 2025 - Python
Simple Gradio application integrated with Hugging Face multimodal models to support a visual question answering chatbot and other features.
- Updated
Aug 16, 2024 - Python
TextSnap: Demo for Florence 2 model used in OCR tasks to extract and visualize text from images.
- Updated
Nov 20, 2024 - Python
The Ultimate Local LLM Discord Bot!!!
- Updated
Dec 6, 2024 - Python
ecko-cli is a simple CLI tool that processes the images in a directory, generates captions, and saves them as text files. It can also create a JSONL file from the images in a directory you specify. Images are captioned using the Microsoft Florence-2-large model and ONNX.
- Updated
Nov 12, 2024 - Python
Sample of Florence-2, Microsoft's lightweight VLM, running on Colaboratory
- Updated
Aug 30, 2024 - Jupyter Notebook
The Power of Florence-2 with OpenVINO & FiftyOne: Real-World Applications in Image Analysis
- Updated
Sep 10, 2024 - Python