florence-2

Here are 29 public repositories matching this topic...

Languages: All (29), Python (21), Jupyter Notebook (8)

roboflow/maestro

Stars: 2.5k

Streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL.

transformers vqa objectdetection captioning fine-tuning multimodal vision-and-language phi-3-vision paligemma florence-2 qwen2-vl

Updated Mar 17, 2025
Python

jhc13/taggui

Stars: 923

Tag manager and captioner for image datasets

image-captioning image-tagging tag-manager pyside6 stable-diffusion llava cogvlm florence-2

Updated Feb 22, 2025
Python

AI-Powered Watermark Remover using Florence-2 and LaMA Models: A Python application leveraging state-of-the-art deep learning models to effectively remove watermarks from images with a user-friendly PyQt6 interface.

dataset-creation inpainting watermark-remover lama-cleaner florence-2

Updated Jan 15, 2025
Python

autodistill/autodistill-grounded-sam-2

Stars: 116

Use Segment Anything 2, grounded with Florence-2, to auto-label data for use in training vision models.

grounded-sam autodistill florence-2 segment-anything-2

Updated Aug 7, 2024
Python

Ravi-Teja-konda/Surveillance_Video_Summarizer

Stars: 105

VLM-driven tool that processes surveillance videos, extracts frames, and generates annotations using a fine-tuned Florence-2 vision-language model. Includes a Gradio-based interface for querying and analyzing video footage.

video ai summarization gradio vlm vision-and-language huggingface surviellance gpt-4 chatgpt gradio-python-llm florence-2

Updated Sep 17, 2024
Python

Damarcreative/rem-wm

Stars: 69

Rem-WM is a watermark remover tool that uses the Microsoft Florence and Lama Cleaner models.

watermark lama-cleaner florence-2

Updated Jan 28, 2025
Python

autodistill/autodistill-florence-2

Stars: 62

Use Florence 2 to auto-label data for use in training fine-tuned object detection models.

object-detection zero-shot-object-detection autodistill florence-2

Updated Aug 15, 2024
Python

retkowsky/florence-2

Stars: 61

Florence-2

azure florence-2

Updated Feb 13, 2025
Jupyter Notebook

fireicewolf/wd-llm-caption-cli

Stars: 33

A Python-based CLI tool for captioning images with WD series, Joy-caption-pre-alpha, Meta Llama 3.2 Vision Instruct, and Qwen2 VL Instruct models.

image-caption wd14 llama3-vision florence-2 qwen2-vl joy-caption

Updated Mar 18, 2025
Python

ANYANTUDRE/Florence-2-Vision-Language-Model

Stars: 32

Florence-2 is a novel vision foundation model with a unified, prompt-based representation for a variety of computer vision and vision-language tasks.

computer-vision deep-learning huggingface vision-language vision-transformer vision-transformer-models vision-language-model florence-2

Updated Jul 3, 2024
Jupyter Notebook

sayedmohamedscu/Vision-language-models-VLM

Stars: 19

Vision-language model fine-tuning notebooks and use cases (PaliGemma, Florence, ...)

computer-vision vlm florence finetuning multimodal colab-notebook finetune-llms paligemma florence-2 visionlanguage florence-finetuning

Updated Sep 26, 2024
Jupyter Notebook

Iteranya/AktivaAI

Stars: 9

Local LLM Discord Bot

ai chatbot discord-bot roleplay llama florence multimodal koboldcpp florence-2

Updated Mar 16, 2025
Python

jacobmarks/fiftyone_florence2_plugin

Stars: 9

Run SOTA Vision-Language Model Florence-2 on your data!

computer-vision ml transformer datacentric fiftyone-datasets vision-language-model florence-2

Updated Jun 29, 2024
Python

mithunparab/text2segment_video

Stars: 8

Simple video summarization using Text-to-Segment Anything (Florence2 + SAM2). This project provides a video processing tool that uses Florence2 and SAM2 to detect and segment specific objects or activities in a video based on textual descriptions.

raft video-summarization optical-flow segment-anything florence-2 sam2

Updated Feb 20, 2025
Python

nguyennpa412/simple-multimodal-ai

Stars: 5

Simple Gradio application integrated with Hugging Face multimodal models to support a visual question answering chatbot and other features.

docker text-to-speech computer-vision gradio vlm visual-question-answering llm mllm vision-foundation-model image-text-to-text florence-2 xtts-v2 mini-internvl

Updated Aug 16, 2024
Python

sitamgithub-MSIT/TextSnap

Stars: 4

TextSnap: Demo for Florence 2 model used in OCR tasks to extract and visualize text from images.

python artificial-intelligence optical-character-recognition gradio ocr-text-reader huggingface-transformers gradio-interface huggingface-spaces vision-language-model florence-2

Updated Nov 20, 2024
Python

Ambruk-chan/DiscordBot

Stars: 4

The Ultimate Local LLM Discord Bot!!!

ai discord-bot roleplay llm koboldcpp gbnf florence-2

Updated Dec 6, 2024
Python

regiellis/ecko-cli

Stars: 3

ecko-cli is a simple CLI tool that processes the images in a directory, generates captions, and saves them as text files. It can also create a JSONL file from the images in the directory you specify. Images are captioned using the Microsoft Florence-2-large model and ONNX

cli ai image-processing image-classification onnxruntime huggingface-transformers generative-ai ecko florence-2 ecko-cli