🤗 Hugging Face • 𝕏 Follow me on X • 💻 Kaggle • 📙 Multimodal Outpost • ⚡ Gist
Hi, I'm a Machine Learning Engineer, Hugging Face Fellow ML 🤗, and Computer Vision Enthusiast.
- FLUX-LoRA-DLC: FLUX.1-dev diffusion model with 255+ community LoRAs, 1.09K+ likes, 70K+ runs. [Collection]
- Multimodal-OCR: OCR for images and videos using state-of-the-art vision-language models, 40K+ runs, 90K+ visits. [Collection]
- Multimodal-VLM-Thinking: VLMs for captioning, OCR, reasoning, and multimodal tasks, 2.06K+ runs, 11.2K+ visits. [Collection]
- Qwen3-VL-Outpost: VLM for image & video understanding with multilingual support, 6.2K+ runs, 49.9K+ visits. [Collection]
- Flux Realism: Hyper-realistic image generation with FLUX.1-dev and Super Realism LoRA, 39.5K+ runs, 127.7K+ visits. [Collection]
- Nano-Banana-AIO: Minimalistic Gemini API app to experience Google’s NanoBanana functionalities. [Collection]
- Gliese-OCR-7B-Post1.0: Enhanced document retrieval, content extraction, and analysis, built on Camel-Doc-OCR-062825. [Collection]
- DeepCaption-VLA-7B: Generates precise, descriptive image captions highlighting visual properties, object attributes. [Collection]
- Camel-Doc-OCR: Document retrieval, content extraction, and analysis. (v2 080125) [Collection]
- SigLIP2-0.1B-DownStream: Domain-specific image classification models fine-tuned from siglip2 for multi-label tasks. [Base]
- Lumian2-VLR-7B: VLM for fine-grained multimodal reasoning, image/video captioning, and document comprehension with explainable step-by-step reasoning. [Demo]
- Galactic-Qwen-14B: Top mid-range 14B model, ranked 59th, overall score 43.56. [Leaderboard]
- Gauss-Opus-14B: Strong in math, ranked 356th, MATH Level 5 score 57.55. [Leaderboard]
- Sombrero-Opus-14B: All-rounder mid-range 14B, ranked 104th, score 42.32. [Leaderboard]
- Dinobot-Opus-14B: IFEval score 82.40, ranked 132nd, overall 41.77. [Leaderboard]
- Qwen2-VL-OCR-2B: Edge-device VLM for handwriting, LaTeX, bills, and receipts, 250k+ downloads. [Run Demo]
- Stranger Vision: Community for model modification and experimentation, < 1K downloads. [Collection]
- Stranger Zone: Illustration adapters for diffusion models, 2M+ downloads. [Collection]
- Stranger Guard: Image safety-guard models, 10k+ downloads. [Collection]
- Stranger Operations: Model Fostering, Operations, and Cycle
- Stranger Tools: Tools, Wheels, Fun
Pinned
- Multimodal-Outpost-Notebooks (Public): This repository contains a curated collection of notebooks for implementing state-of-the-art multimodal Vision-Language Models (VLMs).
- FineTuning-SigLIP-2 (Public): Fine-tuning SigLIP 2, a vision-language encoder model, for single-label and multi-label image classification.
- OCR-ReportLab-Notebooks (Public): Dedicated Colab notebooks for experimenting with OCR models (Nanonets OCR, Monkey OCR, OCRFlux 3B, Typhoo OCR 3B, and more) on a free-tier T4 GPU.
- Flux-LoRA-DLC (Public): Experience the power of the FLUX.1-dev diffusion model combined with a massive collection of 255+ community-created LoRAs! This Gradio application provides an easy-to-use interface to explore diver…
- Qwen-Image-Edit-2509-LoRAs-Fast (Public): A high-performance, user-friendly web application built with Gradio that leverages the advanced Qwen/Qwen-Image-Edit-2509 model from Hugging Face for seamless ima…
- FLUX-REALISM (Public): A Gradio-based web application for generating hyper-realistic images using FLUX.1-dev with Super Realism LoRA enhancement. This application provides an intuitive interface for creating high-quality…





