image-captioning
Here are 878 public repositories matching this topic...
LAVIS - A One-stop Library for Language-Vision Intelligence
- Updated
Nov 18, 2024 - Jupyter Notebook
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
- Updated
Aug 5, 2024 - Jupyter Notebook
InternGPT (iGPT) is an open-source demo platform where you can easily showcase your AI models. It now supports DragGAN, ChatGPT, ImageBind, multimodal chat like GPT-4, SAM, interactive image editing, etc. Try it at igpt.opengvlab.com (an online demo system supporting DragGAN, ChatGPT, ImageBind, and SAM)
- Updated
Aug 20, 2024 - Python
Show, Attend, and Tell | a PyTorch Tutorial to Image Captioning
- Updated
Jul 28, 2022 - Python
Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
- Updated
Apr 24, 2024 - Python
Caption-Anything is a versatile tool combining image segmentation, visual captioning, and ChatGPT, generating tailored captions with diverse controls for user preferences. https://huggingface.co/spaces/TencentARC/Caption-Anything https://huggingface.co/spaces/VIPLab/Caption-Anything
- Updated
Aug 29, 2023 - Python
Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome
- Updated
Feb 3, 2023 - Jupyter Notebook
Simple Swift class that provides all the configurations you need to create a custom camera view in your app
- Updated
Jul 19, 2024 - Swift
The implementation of "Prismer: A Vision-Language Model with Multi-Task Experts".
- Updated
Jan 17, 2024 - Python
Oscar and VinVL
- Updated
Aug 28, 2023 - Python
Unofficial PyTorch implementation of Self-critical Sequence Training for Image Captioning, and others.
- Updated
Oct 5, 2023 - Python
X-modaler is a versatile and high-performance codebase for cross-modal analytics (e.g., image captioning, video captioning, vision-language pre-training, visual question answering, visual commonsense reasoning, and cross-modal retrieval).
- Updated
Feb 27, 2023 - Python
Tag manager and captioner for image datasets
- Updated
Feb 22, 2025 - Python
TensorFlow Implementation of "Show, Attend and Tell"
- Updated
Jul 28, 2018 - Jupyter Notebook
👁️ + 💬 + 🎧 = 🤖 Curated list of top foundation and multimodal models! [Paper + Code + Examples + Tutorials]
- Updated
Feb 29, 2024 - Python
[CVPR 2021] VirTex: Learning Visual Representations from Textual Annotations
- Updated
Jan 1, 2024 - Python
PyTorch source code for "Stacked Cross Attention for Image-Text Matching" (ECCV 2018)
- Updated
May 18, 2023 - Python
Meshed-Memory Transformer for Image Captioning. CVPR 2020
- Updated
Dec 21, 2022 - Python
Official PyTorch implementation of "OmniNet: A unified architecture for multi-modal multi-task learning" | Authors: Subhojeet Pramanik, Priyanka Agrawal, Aman Hussain
- Updated
Oct 31, 2020 - Python