
image-captioning

Here are 878 public repositories matching this topic...

PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

  • Updated Aug 5, 2024
  • Jupyter Notebook

InternGPT (iGPT) is an open source demo platform where you can easily showcase your AI models. It now supports DragGAN, ChatGPT, ImageBind, multimodal chat like GPT-4, SAM, interactive image editing, etc. Try it at igpt.opengvlab.com (an online demo system supporting DragGAN, ChatGPT, ImageBind, and SAM)

  • Updated Aug 20, 2024
  • Python

Show, Attend, and Tell | a PyTorch Tutorial to Image Captioning

  • Updated Jul 28, 2022
  • Python

Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

  • Updated Apr 24, 2024
  • Python

Caption-Anything is a versatile tool combining image segmentation, visual captioning, and ChatGPT, generating tailored captions with diverse controls for user preferences. https://huggingface.co/spaces/TencentARC/Caption-Anything https://huggingface.co/spaces/VIPLab/Caption-Anything

  • Updated Aug 29, 2023
  • Python

Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome

  • Updated Feb 3, 2023
  • Jupyter Notebook

Simple Swift class that provides all the configuration you need to create a custom camera view in your app

  • Updated Jul 19, 2024
  • Swift

The implementation of "Prismer: A Vision-Language Model with Multi-Task Experts".

  • Updated Jan 17, 2024
  • Python
Oscar

Unofficial PyTorch implementation of Self-critical Sequence Training for Image Captioning, and others.

  • Updated Oct 5, 2023
  • Python

X-modaler is a versatile and high-performance codebase for cross-modal analytics (e.g., image captioning, video captioning, vision-language pre-training, visual question answering, visual commonsense reasoning, and cross-modal retrieval).

  • Updated Feb 27, 2023
  • Python

Tag manager and captioner for image datasets

  • Updated Feb 22, 2025
  • Python

TensorFlow Implementation of "Show, Attend and Tell"

  • Updated Jul 28, 2018
  • Jupyter Notebook

👁️ + 💬 + 🎧 = 🤖 Curated list of top foundation and multimodal models! [Paper + Code + Examples + Tutorials]

  • Updated Feb 29, 2024
  • Python

[CVPR 2021] VirTex: Learning Visual Representations from Textual Annotations

  • Updated Jan 1, 2024
  • Python

PyTorch source code for "Stacked Cross Attention for Image-Text Matching" (ECCV 2018)

  • Updated May 18, 2023
  • Python

Meshed-Memory Transformer for Image Captioning. CVPR 2020

  • Updated Dec 21, 2022
  • Python

Official PyTorch implementation of "OmniNet: A unified architecture for multi-modal multi-task learning" | Authors: Subhojeet Pramanik, Priyanka Agrawal, Aman Hussain

  • Updated Oct 31, 2020
  • Python

Custom ComfyUI nodes for Vision Language Models, Large Language Models, Image to Music, Text to Music, Consistent and Random Creative Prompt Generation

  • Updated Feb 13, 2025
  • Python
