
image-text-retrieval

Here are 39 public repositories matching this topic...

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.

  • Updated Sep 22, 2025
  • Python

PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

  • Updated Aug 5, 2024
  • Jupyter Notebook

Offline semantic text-to-image and image-to-image search on Android, powered by a quantized pretrained CLIP vision-language model and the ONNX Runtime inference engine

  • Updated Mar 28, 2024
  • Kotlin

PicQuery: 🔍 Search local images with natural language on Android, powered by OpenAI's CLIP model.

  • Updated Jul 8, 2025
  • Kotlin

[AAAI2021] The code of “Similarity Reasoning and Filtration for Image-Text Matching”

  • Updated Apr 11, 2024
  • Python

Research Code for Multimodal-Cognition Team in Ant Group

  • Updated Oct 14, 2025
  • Python

PyTorch code for BagFormer: Better Cross-Modal Retrieval via Bag-Wise Interaction

  • Updated Jan 14, 2023
  • Python

mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections. (EMNLP 2022)

  • Updated May 8, 2023
  • Python

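Deploys Chinese CLIP with OpenCV and onnxruntime for text-to-image search: given a sentence describing the desired image, it retrieves matching images from a gallery. Includes both C++ and Python versions of the program.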

  • Updated Jan 15, 2024
  • C++

ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration

  • Updated Jun 13, 2023
  • Python

image-captioning: Image captioning using Python and BLIP

  • Updated Aug 16, 2023
  • Python

Official implementation and dataset for the NAACL 2024 paper "ComCLIP: Training-Free Compositional Image and Text Matching"

  • Updated Aug 18, 2024
  • Python

Official implementation of our EMNLP 2022 paper "CPL: Counterfactual Prompt Learning for Vision and Language Models"

  • Updated Dec 5, 2022
  • Python

[TIP2023] The code of “Plug-and-Play Regulators for Image-Text Matching”

  • Updated Apr 11, 2024
  • Python

[EMNLP 2024] Preserving Multi-Modal Capabilities of Pre-trained VLMs for Improving Vision-Linguistic Compositionality

  • Updated Oct 8, 2024
  • Python

Noise of Web (NoW) is a challenging noisy correspondence learning (NCL) benchmark containing 100K image-text pairs for evaluating robust image-text matching and retrieval models.

  • Updated Aug 15, 2025
  • Python

In this work, we implement several cross-modal learning schemes (Siamese Network, Correlational Network, and Deep Cross-Modal Projection Learning) and study their performance. We also propose a modified Deep Cross-Modal Projection Learning model that uses a different image feature extractor. We evaluate the model’s performance on im…

  • Updated Aug 23, 2021
  • Jupyter Notebook
