image-text-retrieval
Here are 39 public repositories matching this topic...
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.
- Updated Sep 22, 2025 - Python
Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
- Updated Aug 29, 2025 - Jupyter Notebook
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
- Updated Aug 5, 2024 - Jupyter Notebook
Offline semantic Text-to-Image and Image-to-Image search on Android powered by quantized state-of-the-art vision-language pretrained CLIP model and ONNX Runtime inference engine
- Updated Mar 28, 2024 - Kotlin
🔍 Search local images with natural language on Android, powered by OpenAI's CLIP model.
- Updated Jul 8, 2025 - Kotlin
A paper list covering large multi-modality models (perception, generation, unification), parameter-efficient finetuning, vision-language pretraining, and conventional image-text matching, intended for preliminary insight.
- Updated Sep 25, 2025
[AAAI2021] The code of “Similarity Reasoning and Filtration for Image-Text Matching”
- Updated Apr 11, 2024 - Python
Official implementation of the ICASSP-2022 paper "Text2Poster: Laying Out Stylized Texts on Retrieved Images"
- Updated Dec 18, 2023 - Python
Research Code for Multimodal-Cognition Team in Ant Group
- Updated Oct 14, 2025 - Python
PyTorch code for BagFormer: Better Cross-Modal Retrieval via bag-wise interaction
- Updated Jan 14, 2023 - Python
mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections. (EMNLP 2022)
- Updated May 8, 2023 - Python
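Deploys Chinese CLIP with OpenCV and onnxruntime for text-to-image search: given a one-sentence description of the desired image, it retrieves matching images from a gallery. Includes both C++ and Python versions of the program.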
- Updated Jan 15, 2024 - C++
ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration
- Updated Jun 13, 2023 - Python
Image captioning using Python and BLIP
- Updated Aug 16, 2023 - Python
Official implementation and dataset for the NAACL 2024 paper "ComCLIP: Training-Free Compositional Image and Text Matching"
- Updated Aug 18, 2024 - Python
Official implementation of our EMNLP 2022 paper "CPL: Counterfactual Prompt Learning for Vision and Language Models"
- Updated Dec 5, 2022 - Python
[TIP2023] The code of “Plug-and-Play Regulators for Image-Text Matching”
- Updated Apr 11, 2024 - Python
[EMNLP 2024] Preserving Multi-Modal Capabilities of Pre-trained VLMs for Improving Vision-Linguistic Compositionality
- Updated Oct 8, 2024 - Python
Noise of Web (NoW) is a challenging noisy correspondence learning (NCL) benchmark containing 100K image-text pairs for robust image-text matching/retrieval models.
- Updated Aug 15, 2025 - Python
In this work, we implement different cross-modal learning schemes such as Siamese Network, Correlational Network and Deep Cross-Modal Projection Learning model and study their performance. We also propose a modified Deep Cross-Modal Projection Learning model that uses a different image feature extractor. We evaluate the model’s performance on im…
- Updated Aug 23, 2021 - Jupyter Notebook