# vlms

Here are 90 public repositories matching this topic...

Repositories by language: All 90, Python 65, Jupyter Notebook 9, HTML 2, Java 1, JavaScript 1, R 1, Scala 1, TypeScript 1, PDDL 1

Sort: Most stars

yzhao062/anomaly-detection-resources

Anomaly detection related books, papers, videos, and toolboxes. Last update late 2025 for LLM and VLM works!

machine-learning data-mining awesome awesome-list outlier-detection unsupervised-learning fraud-detection time-series-analysis vlm anomaly-detection fraud outlier outlier-ensembles graph-neural-networks large-language-models llm vlms

Updated Nov 25, 2025
Python

oumi-ai/oumi

Easily fine-tune, evaluate and deploy gpt-oss, Qwen3, DeepSeek-R1, or any open source LLM / VLM!

evaluation inference llama fine-tuning sft dpo slms llms vlms gpt-oss gpt-oss-120b gpt-oss-20b

Updated Feb 20, 2026
Python

NanoNets/docext

An on-premises, OCR-free unstructured data extraction, markdown conversion and benchmarking toolkit. (https://idp-leaderboard.org/)

nlp machine-learning ocr extraction document onprem document-analysis table-extraction unstructured-data rag onpremise llms vlms document-information-extraction ocr-onpremise document-data-extraction onprem-vision onprem-ocr llm-ocr ocr-benchmark

Updated Aug 25, 2025
Python

yueliu1999/Awesome-Jailbreak-on-LLMs

Awesome-Jailbreak-on-LLMs is a collection of state-of-the-art, novel, exciting jailbreak methods on LLMs. It contains papers, codes, datasets, evaluations, and analyses.

security privacy ai jailbreak safety vlm llm llms vlms

Updated Feb 6, 2026

intel/auto-round

🎯 An accuracy-first, highly efficient quantization toolkit for LLMs, designed to minimize quality degradation across Weight-Only Quantization, MXFP4, NVFP4, GGUF, and adaptive schemes.

transformers rounding quantization int4 llms vllm gguf vlms sglang mxfp4 nvfp4

Updated Feb 14, 2026
Python

JIA-Lab-research/VisionZip

Official repository for VisionZip (CVPR 2025)

efficiency multi-modality vision-language-model vlms

Updated Jul 21, 2025
Python

tianyi-lab/HallusionBench

[CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models

benchmark benchmarks lmm hallucination gpt-4 large-language-models llm llava large-vision-language-models vlms gpt-4v

Updated Oct 14, 2025
Python

cequence-io/openai-scala-client

Scala client for OpenAI API and other major LLM providers

scala gemini openai nlp-library llms chatgpt anthropic aws-bedrock vlms vertex-ai-gemini-api gemini-ai perplexity-api groq-api anthropic-api openai-api-client

Updated Feb 12, 2026
Scala

Beckschen/ViTamin

[CVPR 2024] Official implementation of "ViTamin: Designing Scalable Vision Models in the Vision-language Era"

vlms scalable-vision-encoder

Updated Jun 9, 2024
Python

InternScience/OmniCaptioner

Official Repository of OmniCaptioner

multi-modal captioning-images caption-generation vlms reasoning-models deepseek-r1 multi-modal-deepseek-r1

Updated Apr 23, 2025
Python

TUM-AVS/FM-AD-Survey

This repository collects research papers of large Foundation Models for Scenario Generation and Analysis in Autonomous Driving. The repository will be continuously updated to track the latest update.

autonomous-driving world-models scenario-analysis diffusion-models scenario-generation foundation-models llms vlms mllms

Updated Feb 6, 2026

Roots-Automation/GutenOCR

Open-source tools for training and evaluating Vision Language Models for OCR

ocr multigpu llms vllm vlms vlm-ocr

Updated Jan 28, 2026
Python

MCG-NJU/AWT

[NeurIPS 2024] AWT: Transferring Vision-Language Models via Augmentation, Weighting, and Transportation

computer-vision transfer-learning clip video-understanding zero-shot-learning open-set-recognition vlms siglip

Updated Oct 5, 2024
Python

aim-uofa/SegAgent

[CVPR2025] SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories

agent segment-anything vlms mllms

Updated Aug 8, 2025
Python

mbzuai-oryx/KITAB-Bench

[ACL 2025 🔥] A Comprehensive Multi-Domain Benchmark for Arabic OCR and Document Understanding

benchmark ocr vqa pdf-to-text arabic table-detection layout-detection vlms

Updated May 24, 2025
Python

thubZ09/vision-language-model-research

Hub for researchers exploring VLMs and Multimodal Learning :)

nlp machine-learning research computer-vision deep-learning multimodal-learning multimodal-deep-learning vision-language multimodal-large-language-models vlms multimodal-ai

Updated Feb 17, 2026

foundation-multimodal-models/CAL

[NeurIPS'24] Official PyTorch Implementation of Seeing the Image: Prioritizing Visual Correlation by Contrastive Alignment

vlms contrastive-alignment

Updated Sep 26, 2024
Python

video-db/ocr-benchmark

Benchmarking Vision-Language Models on OCR tasks in Dynamic Video Environments

benchmark ocr arxiv research-paper easyocr rapidocr vlms videodb vlm-ocr

Updated Feb 14, 2025
Python

AakashKumarNain/nanoGPTJAX

Implementing scalable LLMs in pure JAX (no third-party libraries)

transformer jax llms vlms

Updated Feb 19, 2026
Python

dimitrismallis/CAD-Assistant

Code for our ICCV 2025 paper "CAD-Assistant: Tool-Augmented VLLMs as Generic CAD Task Solvers."

cad freecad llm vlms

Updated Oct 30, 2025
Python
