#multimodal

Here are 1,262 public repositories matching this topic...

anything-llm

The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, No-code agent builder, MCP compatibility, and more.

  • Updated Jul 18, 2025
  • JavaScript

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

  • Updated Aug 12, 2024
  • Python
serve

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

  • Updated Jul 3, 2025
  • Python

Janus-Series: Unified Multimodal Understanding and Generation Models

  • Updated Feb 1, 2025
  • Python

AI app store powered by 24/7 desktop history. open source | 100% local | dev friendly | 24/7 screen, mic recording

  • Updated Jul 16, 2025
  • TypeScript

The Open-sourced Multimodal AI Agent Stack connecting Cutting-edge AI Models and Agent Infra.

  • Updated Jul 18, 2025
  • TypeScript

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

  • Updated Jul 18, 2025
  • Python
rerun

Visualize streams of multimodal data. Free, fast, easy to use, and simple to integrate. Built in Rust.

  • Updated Jul 18, 2025
  • Rust

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen2.5-VL, Qwen2.5-Omni, Qwen2-Audio, Ovis2, InternVL3, Llava, GLM4v, Phi4, ...) (AAAI 2025).

  • Updated Jul 18, 2025
  • Python
BentoML

The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!

  • Updated Jul 18, 2025
  • Python
big-AGI

AI suite powered by state-of-the-art models and providing advanced AI/AGI functions. It features AI personas, AGI functions, multi-model chats, text-to-image, voice, response streaming, code highlighting and execution, PDF import, presets for developers, and much more. Deploy on-prem or in the cloud.

  • Updated Jul 17, 2025
  • TypeScript

This repository is a curated collection of links to various courses and resources about Artificial Intelligence (AI)

  • Updated Apr 22, 2024
  • Python

Notes for software engineers getting up to speed on new AI developments. Serves as a datastore for https://latent.space writing and product brainstorming, but has cleaned-up canonical references under the /Resources folder.

  • Updated Jun 27, 2025
  • HTML

A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)

  • Updated Apr 24, 2025
  • Python

Solve Visual Understanding with Reinforced VLMs

  • Updated Jun 26, 2025
  • Python
pyspur

A visual playground for agentic workflows: Iterate over your agents 10x faster

  • Updated Jul 6, 2025
  • TypeScript
swarms

Plug-and-play implementation of Tree of Thoughts: Deliberate Problem Solving with Large Language Models, which elevates model reasoning by at least 70%

  • Updated Oct 29, 2024
  • Python

Mobile-Agent: The Powerful Mobile Device Operation Assistant Family

  • Updated Jul 3, 2025
  • Python
