#speech-interaction

Here are 5 public repositories matching this topic...

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

  • Updated May 19, 2025
  • Python

MooER: Moore-threads Open Omni model for speech-to-speech intERaction. MooER-omni includes a series of end-to-end speech interaction models along with training and inference code, covering (but not limited to) end-to-end speech interaction, end-to-end speech translation, and speech recognition.

  • Updated Jan 8, 2025
  • Python

A fully open-source implementation of a GPT-4o-like speech-to-speech video understanding model.

  • Updated Apr 7, 2025
  • Python

[CVPR 2025] OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts

  • Updated Apr 7, 2025
  • Python

A web-based, voice-controlled media player designed to run in any modern browser (Chrome or Edge recommended).

  • Updated Sep 20, 2025
  • JavaScript
