ICTNLP
- 256 followers
- Beijing, China
- http://nlp.ict.ac.cn
- ict_nlp@ict.ac.cn
Pinned
- LLaMA-Omni
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
- StreamSpeech
StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.
- LLaVA-Mini
LLaVA-Mini is a unified large multimodal model (LMM) that can support the understanding of images, high-resolution images, and videos in an efficient manner.
Repositories
- PSO-Merging Public
PSO-Merging is a deep model fusion method that uses the particle swarm optimization algorithm to automatically find optimal model fusion weights.
- FastLongSpeech Public
FastLongSpeech is a novel framework designed to extend the capabilities of Large Speech-Language Models for efficient long-speech processing without necessitating dedicated long-speech training data.
- LLaVA-Mini Public
LLaVA-Mini is a unified large multimodal model (LMM) that can support the understanding of images, high-resolution images, and videos in an efficient manner.
- StreamSpeech Public
StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.
- Stream-Omni Public
Stream-Omni is a GPT-4o-like language-vision-speech chatbot that simultaneously supports interaction across various modality combinations.
- FlexRAG Public
- SLED-TTS Public
SLED-TTS is a streamable text-to-speech model that uses a language modeling approach, without vector quantization.
- MonoAttn-Transducer Public
Code for the ICML 2025 paper "Overcoming Non-monotonicity in Transducer-based Streaming Generation".