LongCat AI – Next-Generation Multi-Modal Models
Open-source MoE LLMs by Meituan: Flash-Chat, Flash-Thinking, Video, Audio-Codec, and Omni. Fast, efficient, and production-ready.
LongCat-Flash-Omni (November 2025)
The first open-source real-time all-modality interaction model. Omni unifies text, image, audio, and video in a single end-to-end ScMoE backbone, enabling low-latency streaming multi-modal understanding and generation. It supports context windows of up to 128K tokens and robust multi-turn, long-horizon dialogue.
Modalities
- Text: instruction following, reasoning, coding
- Image: VQA, fine-grained recognition, OCR
- Audio: speech understanding, streaming ASR
- Video: temporal reasoning, event grounding
Architectural Highlights
- Unified ScMoE: single trunk with expert routing shared across modalities (see the sketch after this list)
- MDP training: modality-decoupled parallel schedule
- Progressive fusion: curriculum for multi-modal alignment
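The shortcut idea can be illustrated in a few lines. Below is a minimal, single-device PyTorch sketch of an ScMoE-style block: a routed expert mixture plus a dense FFN on a shortcut path. All sizes, the top-1 routing, and the module names are illustrative assumptions, not the published design; in deployment the dense path runs concurrently with the experts' all-to-all communication to hide latency, which a single-process sketch cannot show.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FFN(nn.Module):
    """Plain two-layer feed-forward network."""
    def __init__(self, d_model, d_hidden):
        super().__init__()
        self.up = nn.Linear(d_model, d_hidden)
        self.down = nn.Linear(d_hidden, d_model)

    def forward(self, x):
        return self.down(F.gelu(self.up(x)))

class ScMoEBlock(nn.Module):
    """One trunk layer: routed experts plus a dense FFN on a shortcut path.

    Illustrative only: in deployment the shortcut's dense compute overlaps
    with the experts' all-to-all communication across devices.
    """
    def __init__(self, d_model=512, d_hidden=2048, n_experts=8):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(FFN(d_model, d_hidden) for _ in range(n_experts))
        self.shortcut_ffn = FFN(d_model, d_hidden)  # dense path on the shortcut

    def forward(self, x):                        # x: (tokens, d_model)
        probs = F.softmax(self.router(x), dim=-1)
        gate, idx = probs.max(dim=-1)            # top-1 routing for brevity
        moe_out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            sel = idx == e                       # tokens routed to expert e
            if sel.any():
                moe_out[sel] = gate[sel, None] * expert(x[sel])
        return x + moe_out + self.shortcut_ffn(x)

block = ScMoEBlock()
tokens = torch.randn(16, 512)                    # any modality's embeddings
print(block(tokens).shape)                       # torch.Size([16, 512])
```

Because all modalities' embeddings share the trunk's dimensionality, the same router and expert pool serve text, image, audio, and video tokens alike.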
Performance
- Omni-Bench: open-source SOTA
- WorldSense: open-source SOTA
Key Capabilities
- Real-time: low-latency interactive audio/video streams (see the sketch after this list)
- Long-context: 128K tokens, multi-turn long-session memory
- Tool-use: agentic calls across modalities
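To make the real-time claim concrete, here is a minimal sketch of a chunked streaming loop: audio arrives in small fixed-size frames, and partial output tokens are emitted as soon as they are decoded rather than after the utterance ends. `OmniSession`, `feed_audio`, and the chunk size are hypothetical stand-ins, not the actual serving API.

```python
import time
from collections.abc import Iterator

CHUNK_MS = 80  # small frames keep first-token latency low

class OmniSession:
    """Hypothetical stand-in for a streaming client; not the real API."""
    def feed_audio(self, chunk: bytes) -> Iterator[str]:
        # A real session would run incremental ASR/understanding on the
        # chunk and yield newly decoded tokens; the stub yields a marker.
        yield f"<{len(chunk)}B>"

def microphone(n_chunks: int = 5) -> Iterator[bytes]:
    """Fake audio source emitting CHUNK_MS frames of 16 kHz 16-bit mono."""
    for _ in range(n_chunks):
        time.sleep(CHUNK_MS / 1000)
        yield b"\x00" * 2560                 # 80 ms * 16000 Hz * 2 bytes

session = OmniSession()
for chunk in microphone():
    for token in session.feed_audio(chunk):  # tokens stream out mid-utterance
        print(token, end=" ", flush=True)
print()
```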
Applications
- Multi-modal assistants and voice agents
- Visual Q&A and scene understanding
- Real-time AI video customer support
Model Series
Flash-Chat
Foundation dialogue model (560B-parameter MoE). Achieves 100+ tokens/s on H800 GPUs with ~27B active params per token (see the arithmetic sketch below).
Released: Sept 1, 2025
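A quick back-of-the-envelope shows why sparse activation matters, assuming the common approximation that a forward pass costs roughly 2 FLOPs per active parameter per token:

```python
# Back-of-the-envelope: why activating ~27B of 560B params per token matters.
# Uses the common ~2 FLOPs per active parameter per token approximation.
TOTAL_PARAMS = 560e9
ACTIVE_PARAMS = 27e9

active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
dense_flops = 2 * TOTAL_PARAMS      # if every parameter were used per token
moe_flops = 2 * ACTIVE_PARAMS       # sparse activation via expert routing

print(f"active fraction : {active_fraction:.1%}")              # ~4.8%
print(f"per-token FLOPs : {moe_flops:.2e} vs {dense_flops:.2e} dense")
print(f"compute saving  : {1 - moe_flops / dense_flops:.1%}")  # ~95%
```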
Flash-Thinking
Enhanced reasoning with a dual-path framework. Achieves 64.5% token savings in agentic scenarios.
Released: Sept 22, 2025
Key Highlights
- High-throughput inference: 100+ tokens/s on H800 GPUs
- Zero-Computation Experts: Activates only ~27B of the 560B parameter pool per token (see the sketch after this list)
- Extended context: Up to 128K tokens
- Open-source SOTA: Leading performance on Omni-Bench, WorldSense, MMLU, and more
- Production-ready: Deployed across Meituan's services
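As a rough illustration of the zero-computation idea, the sketch below mixes ordinary FFN experts with identity experts that return a token unchanged: tokens the router sends to an identity expert incur essentially no extra compute, so the average number of active parameters per token falls below the all-FFN figure. Sizes, expert counts, and routing details are illustrative assumptions, not the published design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ZeroComputationMoE(nn.Module):
    """MoE layer with identity 'zero-computation' experts (illustrative)."""
    def __init__(self, d_model=512, d_hidden=2048, n_ffn=6, n_zero=2, top_k=2):
        super().__init__()
        self.n_ffn, self.n_zero, self.top_k = n_ffn, n_zero, top_k
        # Experts 0..n_ffn-1 are real FFNs; n_ffn..n_ffn+n_zero-1 are identities.
        self.router = nn.Linear(d_model, n_ffn + n_zero)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(),
                          nn.Linear(d_hidden, d_model))
            for _ in range(n_ffn))

    def forward(self, x):                                 # x: (tokens, d_model)
        topw, topi = F.softmax(self.router(x), dim=-1).topk(self.top_k, dim=-1)
        topw = topw / topw.sum(dim=-1, keepdim=True)      # renormalize gates
        out = torch.zeros_like(x)
        for e in range(self.n_ffn + self.n_zero):
            mask = (topi == e)
            rows = mask.any(dim=-1)
            if not rows.any():
                continue
            gate = (topw * mask).sum(dim=-1, keepdim=True)[rows]
            if e < self.n_ffn:
                out[rows] += gate * self.experts[e](x[rows])  # paid compute
            else:
                out[rows] += gate * x[rows]                   # identity: ~free
        return x + out

moe = ZeroComputationMoE()
x = torch.randn(16, 512)
print(moe(x).shape)  # torch.Size([16, 512])
```

Routing some tokens to identity experts is what lets the per-token active-parameter count vary and average out near ~27B rather than being fixed.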