LongCat AI – Next-Generation Multi-Modal Models

Open-source MoE LLMs by Meituan: Flash-Chat, Flash-Thinking, Video, Audio-Codec, and Omni. Fast, efficient, and production-ready.

LongCat-Flash-Omni (November 2025)

The first open-source model for real-time, all-modality interaction. Omni unifies text, image, audio, and video in a single end-to-end ScMoE backbone, enabling low-latency, streaming multi-modal understanding and generation with up to 128K tokens of context and robust multi-turn, long-horizon dialogue.

Modalities

  • Text: instruction following, reasoning, coding
  • Image: VQA, fine-grained recognition, OCR
  • Audio: speech understanding, streaming ASR
  • Video: temporal reasoning, event grounding

Architectural Highlights

  • Unified ScMoE (shortcut-connected Mixture-of-Experts): single trunk, expert routing across modalities
  • MDP training: modality-decoupled parallel schedule
  • Progressive fusion: curriculum for multi-modal alignment
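
The "single trunk, expert routing" idea can be illustrated with a generic top-k router. This is a minimal sketch of standard Mixture-of-Experts gating, not LongCat's actual router: the logits, the expert count, and the names `softmax` and `route_top_k` are all hypothetical. The point is that tokens from every modality pass through the same routing function and differ only in which experts they select.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are renormalized so the selected experts' weights sum to 1.
    """
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

# Hypothetical router logits for one text token and one audio token, 4 experts.
tokens = {
    "text":  [2.0, 0.1, -1.0, 0.5],
    "audio": [-0.5, 1.5, 1.2, 0.0],
}

# The same router serves both modalities; only the chosen experts differ.
routes = {name: route_top_k(logits, k=2) for name, logits in tokens.items()}
for name, chosen in routes.items():
    print(name, [(i, round(w, 3)) for i, w in chosen])
```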

Performance

  • Omni-Bench: open-source SOTA
  • WorldSense: open-source SOTA

Key Capabilities

  • Real-time: low-latency interactive streams (audio/video)
  • Long-context: 128K tokens, multi-turn long-session memory
  • Tool-use: agentic calls across modalities

Applications

  • Multi-modal assistants and voice agents
  • Visual Q&A and scene understanding
  • Real-time AI video customer support

Model Series

Flash-Chat

Foundation dialogue model (560B params, MoE). Achieves 100+ tokens/s on H800 GPUs with ~27B active params/token.

Released: Sept 1, 2025
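
To put the sparsity figures in perspective, the share of the model that runs for each token follows directly from the two numbers quoted above. This is plain arithmetic on the stated values; the variable names are illustrative.

```python
TOTAL_PARAMS_B = 560   # total parameters, in billions (from the text)
ACTIVE_PARAMS_B = 27   # approximate active parameters per token, in billions

active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B
print(f"Active per token: {active_fraction:.1%} of the model")  # prints "Active per token: 4.8% of the model"
```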

Flash-Thinking

Enhanced reasoning with a dual-path framework. 64.5% token savings in agentic scenarios.

Released: Sept 22, 2025

Video

DiT-based video generation. 5-minute coherent videos at 720p/30fps.

Released: Oct 27, 2025
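
As a rough sense of scale, the frame count implied by a 5-minute clip at 30 fps can be worked out directly. This is simple arithmetic on the figures above and says nothing about how the model generates or batches frames internally.

```python
DURATION_MIN = 5   # clip length in minutes (from the text)
FPS = 30           # frames per second (from the text)

total_frames = DURATION_MIN * 60 * FPS
print(f"{total_frames} frames")  # prints "9000 frames"
```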

Omni

All-modality real-time interaction. Text, image, audio, video unified.

Released: Nov 2025

Key Highlights

  • High-throughput inference: 100+ tokens/s on H800 GPUs
  • Zero-Computation Experts: only ~27B of the 560B total params are activated per token
  • Extended context: Up to 128K tokens
  • Open-source SOTA: Leading open-source results on benchmarks including Omni-Bench, WorldSense, and MMLU
  • Production-ready: Deployed across Meituan's services
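
The zero-computation idea can be sketched as follows, assuming such an expert simply passes its input through unchanged, so that tokens routed to it consume no feed-forward compute. This is an illustrative toy, not the released implementation: `ffn_expert`, `zero_expert`, `moe_layer`, the scaling "experts", and the unit cost model are all hypothetical.

```python
def ffn_expert(scale):
    """Stand-in for a real feed-forward expert; counts as one unit of compute."""
    def run(x):
        return [scale * v for v in x], 1
    return run

def zero_expert(x):
    """Zero-computation expert (assumed identity): input returned as-is, cost 0."""
    return list(x), 0

# Hypothetical expert pool: two real experts and two zero-computation experts.
EXPERTS = [ffn_expert(2.0), ffn_expert(3.0), zero_expert, zero_expert]

def moe_layer(x, routes):
    """Weighted sum of the selected experts' outputs.

    `routes` is a list of (expert_index, weight) pairs. Returns (output, cost).
    """
    out = [0.0] * len(x)
    cost = 0
    for idx, weight in routes:
        y, c = EXPERTS[idx](x)
        out = [o + weight * v for o, v in zip(out, y)]
        cost += c
    return out, cost

x = [1.0, -2.0]
hard_out, hard_cost = moe_layer(x, [(0, 0.5), (1, 0.5)])  # two real experts
easy_out, easy_cost = moe_layer(x, [(0, 0.5), (2, 0.5)])  # one real, one zero
print(hard_out, hard_cost)  # prints "[2.5, -5.0] 2"
print(easy_out, easy_cost)  # prints "[1.5, -3.0] 1"
```

Under this assumption, the router decides per token how much real computation to spend, which is one way a large parameter pool can coexist with a much smaller active count.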