LongCat-Flash-Chat
Foundation dialogue model (Released: September 1, 2025)
Overview
LongCat-Flash-Chat is a foundation dialogue model with 560B total parameters in a Mixture-of-Experts (MoE) architecture. Through Zero-Computation Experts, it activates approximately 18.6B–31.3B parameters per token (roughly 27B on average), delivering competitive quality with high throughput and low latency.
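The variable per-token activation comes from routing each token over a pool that mixes standard FFN experts with zero-computation (identity) experts: tokens dispatched to identity experts incur no extra FFN compute, so the number of activated parameters fluctuates from token to token. The sketch below is an illustrative toy of that mechanism, not the released implementation; the layer sizes, expert counts, top-k policy, and class names are assumptions chosen only to show the idea.

```python
# Illustrative toy of MoE routing with zero-computation (identity) experts.
# All sizes, names, and the top-k policy are assumptions for clarity only;
# they do not reflect LongCat-Flash's actual configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToyZeroComputeMoE(nn.Module):
    def __init__(self, d_model=64, n_ffn_experts=4, n_zero_experts=2, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.n_ffn_experts = n_ffn_experts
        # Router scores every token against FFN experts *and* identity experts.
        self.router = nn.Linear(d_model, n_ffn_experts + n_zero_experts)
        self.ffn_experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_ffn_experts)
        )

    def forward(self, x):  # x: (tokens, d_model)
        gates = F.softmax(self.router(x), dim=-1)           # (tokens, experts)
        weights, idx = gates.topk(self.top_k, dim=-1)       # per-token choices
        weights = weights / weights.sum(dim=-1, keepdim=True)

        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e in range(self.n_ffn_experts + len(self.ffn_experts) - self.n_ffn_experts + self.router.out_features - self.n_ffn_experts):
                pass  # placeholder removed below
        # Dispatch: FFN experts do real compute, identity experts just pass x through.
        for slot in range(self.top_k):
            for e in range(self.router.out_features):
                mask = idx[:, slot] == e
                if not mask.any():
                    continue
                if e < self.n_ffn_experts:
                    y = self.ffn_experts[e](x[mask])         # activated FFN parameters
                else:
                    y = x[mask]                              # zero-computation expert
                out[mask] += weights[mask, slot].unsqueeze(-1) * y
        return out


if __name__ == "__main__":
    layer = ToyZeroComputeMoE()
    tokens = torch.randn(8, 64)
    print(layer(tokens).shape)  # torch.Size([8, 64])
```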
Key Features
- 128K context length: Supports complex, multi-document tasks
- 100+ tokens/s: Fast inference speed on H800 GPUs
- Zero-Computation Experts: Cost-efficient parameter activation
- Strong capabilities: Instruction following, reasoning, and coding
- Agentic tool-use: Excellent performance in tool-call scenarios (see the example after this list)
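As a minimal sketch of how such a deployment is typically queried, the example below sends a chat request with a tool definition to an OpenAI-compatible endpoint, such as one served locally by vLLM or SGLang. The base URL, model id string, and the `get_weather` tool schema are placeholders assumed for illustration; consult the official deployment instructions for the actual serving setup.

```python
# Minimal sketch of a chat + tool-call request against an assumed
# OpenAI-compatible endpoint (e.g. exposed by vLLM or SGLang).
# The base_url and model id below are placeholders, not official values.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# Hypothetical tool schema used to exercise the model's tool-call ability.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="LongCat-Flash-Chat",  # placeholder model id
    messages=[{"role": "user", "content": "What's the weather in Beijing?"}],
    tools=tools,
)

choice = response.choices[0].message
# The model either answers directly or emits a structured tool call.
print(choice.tool_calls or choice.content)
```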