Chat Template Reference

Prompt prefix formats for LongCat-Flash models


Template details live in each model's tokenizer_config.json on Hugging Face. Below are the standard LongCat-Flash patterns from the official repository.
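To check a model's template directly, the following minimal sketch reads it from a locally downloaded tokenizer_config.json. It assumes the template is stored as a string under the `chat_template` key, which is the usual Hugging Face convention but is not confirmed here for every LongCat-Flash release. The function name `read_chat_template` is hypothetical.

```python
import json
from pathlib import Path
from typing import Optional


def read_chat_template(config_path: str) -> Optional[str]:
    """Return the chat template string from a tokenizer_config.json, if any.

    Assumption: the template lives under the "chat_template" key as a string.
    Returns None when the key is missing or holds a non-string value.
    """
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    template = config.get("chat_template")
    return template if isinstance(template, str) else None
```

If the function returns None, the template may be stored elsewhere in the repository, so the model card is the next place to look.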

First turn

[Round 0] USER:{query} ASSISTANT:

With system prompt

SYSTEM:{system_prompt} [Round 0] USER:{query} ASSISTANT:
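The two single-turn patterns above can be sketched as one helper. This is an illustrative sketch, not the official implementation: the function name `build_first_turn` is hypothetical, and the spacing is copied from the patterns as shown here, so verify it against the model's tokenizer_config.json before relying on it.

```python
from typing import Optional


def build_first_turn(query: str, system_prompt: Optional[str] = None) -> str:
    """Format a single-turn prompt, optionally prefixed by a system prompt."""
    # The first round is always index 0 and ends with an open ASSISTANT: suffix.
    prompt = f"[Round 0] USER:{query} ASSISTANT:"
    if system_prompt:
        # The system prompt, when present, precedes the first round marker.
        prompt = f"SYSTEM:{system_prompt} {prompt}"
    return prompt


print(build_first_turn("What is 2 + 2?"))
print(build_first_turn("What is 2 + 2?", system_prompt="Be concise."))
```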

Multi-turn

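Concatenate all prior rounds in order, closing each completed assistant response with </longcat_s>. End with the latest user query followed by an open ASSISTANT: suffix. The round index starts at 0, so a conversation with N completed rounds ends with [Round N].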

SYSTEM:{system_prompt} [Round 0] USER:{q} ASSISTANT:{r}</longcat_s> ... [Round N-1] USER:{q} ASSISTANT:{r}</longcat_s> [Round N] USER:{q} ASSISTANT:
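A minimal sketch of the multi-turn concatenation follows. The function name `build_multi_turn` and the representation of history as (query, response) pairs are assumptions made for illustration. Rounds are joined with a single space, matching the pattern as written above, and the exact whitespace should be checked against the model's own template.

```python
from typing import List, Optional, Tuple

END_OF_TURN = "</longcat_s>"  # closes each completed assistant response


def build_multi_turn(
    history: List[Tuple[str, str]],
    query: str,
    system_prompt: Optional[str] = None,
) -> str:
    """Format prior (query, response) rounds plus the latest user query."""
    parts = []
    if system_prompt:
        parts.append(f"SYSTEM:{system_prompt}")
    # Completed rounds are numbered from 0 and end with the end-of-turn token.
    for index, (past_query, past_response) in enumerate(history):
        parts.append(
            f"[Round {index}] USER:{past_query} ASSISTANT:{past_response}{END_OF_TURN}"
        )
    # The latest round leaves ASSISTANT: open for the model to complete.
    parts.append(f"[Round {len(history)}] USER:{query} ASSISTANT:")
    return " ".join(parts)
```

With an empty history, the function reduces to the first-turn pattern.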

Tool call envelope

{tool_description}

## Messages
SYSTEM:{system_prompt} [Round 0] USER:{query} ASSISTANT:

<longcat_tool_call>{"name": <function-name>, "arguments": <args-dict>}</longcat_tool_call>
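The envelope and the tool call format above can be handled with two small helpers, sketched below. Both function names are hypothetical. The sketch treats `tool_description` as an opaque string, because its schema is defined by the model card, and it assumes the payload between the `<longcat_tool_call>` tags is valid JSON with a `name` and an optional `arguments` field.

```python
import json
import re
from typing import Any, Dict, List, Optional

# Non-greedy match so that several tool calls in one response are found separately.
TOOL_CALL_RE = re.compile(r"<longcat_tool_call>(.*?)</longcat_tool_call>", re.DOTALL)


def build_tool_prompt(
    tool_description: str, query: str, system_prompt: Optional[str] = None
) -> str:
    """Wrap a first-turn prompt in the tool call envelope."""
    messages = f"[Round 0] USER:{query} ASSISTANT:"
    if system_prompt:
        messages = f"SYSTEM:{system_prompt} {messages}"
    # Tool description first, then a blank line, then the "## Messages" section.
    return f"{tool_description}\n\n## Messages\n{messages}"


def parse_tool_calls(text: str) -> List[Dict[str, Any]]:
    """Extract tool calls from model output, skipping malformed payloads."""
    calls = []
    for payload in TOOL_CALL_RE.findall(text):
        try:
            call = json.loads(payload)
        except json.JSONDecodeError:
            continue  # payload is not valid JSON, so ignore it
        if isinstance(call, dict) and "name" in call:
            calls.append({"name": call["name"], "arguments": call.get("arguments", {})})
    return calls
```

Skipping malformed payloads keeps the parser tolerant of truncated generations. A stricter caller may prefer to raise an error instead.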

For agentic models like Flash-Thinking, follow the model card for Re-thinking mode and tool schemas.

Related

Python Quick Start
vLLM deployment
Flash-Chat on Hugging Face
