Inception: Mercury

INCEPTION Developer Architecture Profile

Intelligence (ELO): 1120 (Chatbot Arena Verified)

Max Context: 128,000 tokens

API Cost / 1M tokens: $1.00 (blended prompt + completion)

Model Capabilities

Mercury is the first diffusion large language model (dLLM). Using a discrete diffusion approach, the model runs 5-10x faster than even speed-optimized models such as GPT-4.1 Nano and Claude 3.5 Haiku while matching their performance. Mercury's speed lets developers build responsive user experiences, including voice agents, search interfaces, and chatbots. Read more in the [blog post](https://www.inceptionlabs.ai/blog/introducing-mercury).

Granular Pricing Matrix

Input Tokens (Prompt): $0.25 / 1M

Output Tokens (Completion): $0.75 / 1M

Pricing data via OpenRouter. Sync: 3/16/2026
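
The per-token rates above translate into a per-request cost with simple arithmetic. The following is a minimal sketch, assuming the listed rates still apply and that billing is linear in token count; the helper name `estimate_cost_usd` is hypothetical and not part of any Inception or OpenRouter SDK.

```python
from decimal import Decimal

# Rates copied from the pricing matrix above (USD per 1M tokens).
# Assumption: these are still current; check the provider before relying on them.
INPUT_USD_PER_MILLION = Decimal("0.25")
OUTPUT_USD_PER_MILLION = Decimal("0.75")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumes cost scales linearly with prompt and completion token counts.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = INPUT_USD_PER_MILLION * input_tokens / TOKENS_PER_MILLION
    output_cost = OUTPUT_USD_PER_MILLION * output_tokens / TOKENS_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # A request with a 2,000-token prompt and a 500-token completion.
    print(estimate_cost_usd(2_000, 500))
```

Under these assumptions, one million prompt tokens plus one million completion tokens comes to $1.00, which appears to be how the blended figure at the top of the profile is derived. `Decimal` is used instead of `float` so that small per-request amounts do not pick up binary rounding error when summed over many requests.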

Evaluate Competitors

Inception: Mercury vs Z.ai: GLM 5 Turbo

Inception: Mercury vs Inception: Mercury 2

Inception: Mercury vs Qwen: Qwen3.5-27B

Inception: Mercury vs Qwen: Qwen3.5-122B-A10B

Inception: Mercury vs AionLabs: Aion-2.0

Inception: Mercury vs Qwen: Qwen3.5 Plus 2026-02-15