Qwen3.5-0.8B

Available

Alibaba (Qwen) · Open source

The smallest Qwen3.5 model: a dense 0.8B-parameter model designed for the most constrained on-device deployments, such as smartphones and embedded hardware. It is released under Apache-2.0, with native vision, a 262K-token context window, and the family's hybrid thinking / non-thinking modes; it runs in non-thinking (instruct) mode by default. It needs roughly 2GB of VRAM, and stays under 2GB when quantized to 4-bit. Notably for a sub-1B model, it still scores ~26% on MMMU-Pro multimodal reasoning.
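As a rough sanity check on the memory figures above, the sketch below estimates the footprint of the weights alone at a few precisions. It rests on simplifying assumptions that are not from the model card: every parameter is stored at the stated bit width, 1 GB is taken as 1e9 bytes, and the KV cache, activations, quantization scales, and runtime overhead are all ignored. The helper name `weight_memory_gb` is hypothetical. Real usage will be higher than these numbers, especially with long contexts.

```python
def weight_memory_gb(params_billions: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes).

    Assumption: all parameters are stored at the same bit width; KV cache,
    activations, and quantization metadata are not counted.
    """
    total_bytes = params_billions * 1e9 * bits_per_param / 8
    return total_bytes / 1e9


PARAMS_B = 0.8  # parameter count from the Specifications list, in billions

for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gb(PARAMS_B, bits):.1f} GB of weights")
```

Under these assumptions the 16-bit weights come to about 1.6 GB, which is in line with the "roughly 2GB" figure once overhead is added, while the 4-bit weights come to about 0.4 GB, leaving headroom under 2GB for the cache and runtime.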

Official page ↗ · Model card ↗

Specifications

License: Open source · Apache-2.0
Weights: Downloadable
Architecture: Dense
Parameters: 0.8B
Context window: 262K tokens
Max output: —
Knowledge cutoff: —
Price (in / out, $/M): —
Modalities: Text · Vision · Code

Benchmarks

No benchmark scores recorded yet.

Vendor-reported figures are claims until independently verified. See methodology.