Qwen3.5-0.8B

Available
Alibaba (Qwen) · Open source

The smallest Qwen3.5 model: a dense 0.8B-parameter model designed for the most constrained on-device deployments. It supports the family's hybrid thinking / non-thinking mode and runs in non-thinking (instruct) mode by default. Released under Apache-2.0 with native vision and a 262K-token context, it needs roughly 2GB of VRAM, staying under 2GB when quantized to 4-bit, and targets smartphones and embedded hardware. Notably for a sub-1B model, it still scores ~26% on MMMU-Pro multimodal reasoning.
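As a rough sanity check on the memory figures above, the sketch below estimates the footprint of the weights alone at a few precisions. It is a back-of-the-envelope calculation, not a measurement: the helper name `weight_memory_gb` is hypothetical, 16-bit is assumed as the unquantized precision, and the estimate ignores the KV cache, activations, and quantization metadata, all of which add to real VRAM use.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in decimal gigabytes."""
    bytes_total = num_params * bits_per_param / 8  # 8 bits per byte
    return bytes_total / 1e9


# Parameter count taken from the listing (0.8B); precisions are assumptions.
PARAMS = 0.8e9

for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gb(PARAMS, bits):.1f} GB of weights")
```

Under these assumptions the weights come to about 1.6 GB at 16-bit and about 0.4 GB at 4-bit, which is consistent with the listing's "roughly 2GB" and "under 2GB at 4-bit" once runtime overhead is added.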

Specifications

License: Open source · Apache-2.0
Weights: Downloadable
Architecture: Dense
Parameters: 0.8B
Context window: 262K tokens
Max output: Not listed
Knowledge cutoff: Not listed
Price (in / out, $/M): Not listed
Modalities: Text, Vision, Code

Benchmarks

No benchmark scores recorded yet.

Vendor-reported figures are claims until independently verified.