LLM Releases

Local LLM shortlist

A practical shortlist of downloadable releases that are either compact in total size or use low active-parameter MoE designs, intended for local evaluation or self-hosted deployment.

Memory figures are rough, weight-only estimates: about 0.55 GB per billion parameters at 4-bit quantization and 2.05 GB per billion at FP16, rounded up to the next whole GB. For MoE models the figures are based on total parameters, not active parameters. Runtime overhead, KV cache, batching, the serving stack, and context length can materially increase requirements.
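
The rule of thumb above can be sketched in a few lines of Python. This is only an illustration, not the site's own calculation: the function name `weight_only_gb` is hypothetical, and rounding up to the next whole GB is an assumption inferred from the figures in the tables. For MoE models, pass the total parameter count rather than the active count.

```python
from decimal import Decimal, ROUND_CEILING

# Per-billion-parameter factors quoted in the note above.
GB_PER_BILLION = {"int4": Decimal("0.55"), "fp16": Decimal("2.05")}


def weight_only_gb(total_params_billion: float, precision: str) -> int:
    """Rough weight-only footprint in whole GB.

    Rounding up is an assumption; Decimal avoids binary floating-point
    artifacts on exact products such as 40 * 0.55.
    """
    raw = Decimal(str(total_params_billion)) * GB_PER_BILLION[precision]
    return int(raw.to_integral_value(rounding=ROUND_CEILING))


for name, params in [("Mistral 7B", 7), ("Falcon 40B", 40), ("gpt-oss-120b", 117)]:
    print(name, weight_only_gb(params, "int4"), weight_only_gb(params, "fp16"))
# Mistral 7B 4 15
# Falcon 40B 22 82
# gpt-oss-120b 65 240
```

These numbers cover the weights only, so treat them as a lower bound when checking a model against available RAM or VRAM.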

Single-workstation candidates

| Model | Lab | Params | INT4 (GB, rough) | FP16 (GB, rough) | Context | License | Released |
|---|---|---|---|---|---|---|---|
| LFM2 1.2B | Liquid | 1.17B | 1 | 3 | 33K | LFM Open License v1.0 | Nov 28, 2025 |
| SmolLM2 1.7B | Hugging Face | 1.7B | 1 | 4 | | Apache-2.0 | Nov 4, 2024 |
| Stable LM 2 1.6B | Stability | 1.6B | 1 | 4 | | Stability AI Membership | Jan 19, 2024 |
| TinyLlama 1.1B Chat | TinyLlama | 1.1B | 1 | 3 | | Apache-2.0 | Jan 1, 2024 |
| Phi-1 | Microsoft | 1.3B | 1 | 3 | | MIT | Jun 21, 2023 |
| GPT-2 | OpenAI | 1.5B | 1 | 4 | 1K | MIT | Nov 5, 2019 |
| BERT | Google | 0.34B | 1 | 1 | 512 | Apache-2.0 | Oct 11, 2018 |
| SmolLM3 3B | Hugging Face | 3B | 2 | 7 | 128K | Apache-2.0 | Jul 8, 2025 |
| Sarvam-1 | Sarvam | 2B | 2 | 5 | | Sarvam AI License | Oct 22, 2024 |
| Phi-2 | Microsoft | 2.7B | 2 | 6 | | MIT | Dec 12, 2023 |
| Phi-3 Mini | Microsoft | 3.8B | 3 | 8 | 128K | MIT | Apr 23, 2024 |
| Qwen2.5-Omni-7B | Qwen | 7B | 4 | 15 | | Qwen License | Mar 26, 2025 |
| OLMoE 1B-7B | Ai2 | 7B · 1B active | 4 | 15 | | Apache-2.0 | Sep 3, 2024 |
| CodeGemma 7B | DeepMind | 7B | 4 | 15 | 8K | Gemma Terms of Use | Apr 9, 2024 |
| Gemma 7B | DeepMind | 7B | 4 | 15 | 8K | Gemma Terms of Use | Feb 21, 2024 |
| OLMo 7B | Ai2 | 7B | 4 | 15 | 4K | Apache-2.0 | Feb 1, 2024 |
| OpenChat 3.5 | OpenChat | 7B | 4 | 15 | | Apache-2.0 | Jan 6, 2024 |
| OpenHathi-7B | Sarvam | 7B | 4 | 15 | | Llama 2 Community License | Dec 12, 2023 |
| Mistral 7B | Mistral | 7B | 4 | 15 | 8K | Apache-2.0 | Sep 27, 2023 |
| Qwen-7B | Qwen | 7B | 4 | 15 | 32K | Tongyi Qianwen License | Aug 3, 2023 |
| ChatGLM2-6B | Z.ai | 6B | 4 | 13 | 32K | ChatGLM2 License | Jun 25, 2023 |
| MPT-7B | Databricks | 7B | 4 | 15 | 2K | Apache-2.0 | May 5, 2023 |
| ChatGLM-6B | Z.ai | 6B | 4 | 13 | 2K | ChatGLM License | Mar 14, 2023 |
| Granite 3.3 8B | IBM | 8B | 5 | 17 | 128K | Apache-2.0 | Apr 30, 2025 |
| Granite 3.2 8B | IBM | 8B | 5 | 17 | 128K | Apache-2.0 | Feb 26, 2025 |
| DeepHermes 3 Llama 3 8B | Nous | 8B | 5 | 17 | 8K | Llama 3 Community License | Feb 18, 2025 |
| Dolphin 3.0 Llama 3.1 8B | Cognitive | 8B | 5 | 17 | 128K | Llama 3.1 Community License | Feb 2, 2025 |
| Granite 3.1 8B | IBM | 8B | 5 | 17 | 128K | Apache-2.0 | Dec 18, 2024 |
| Command R7B | Cohere | 8B | 5 | 17 | 128K | CC-BY-NC-4.0 | Dec 13, 2024 |
| Granite 3.0 8B | IBM | 8B | 5 | 17 | 4K | Apache-2.0 | Oct 21, 2024 |
| Yi-Coder-9B | 01.AI | 9B | 5 | 19 | 128K | Yi License | Sep 5, 2024 |
| EXAONE 3.0 7.8B | LG AI | 7.8B | 5 | 16 | | EXAONE AI Model License | Aug 7, 2024 |
| MiniCPM-V 2.6 | OpenBMB | 8B | 5 | 17 | | MiniCPM Model License | Aug 2, 2024 |
| GLM-4-9B | Z.ai | 9B | 5 | 19 | 128K | GLM-4 License | Jun 5, 2024 |
| Kimi-Audio-7B-Instruct | Moonshot | 10B | 6 | 21 | | MIT | Apr 25, 2025 |
| Falcon 3 10B | TII | 10B | 6 | 21 | 32K | TII Falcon License 2.0 | Dec 17, 2024 |
| Pixtral 12B | Mistral | 12B | 7 | 25 | 128K | Apache-2.0 | Sep 17, 2024 |
| Mistral NeMo | Mistral | 12B | 7 | 25 | 128K | Apache-2.0 | Jul 18, 2024 |
| Falcon 2 11B | TII | 11B | 7 | 23 | 8K | TII Falcon License | May 13, 2024 |
| Phi-4 Reasoning | Microsoft | 14B | 8 | 29 | | MIT | Apr 30, 2025 |
| Phi-4 | Microsoft | 14B | 8 | 29 | 16K | MIT | Dec 12, 2024 |
| LLaVA 1.5 13B | LLaVA | 13B | 8 | 27 | | Llama 2 / research license | Sep 30, 2023 |
| Qwen-14B | Qwen | 14B | 8 | 29 | 8K | Tongyi Qianwen License | Sep 25, 2023 |
| Granite 13B | IBM | 13B | 8 | 27 | | Apache-2.0 | Sep 7, 2023 |
| Nous-Hermes-Llama2-13B | Nous | 13B | 8 | 27 | 4K | Llama 2 Community License | Jul 24, 2023 |
| Vicuna 13B | LMSYS | 13B | 8 | 27 | | LLaMA Research License | Mar 30, 2023 |
| Kimi-VL-A3B-Thinking-2506 | Moonshot | 16B · 3B active | 9 | 33 | 128K | MIT | Jun 21, 2025 |
| Kimi-VL-A3B-Instruct | Moonshot | 16B · 3B active | 9 | 33 | 128K | MIT | Apr 17, 2025 |
| Moonlight-16B-A3B-Instruct | Moonshot | 16B · 3B active | 9 | 33 | 8K | MIT | Feb 24, 2025 |
| StarCoder2 15B | BigCode | 16B | 9 | 33 | 16K | BigCode OpenRAIL-M v1 | Feb 28, 2024 |
| DeepSeekMoE 16B | DeepSeek | 16B · 2.8B active | 9 | 33 | 4K | DeepSeek License | Jan 11, 2024 |
| gpt-oss-20b | OpenAI | 21B · 3.6B active | 12 | 44 | 128K | Apache-2.0 | Aug 5, 2025 |
| Codestral 22B | Mistral | 22B | 13 | 46 | 32K | Mistral AI Non-Production License | May 29, 2024 |
| Magistral Small | Mistral | 24B | 14 | 50 | 40K | Apache-2.0 | Jun 10, 2025 |
| Mistral Small 3.1 | Mistral | 24B | 14 | 50 | 128K | Apache-2.0 | Mar 17, 2025 |
| Mistral Small 3 | Mistral | 24B | 14 | 50 | 32K | Apache-2.0 | Jan 30, 2025 |
| Qwen3.6-27B | Qwen | 27B | 15 | 56 | 256K | Apache-2.0 | May 12, 2026 |
| Gemma 3 27B | DeepMind | 27B | 15 | 56 | 128K | Gemma Terms of Use | Sep 4, 2025 |
| Gemma 2 27B | DeepMind | 27B | 15 | 56 | 8K | Gemma Terms of Use | Jun 27, 2024 |
| Nemotron 3 Nano 30B-A3B | NVIDIA | 30B · 3B active | 17 | 62 | 1M | Nemotron Open Model License | Dec 15, 2025 |
| Gemma 4 31B | DeepMind | 31B | 18 | 64 | | Apache-2.0 | Apr 2, 2026 |
| OLMo 3 Think 32B | Ai2 | 32B | 18 | 66 | | Apache-2.0 | Dec 15, 2025 |
| EXAONE 4.0 32B | LG AI | 32B | 18 | 66 | | EXAONE AI Model License | Jul 15, 2025 |
| OLMo 2 32B | Ai2 | 32B | 18 | 66 | 4K | Apache-2.0 | Mar 13, 2025 |
| EXAONE 3.5 32B | LG AI | 32B | 18 | 66 | 32K | EXAONE AI Model License | Dec 9, 2024 |
| QwQ-32B-Preview | Qwen | 32B | 18 | 66 | 32K | Apache-2.0 | Nov 28, 2024 |
| Qwen2.5-Coder-32B | Qwen | 32B | 18 | 66 | 128K | Apache-2.0 | Nov 12, 2024 |
| Falcon-H1 34B | TII | 34B | 19 | 70 | 256K | Falcon LLM License 2.0 | Jul 31, 2025 |
| Yi-1.5-34B | 01.AI | 34B | 19 | 70 | 4K | Yi License | May 13, 2024 |
| Granite Code 34B | IBM | 34B | 19 | 70 | 8K | Apache-2.0 | May 6, 2024 |
| Yi-34B | 01.AI | 34B | 19 | 70 | 200K | Yi License | Nov 6, 2023 |
| DeepSeek Coder 33B | DeepSeek | 33B | 19 | 68 | 16K | DeepSeek License | Nov 2, 2023 |
| Code Llama 34B | Meta | 34B | 19 | 70 | 16K | Llama 2 Community License | Aug 24, 2023 |
| Aya 23 35B | Cohere | 35B | 20 | 72 | | CC-BY-NC-4.0 | May 23, 2024 |

Larger local or small-cluster candidates

| Model | Lab | Params | INT4 (GB, rough) | FP16 (GB, rough) | Context | License | Released |
|---|---|---|---|---|---|---|---|
| Seed-OSS-36B-Instruct | Seed | 36B | 20 | 74 | 512K | Apache-2.0 | Aug 20, 2025 |
| Falcon 40B | TII | 40B | 22 | 82 | 2K | Falcon License | May 25, 2023 |
| Phi-3.5 MoE | Microsoft | 42B · 6.6B active | 24 | 87 | 128K | MIT | Aug 20, 2024 |
| Nous Hermes 2 Mixtral | Nous | 47B · 13B active | 26 | 97 | 32K | Apache-2.0 | Jan 11, 2024 |
| Mixtral 8x7B | Mistral | 47B · 13B active | 26 | 97 | 32K | Apache-2.0 | Dec 11, 2023 |
| Kimi-Linear-48B-A3B-Instruct | Moonshot | 48B · 3B active | 27 | 99 | 1M | MIT | Oct 31, 2025 |
| Llama-3.3-Nemotron-Super-49B | NVIDIA | 49B | 27 | 101 | 128K | NVIDIA Open Model License | Apr 2, 2025 |
| Jamba | AI21 | 52B · 12B active | 29 | 107 | 256K | Jamba Open Model License | Mar 28, 2024 |
| LLaMA | Meta | 65B | 36 | 134 | 2K | LLaMA Research License | Feb 24, 2023 |
| DeepSeek LLM 67B | DeepSeek | 67B | 37 | 138 | 4K | DeepSeek License | Nov 29, 2023 |
| Llama 3.3 70B | Meta | 70B | 39 | 144 | 128K | Llama 3.3 Community License | Dec 6, 2024 |
| Llama-3.1-Nemotron-70B | NVIDIA | 70B | 39 | 144 | 128K | Llama 3.1 Community License | Oct 15, 2024 |
| Llama 3 70B | Meta | 70B | 39 | 144 | 8K | Llama 3 Community License | Apr 18, 2024 |
| Llama 2 70B | Meta | 70B | 39 | 144 | 4K | Llama 2 Community License | Jul 18, 2023 |
| Qwen2.5-VL-72B | Qwen | 72B | 40 | 148 | 128K | Qwen License | Jan 26, 2025 |
| Molmo 72B | Ai2 | 72B | 40 | 148 | | Apache-2.0 / OLMo License | Sep 25, 2024 |
| Qwen2.5-72B | Qwen | 72B | 40 | 148 | 128K | Qwen License | Sep 19, 2024 |
| Qwen2-72B | Qwen | 72B | 40 | 148 | 128K | Qwen License | Jun 7, 2024 |
| Qwen-72B | Qwen | 72B | 40 | 148 | 32K | Tongyi Qianwen License | Nov 30, 2023 |
| Kimi-Dev-72B | Moonshot | 73B | 41 | 150 | | MIT | Jun 17, 2025 |
| Hunyuan-A13B-Instruct | Hunyuan | 80B · 13B active | 44 | 164 | | Tencent Hunyuan A13B License | Apr 22, 2026 |
| Qwen3-Coder-Next | Qwen | 80B · 3B active | 44 | 164 | 262K | Apache-2.0 | Feb 3, 2026 |
| Sarvam-105B | Sarvam | 105B · 10.3B active | 58 | 216 | 128K | Apache-2.0 | Mar 6, 2026 |
| GLM-4.5V | Z.ai | 106B · 12B active | 59 | 218 | | MIT | Aug 11, 2025 |
| GLM-4.5-Air | Z.ai | 106B · 12B active | 59 | 218 | 128K | MIT | Jul 28, 2025 |
| gpt-oss-120b | OpenAI | 117B · 5.1B active | 65 | 240 | 128K | Apache-2.0 | Aug 5, 2025 |
| Nemotron 3 Super 120B-A12B | NVIDIA | 120B · 12B active | 66 | 246 | 1M | Nemotron Open Model License | Mar 16, 2026 |