Production models for agents, coding, multimodal, and real-world workloads

Access high-performance open-source frontier models through Zygma’s inference layer, with routing, deployment, and optimization.

Models available on Zygma

Ministral 3 3B 2512

Text | 3B | 128k ctx

Qwen3.5 Plus 2026-02-15

Text | 131k ctx

MiniMax M2.5

Text | 1024k ctx

GLM 5

Text | 128k ctx

Kimi K2.5

Text | 131k ctx

Qwen3 Max Thinking

Reasoning | 131k ctx

LFM2-8B-A1B

Text | 8B | 128k ctx

Mistral Small

Text | 6B | 262k ctx

Nemotron 3 Super 120B

Text | 120B | 262k ctx

Grok 4.1 Fast

Text | 131k ctx

DeepSeek V3.2

Text | 685B | 131k ctx

Mixtral 8x7B Instruct

Text | 46.7B | 33k ctx

Ministral 3 8B 2512

Text | 8B | 128k ctx

Llama 3.2 11B Vision Instruct

Multimodal | 11B | 131k ctx

Pixtral Large 2411

Multimodal | 124B | 128k ctx

GPT OSS 20B

Text | 20B | 128k ctx

Gemma 3 27B

Text | 27B | 128k ctx

Qwen2.5 VL 32B Instruct

Multimodal | 32B | 8k ctx

Llama 3.3 70B Instruct

Text | 70B | 131k ctx

Qwen3 VL 32B Instruct

Multimodal | 32B | 8k ctx

Built for production AI teams

Agents

Support planning, tool use, retrieval, and execution across multi-step workflows.

Coding

Power developer assistants, code generation systems, and engineering copilots.

Enterprise AI

Deploy internal assistants, document workflows, and structured-output systems.

Multimodal 

Build workflows that combine text, vision, and structured extraction.