Production models for agents, coding, multimodal, and real workloads

Access high-performance open-source frontier models through Zygma’s inference layer, with routing, deployment, and optimization.

Models available on Zygma

Ministral 3 3B 2512

Text | 3B | 128k ctx

Qwen3.5 Plus 2026-02-15

Text | 131k ctx

MiniMax M2.5

Text | 1024k ctx

GLM 5

Text | 128k ctx

Kimi K2.5

Text | 131k ctx

Qwen3 Max Thinking

Reasoning | 131k ctx

LFM2-8B-A1B

Text | 8B | 128k ctx

Mistral Small

Text | 6B | 262k ctx

Nemotron 3 Super 120B

Text | 120B | 262k ctx

Grok 4.1 Fast

Text | 131k ctx

DeepSeek V3.2

Text | 685B | 131k ctx

Mixtral 8x7B Instruct

Text | 46.7B | 33k ctx

Ministral 3 8B 2512

Text | 8B | 128k ctx

Llama 3.2 11B Vision Instruct

Multimodal | 11B | 131k ctx

Pixtral Large 2411

Multimodal | 124B | 128k ctx

GPT OSS 20B

Text | 20B | 128k ctx

Gemma 3 27B

Text | 27B | 128k ctx

Qwen2.5 VL 32B Instruct

Multimodal | 32B | 8k ctx

Llama 3.3 70B Instruct

Text | 70B | 131k ctx

Qwen3 VL 32B Instruct

Multimodal | 32B | 8k ctx
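
Each entry above pairs a model name with a spec line of the form "modality | parameter count | context window", where the parameter count is sometimes omitted. As a rough illustration of how such a listing could be filtered when shortlisting models, here is a minimal Python sketch. The `ModelSpec` class and `parse_spec` function are hypothetical helpers written for this example, not part of any Zygma SDK or API, and only a few of the listed entries are included.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelSpec:
    # Hypothetical container for one catalog entry (not a Zygma type).
    name: str
    modality: str               # e.g. "Text", "Multimodal", "Reasoning"
    params_b: Optional[float]   # parameters in billions; None if not listed
    context_k: int              # context window in thousands of tokens


def parse_spec(name: str, spec_line: str) -> ModelSpec:
    """Parse a spec line such as 'Text | 3B | 128k ctx' or 'Text | 131k ctx'."""
    fields = [field.strip() for field in spec_line.split("|")]
    modality, context = fields[0], fields[-1]
    # The middle field (parameter count) is optional in the listing.
    params_b = float(fields[1].rstrip("Bb")) if len(fields) == 3 else None
    # '128k ctx' -> '128k' -> '128' -> 128
    context_k = int(context.removesuffix("ctx").strip().rstrip("k"))
    return ModelSpec(name, modality, params_b, context_k)


# A few entries copied from the listing above.
catalog = [
    parse_spec("Kimi K2.5", "Text | 131k ctx"),
    parse_spec("Mixtral 8x7B Instruct", "Text | 46.7B | 33k ctx"),
    parse_spec("Llama 3.2 11B Vision Instruct", "Multimodal | 11B | 131k ctx"),
    parse_spec("Pixtral Large 2411", "Multimodal | 124B | 128k ctx"),
    parse_spec("Qwen2.5 VL 32B Instruct", "Multimodal | 32B | 8k ctx"),
]

# Shortlist multimodal models with at least a 100k-token context window.
long_context_multimodal = [
    spec.name
    for spec in catalog
    if spec.modality == "Multimodal" and spec.context_k >= 100
]
print(long_context_multimodal)
```

The same filter could be applied to any other column, for example keeping only entries whose `params_b` is listed and below a size budget.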

Built for production AI teams

Agents

Support planning, tool use, retrieval, and execution across multi-step workflows.

Coding

Power developer assistants, code generation systems, and engineering copilots.

Enterprise AI

Deploy internal assistants, document workflows, and structured-output systems.

Multimodal

Build workflows that combine text, vision, and structured extraction.