Commit 1baf1e1

[Workers AI] Add NVIDIA Nemotron 3 Super model, changelog, release notes, and pricing (#28922)
1 parent 094ebb7 commit 1baf1e1

4 files changed

Lines changed: 504 additions & 0 deletions

Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
+---
+title: "NVIDIA Nemotron 3 Super now available on Workers AI"
+description: A hybrid MoE model with 120B total parameters and 12B active, optimized for multi-agent and agentic AI workloads.
+products:
+  - workers-ai
+date: 2026-03-11
+---
+
+We're excited to partner with NVIDIA to bring [`@cf/nvidia/nemotron-3-120b-a12b`](/workers-ai/models/nemotron-3-120b-a12b/) to Workers AI. NVIDIA Nemotron 3 Super is a Mixture-of-Experts (MoE) model with a hybrid Mamba-transformer architecture, 120B total parameters, and 12B active parameters per forward pass.
+
+The model is optimized for running many collaborating agents per application. It delivers high accuracy for reasoning, tool calling, and instruction following across complex multi-step tasks.
+
+**Key capabilities:**
+
+- **Hybrid Mamba-transformer architecture** delivers over 50% higher token generation throughput compared to leading open models, reducing latency for real-world applications
+- **Tool calling** support for building AI agents that invoke tools across multiple conversation turns
+- **Multi-Token Prediction (MTP)** accelerates long-form text generation by predicting several future tokens simultaneously in a single forward pass
+- **32,000 token context window** for retaining conversation history and plan states across multi-step agent workflows
+
+Use Nemotron 3 Super through the [Workers AI binding](/workers-ai/configuration/bindings/) (`env.AI.run()`), the REST API, or the [OpenAI-compatible endpoint](/workers-ai/configuration/open-ai-compatibility/).
+
+For more information, refer to the [Nemotron 3 Super model page](/workers-ai/models/nemotron-3-120b-a12b/).
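
The changelog entry names `env.AI.run()` as one way to call the model. A minimal sketch of that call is shown here for reviewers; it is not part of this commit. The model ID and the `AI` binding name come from the entry itself. The chat-style `messages` input shape, the `AiBinding` and `Env` types, and the `buildChatInputs` and `askNemotron` helpers are assumptions made for illustration, so check the model page for the actual input schema.

```typescript
// Minimal structural types so the sketch type-checks outside a Worker.
// In a real Worker, the runtime supplies the AI binding configured for the project.
interface AiBinding {
  run(model: string, inputs: Record<string, unknown>): Promise<unknown>;
}

interface Env {
  AI: AiBinding;
}

// Model ID taken from the changelog entry.
const NEMOTRON_MODEL = "@cf/nvidia/nemotron-3-120b-a12b";

// Assumption: the model accepts a chat-style `messages` array.
function buildChatInputs(prompt: string): Record<string, unknown> {
  return {
    messages: [
      { role: "system", content: "You are a concise assistant." },
      { role: "user", content: prompt },
    ],
  };
}

// Hypothetical helper: forwards one prompt to the model through the binding
// and returns whatever the binding resolves with.
async function askNemotron(env: Env, prompt: string): Promise<unknown> {
  return env.AI.run(NEMOTRON_MODEL, buildChatInputs(prompt));
}
```

Inside a Worker's `fetch` handler, this would be called as `await askNemotron(env, "…")`, with the result serialized into the response.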

src/content/docs/workers-ai/platform/pricing.mdx

Lines changed: 1 addition & 0 deletions
@@ -60,6 +60,7 @@ The Price in Tokens column is equivalent to the Price in Neurons column - the di
 | @cf/aisingapore/gemma-sea-lion-v4-27b-it | $0.351 per M input tokens <br/> $0.555 per M output tokens | 31876 neurons per M input tokens <br/> 50488 neurons per M output tokens |
 | @cf/ibm-granite/granite-4.0-h-micro | $0.017 per M input tokens <br/> $0.112 per M output tokens | 1542 neurons per M input tokens <br/> 10158 neurons per M output tokens |
 | @cf/zai-org/glm-4.7-flash | $0.060 per M input tokens <br/> $0.400 per M output tokens | 5500 neurons per M input tokens <br/> 36400 neurons per M output tokens |
+| @cf/nvidia/nemotron-3-120b-a12b | $0.500 per M input tokens <br/> $1.500 per M output tokens | 45455 neurons per M input tokens <br/> 136364 neurons per M output tokens |

 ## Embeddings model pricing
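
The row added for `@cf/nvidia/nemotron-3-120b-a12b` can be turned into a rough per-request cost estimate. The sketch below is illustrative and not part of this commit. The four rates are copied from the table row; the `estimateNemotronUsage` helper and its field names are hypothetical, and it does not model any free allocation or rounding the platform may apply.

```typescript
// Rates copied from the pricing table row for @cf/nvidia/nemotron-3-120b-a12b.
const NEMOTRON_RATES = {
  usdPerMillionInput: 0.5,
  usdPerMillionOutput: 1.5,
  neuronsPerMillionInput: 45455,
  neuronsPerMillionOutput: 136364,
};

interface UsageEstimate {
  usd: number;
  neurons: number;
}

// Hypothetical helper: scales the per-million-token rates to the given token counts.
function estimateNemotronUsage(inputTokens: number, outputTokens: number): UsageEstimate {
  const usd =
    (inputTokens * NEMOTRON_RATES.usdPerMillionInput +
      outputTokens * NEMOTRON_RATES.usdPerMillionOutput) / 1e6;
  const neurons =
    (inputTokens * NEMOTRON_RATES.neuronsPerMillionInput +
      outputTokens * NEMOTRON_RATES.neuronsPerMillionOutput) / 1e6;
  return { usd, neurons };
}
```

Under these rates, one million input tokens plus one million output tokens comes to $2.00, or 181,819 neurons.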

src/content/release-notes/workers-ai.yaml

Lines changed: 4 additions & 0 deletions
@@ -3,6 +3,10 @@ link: "/workers-ai/changelog/"
 productName: Workers AI
 productLink: "/workers-ai/"
 entries:
+  - publish_date: "2026-03-11"
+    title: NVIDIA Nemotron 3 Super now available on Workers AI
+    description: |-
+      - [`@cf/nvidia/nemotron-3-120b-a12b`](/workers-ai/models/nemotron-3-120b-a12b/) now available on Workers AI! A hybrid MoE model with 120B total parameters and 12B active, optimized for multi-agent and agentic AI workloads. Read [changelog](/changelog/post/2026-03-11-nemotron-3-super-workers-ai/) to get started.
   - publish_date: "2026-03-06"
     title: Deepgram Nova-3 now supports 10 languages with regional variants
     description: |-
