Mistral AI
Integrations
- Azure AI Studio
- AWS Bedrock
- Google Vertex AI
- Hugging Face
- LangChain
- LlamaIndex
Pricing Details
- API pricing is metered per token (input and output) and varies by model tier; a rough cost estimator is sketched below.
- Licensing varies between Apache 2.0 and Mistral Research License (MRL) depending on model scale.
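Because pricing is metered per input and output token, a back-of-the-envelope estimator can help with capacity planning. The rates below are placeholders, not Mistral's published prices; check the current price list before relying on the numbers.

```python
# Placeholder per-million-token rates (input $, output $); NOT Mistral's real prices.
RATES = {"mistral-large-latest": (2.00, 6.00)}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Rough spend in dollars for a given token volume."""
    in_rate, out_rate = RATES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# e.g. 50M input tokens and 10M output tokens in a month
print(f"${estimate_cost('mistral-large-latest', 50_000_000, 10_000_000):,.2f}")  # $160.00
```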
Features
- Sparse Mixture-of-Experts (MoE) Architecture
- 256K Context Window (Codestral series)
- Native Function Calling & Tool Use
- Bifurcated Licensing (Apache 2.0 / MRL)
- VPC and On-Premise Deployment Options
- Agentic Orchestration Support
Description
Mistral AI Architectural Assessment
Mistral AI’s 2026 infrastructure is anchored by a modular approach to Large Language Models (LLMs), primarily leveraging Sparse Mixture-of-Experts (MoE) layers that activate only a subset of parameters for each token at runtime. This architecture lets a model maintain a high total parameter count while significantly reducing the FLOPs required per token during inference 📑. The current model lineup, including the Mistral Large series and Codestral 2, focuses on agentic-ready cores with native support for function calling and expanded context windows 🧠.
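To make the sparse-activation principle concrete, here is a minimal top-k routing sketch in PyTorch. It is illustrative only: the dimensions, expert count, and k are hypothetical, and Mistral's actual router and load-balancing logic remain proprietary 🌑.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoELayer(nn.Module):
    """Sketch of sparse MoE routing: each token is processed by k of n experts."""
    def __init__(self, dim=512, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(dim, n_experts)  # gating network scores experts per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(n_experts)
        )

    def forward(self, x):                                  # x: (tokens, dim)
        weights, chosen = self.router(x).topk(self.k, -1)  # pick k experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):                       # only the chosen experts run,
            for e in chosen[:, slot].unique().tolist():  # so FLOPs scale with k, not n
                mask = chosen[:, slot] == e
                out[mask] += weights[mask, slot].unsqueeze(-1) * self.experts[e](x[mask])
        return out

print(TopKMoELayer()(torch.randn(16, 512)).shape)  # torch.Size([16, 512])
```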
Core Model Architecture and Reasoning
The primary architectural pattern relies on dynamic routing of input tokens to specialized sub-networks (experts), allowing for increased model capacity without a linear increase in computational cost.
- Sparse Mixture-of-Experts (MoE): Implementation in Mistral Large and Mixtral series utilizes a router mechanism to select a subset of parameters for each token 📑. Internal routing algorithms for expert balancing remain proprietary 🌑.
- Context Management: Support for up to 256K context windows in Codestral 2 models facilitates long-form code analysis and large-scale document ingestion 📑.
- Agentic Capabilities: Optimization for tool use and function calling is embedded at the pre-training level to support autonomous sub-process assembly; see the sketch after this list 📑.
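As an illustration of native function calling, the following sketch uses the mistralai Python SDK (v1-style interface). The tool name, schema contents, and prompt are hypothetical; the tool declaration follows Mistral's documented JSON-schema convention, but verify details against the current API reference.

```python
import json
from mistralai import Mistral  # assumes the v1 mistralai SDK

client = Mistral(api_key="YOUR_API_KEY")  # placeholder key

# Hypothetical tool declared in Mistral's documented JSON-schema format.
tools = [{
    "type": "function",
    "function": {
        "name": "get_build_status",
        "description": "Return CI status for a given branch.",
        "parameters": {
            "type": "object",
            "properties": {"branch": {"type": "string"}},
            "required": ["branch"],
        },
    },
}]

resp = client.chat.complete(
    model="mistral-large-latest",
    messages=[{"role": "user", "content": "Is the main branch green?"}],
    tools=tools,
    tool_choice="auto",  # let the model decide whether to call the tool
)

msg = resp.choices[0].message
if msg.tool_calls:  # the model emitted a structured call instead of prose
    call = msg.tool_calls[0]
    print(call.function.name, json.loads(call.function.arguments))
```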
Infrastructure and Deployment Models
Mistral AI provides a bifurcated deployment strategy: managed API services and self-hosted distributions.
- Managed Persistence Layer: La Plateforme utilizes a proprietary storage and compute infrastructure for API-based model serving 🌑.
- Licensing and Distribution: Models are distributed under Apache 2.0 (for specific smaller weights) or the Mistral Research License (for flagship/specialized models), allowing local execution under specific usage constraints; a local-loading sketch follows this list 📑.
- Cloud Mediation: Deployment options include VPC-based isolation on major cloud providers to enable data residency compliance 📑.
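For the self-hosted path, a minimal sketch of loading an Apache 2.0 open-weight checkpoint with Hugging Face transformers. The model ID and generation settings are illustrative; MRL-licensed flagship weights would require a commercial agreement for equivalent production use.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Apache 2.0 open-weight checkpoint; substitute any model ID your license permits.
model_id = "mistralai/Mistral-7B-Instruct-v0.3"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision to fit a single modern GPU
    device_map="auto",           # spread layers across available devices
)

# Instruct checkpoints expect the chat template, not raw strings.
messages = [{"role": "user", "content": "Summarize MoE routing in one sentence."}]
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)
output = model.generate(inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```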
Evaluation Guidance
Technical teams should prioritize the following validation steps:
- MoE Concurrency Latency: Measure per-token latency under high-concurrency loads to confirm the router mechanism remains stable as batch composition shifts 🧠.
- Safety Mediation Documentation: Request detailed whitepapers for internal safety mediation and layered access controls, as these are not open-source 🌑.
- Long-Context RAG Efficacy: Validate 256K context window recall performance (e.g., Needle in a Haystack) in production RAG environments before full-scale deployment; a probe sketch follows this list 📑.
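A minimal needle-in-a-haystack probe for the long-context check above. The model name, filler text, token estimate, and needle are all assumptions, scoring is a bare substring match, and large sweeps will incur real API costs.

```python
from mistralai import Mistral  # assumes the v1 mistralai SDK

client = Mistral(api_key="YOUR_API_KEY")  # placeholder key
NEEDLE = "The deploy password is zebra-42."  # hypothetical fact to recover

def needle_recall(approx_tokens: int, depth: float) -> bool:
    """Bury the needle at a relative depth in filler text and test recall."""
    filler = "Lorem ipsum dolor sit amet. " * (approx_tokens // 7)  # ~7 tokens/repeat
    cut = int(len(filler) * depth)
    haystack = filler[:cut] + NEEDLE + " " + filler[cut:]
    resp = client.chat.complete(
        model="mistral-large-latest",
        messages=[{"role": "user",
                   "content": haystack + "\n\nWhat is the deploy password?"}],
    )
    return "zebra-42" in resp.choices[0].message.content

# Sweep context sizes and needle depths before trusting 256K recall claims.
for size in (8_000, 64_000, 128_000):
    for depth in (0.1, 0.5, 0.9):
        print(size, depth, needle_recall(size, depth))
```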
Release History
Release of Devstral 2, a next-generation coding model family with state-of-the-art agentic coding capabilities. Devstral 2 (123B) and Devstral Small 2 (24B) support a 256K context window and are optimized for code agents.
Release of Mistral 3 family: Ministral 3 (3B, 8B, 14B dense models) and Mistral Large 3 (sparse MoE, 41B active/675B total parameters). All models are open-weight, Apache 2.0 license, with multimodal and multilingual capabilities. Mistral Large 3 is the most capable model to date, optimized for enterprise and edge deployment.
API update: Introduced support for fine-tuning Mistral 7B and Mixtral 8x22B models. Added streaming response option.
Mistral Large updated with enhanced multilingual capabilities and improved code generation for Python and JavaScript.
Release of Mixtral 8x22B, a larger and more capable Mixture-of-Experts model with 141 billion total parameters (39 billion active). Significant performance gains across various benchmarks. Retired on 2025-03-30, replaced by Mistral Small 3.2.
Mistral 7B updated with improved instruction following and reduced hallucination rates.
API update: Added support for function calling and improved rate limits.
Commercial release of Mistral Large, Mistral AI's flagship model. Superior performance in complex reasoning and coding tasks.
Release of Mixtral 8x7B, a Sparse Mixture-of-Experts model with 47 billion parameters. Improved performance over Mistral 7B.
Launched API access to Mistral 7B. Initial pricing tiers available.
Initial release of Mistral 7B, a 7 billion parameter language model. Open-weight, Apache 2.0 license.
Tool Pros and Cons
Pros
- High performance relative to model size
- Open-weight options
- Strong text and code generation
- Fast, efficient inference
- Good multilingual support
Cons
- Flagship (MRL-licensed) models require an API plan or commercial license for production use
- Potential for biased outputs, as with any LLM
- Managed features depend on API availability and rate limits