
Mistral AI

4.1 (11 votes)

Tags

LLM, MoE, Open-Weight, Enterprise AI, Code Generation

Integrations

  • Azure AI Studio
  • AWS Bedrock
  • Google Vertex AI
  • Hugging Face
  • LangChain
  • LlamaIndex

Pricing Details

  • API pricing is metered by token consumption (input and output) and varies across model tiers; a worked cost estimate follows this list.
  • Licensing varies between Apache 2.0 and Mistral Research License (MRL) depending on model scale.
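Because billing is metered per million input and output tokens, a quick sanity check of workload cost is straightforward to script. The sketch below is a minimal estimator assuming the per-million rates listed further down this page; the model identifiers are illustrative, not official API names, so re-check the live price list before budgeting.

```python
# Rough cost estimator for token-metered API pricing.
# Rates are illustrative copies of figures from the pricing table on this
# page (USD per 1M tokens); model names here are assumptions, not official IDs.
RATES = {
    "mistral-large-3": (0.50, 1.50),   # (input, output) $ per 1M tokens
    "mistral-medium-3": (0.40, 2.00),
    "ministral-3-8b": (0.15, 0.15),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of a single call for the given token counts."""
    in_rate, out_rate = RATES[model]
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

# Example: a 20k-token prompt with a 2k-token completion on the flagship tier.
print(f"${estimate_cost('mistral-large-3', 20_000, 2_000):.4f}")  # -> $0.0130
```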

Features

  • Sparse Mixture-of-Experts (MoE) Architecture
  • 256K Context Window (Codestral series)
  • Native Function Calling & Tool Use
  • Bifurcated Licensing (Apache 2.0 / MRL)
  • VPC and On-Premise Deployment Options
  • Agentic Orchestration Support

Description

Mistral AI Architectural Assessment

Mistral AI’s 2026 infrastructure is anchored by a modular approach to Large Language Models (LLMs), primarily leveraging Sparse Mixture-of-Experts (MoE) to optimize parameter activation during runtime. This architecture enables the system to maintain a high total parameter count while significantly reducing the FLOPs required per token during inference 📑. The current model lineup, including the Mistral Large series and Codestral 2, focuses on agentic-ready cores with native support for function calling and expanded context windows 🧠.
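The lineup's emphasis on native function calling can be exercised directly through the chat completions endpoint. The sketch below is a minimal example assuming an OpenAI-style tool schema; the model alias, the `get_order_status` tool, and the `MISTRAL_API_KEY` environment variable are illustrative, and the request fields should be checked against the current API reference.

```python
# Minimal function-calling sketch against the Mistral chat completions API.
# Assumptions: MISTRAL_API_KEY is set; the model alias and the tool itself are
# illustrative; verify the request schema against the current API documentation.
import json
import os
import requests

tools = [{
    "type": "function",
    "function": {
        "name": "get_order_status",            # hypothetical tool for illustration
        "description": "Look up the status of an order by its ID.",
        "parameters": {
            "type": "object",
            "properties": {"order_id": {"type": "string"}},
            "required": ["order_id"],
        },
    },
}]

payload = {
    "model": "mistral-large-latest",            # illustrative model alias
    "messages": [{"role": "user", "content": "Where is order A1234?"}],
    "tools": tools,
    "tool_choice": "auto",
}

resp = requests.post(
    "https://api.mistral.ai/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
    json=payload,
    timeout=60,
)
resp.raise_for_status()
message = resp.json()["choices"][0]["message"]

# If the model chose to call the tool, its arguments arrive as a JSON string.
for call in message.get("tool_calls") or []:
    print(call["function"]["name"], json.loads(call["function"]["arguments"]))
```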

Core Model Architecture and Reasoning

The primary architectural pattern relies on dynamic routing of input tokens to specialized sub-networks (experts), increasing model capacity without a proportional increase in per-token compute; a toy routing sketch follows the list below.

  • Sparse Mixture-of-Experts (MoE): Implementation in Mistral Large and Mixtral series utilizes a router mechanism to select a subset of parameters for each token 📑. Internal routing algorithms for expert balancing remain proprietary 🌑.
  • Context Management: Support for up to 256K context windows in Codestral 2 models facilitates long-form code analysis and large-scale document ingestion 📑.
  • Agentic Capabilities: Optimization for tool use and function calling is embedded at the pre-training level to support autonomous sub-process assembly 📑.
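To make the routing idea concrete, here is a toy top-k gating sketch over a handful of dense "experts" in NumPy. It illustrates sparse MoE dispatch in general, not Mistral's implementation; the actual router, expert counts, and load-balancing scheme are proprietary, as noted above.

```python
# Toy top-k Mixture-of-Experts routing in NumPy: each token activates only
# k experts, so per-token compute scales with k rather than the expert count.
# Purely illustrative; not Mistral's proprietary router.
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, k = 64, 8, 2

router_w = rng.standard_normal((d_model, n_experts))            # gating weights
experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]

def moe_forward(x: np.ndarray) -> np.ndarray:
    """Route each token (row of x) to its top-k experts and mix their outputs."""
    logits = x @ router_w                                        # (tokens, experts)
    top = np.argsort(logits, axis=-1)[:, -k:]                    # indices of the k winners
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        chosen = logits[t, top[t]]
        gates = np.exp(chosen - chosen.max())
        gates /= gates.sum()                                     # softmax over the k winners
        for gate, e in zip(gates, top[t]):
            out[t] += gate * (x[t] @ experts[e])                 # only k experts run per token
    return out

tokens = rng.standard_normal((4, d_model))
print(moe_forward(tokens).shape)  # (4, 64)
```

The key property is that each token multiplies against only k of the expert matrices, which is why total parameter count can grow without per-token FLOPs growing in proportion.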


Infrastructure and Deployment Models

Mistral AI provides a bifurcated deployment strategy: managed API services and self-hosted distributions; a minimal self-hosting sketch follows the list below.

  • Managed Persistence Layer: La Plateforme utilizes a proprietary storage and compute infrastructure for API-based model serving 🌑.
  • Licensing and Distribution: Models are distributed under Apache 2.0 (for specific smaller weights) or the Mistral Research License (for flagship/specialized models), allowing for local execution under specific usage constraints 📑.
  • Cloud Mediation: Deployment options include VPC-based isolation on major cloud providers to enable data residency compliance 📑.
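For the self-hosted path, the sketch below loads one of the published Apache 2.0 checkpoints from Hugging Face with the transformers library. The repository ID is a real open-weight release, but hardware sizing, quantization, and the serving stack are deployment decisions this sketch does not cover; treat it as a starting point rather than a recipe.

```python
# Minimal local-inference sketch for an Apache 2.0 Mistral checkpoint via
# Hugging Face transformers. Assumes a GPU with enough memory for the chosen
# weights (or a quantized variant); swap the repo ID for the model you use.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "mistralai/Mistral-7B-Instruct-v0.3"   # published open-weight checkpoint

tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Summarize the MoE idea in one sentence."}]
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)

with torch.no_grad():
    output = model.generate(input_ids, max_new_tokens=64)

# Strip the prompt tokens and decode only the newly generated completion.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```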

Evaluation Guidance

Technical teams should prioritize the following validation steps:

  • MoE Concurrency Latency: Verify token-to-latency ratios under high-concurrency loads to ensure router mechanism stability 🧠.
  • Safety Mediation Documentation: Request detailed whitepapers for internal safety mediation and layered access controls, as these are not open-source 🌑.
  • Long-Context RAG Efficacy: Validate the 256K context window recall performance (e.g., Needle In A Haystack) in production RAG environments before full-scale deployment 📑 (a minimal probe sketch follows this list).
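A minimal needle-in-a-haystack probe along the lines of the last point is sketched below: it plants a known fact deep inside synthetic filler, sends the whole context to the chat completions endpoint, and checks whether the answer recovers it. The model alias and the planted "needle" are illustrative; a real evaluation should sweep needle depth and context length up to the advertised window.

```python
# Minimal needle-in-a-haystack probe against the Mistral chat completions API.
# Illustrative only: a real evaluation sweeps needle depth and context length,
# and the full prompt must fit the target model's context window.
import os
import requests

needle = "The staging-cluster passphrase is AZURE-PEACH-42."   # planted fact (hypothetical)
filler = "The quarterly report covers logistics, staffing, and budget variance. " * 3000
haystack = filler[: len(filler) // 2] + needle + " " + filler[len(filler) // 2 :]

payload = {
    "model": "mistral-large-latest",   # illustrative model alias
    "messages": [{
        "role": "user",
        "content": haystack + "\n\nWhat is the staging-cluster passphrase?",
    }],
}
resp = requests.post(
    "https://api.mistral.ai/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
    json=payload,
    timeout=120,
)
resp.raise_for_status()
answer = resp.json()["choices"][0]["message"]["content"]
print("recall OK" if "AZURE-PEACH-42" in answer else "recall FAILED", "|", answer[:200])
```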

Release History

Devstral 2 (123B) & Devstral Small 2 (24B) 2025-12-09

Release of Devstral 2, a next-generation coding model family with state-of-the-art agentic coding capabilities. Devstral 2 (123B) and Devstral Small 2 (24B) support a 256K context window and are optimized for code agents.

Mistral 3 (Ministral 3B/8B/14B, Mistral Large 3) 2025-12-02

Release of the Mistral 3 family: Ministral 3 (3B, 8B, and 14B dense models) and Mistral Large 3 (sparse MoE, 41B active / 675B total parameters). All models are open-weight under the Apache 2.0 license, with multimodal and multilingual capabilities. Mistral Large 3 is the most capable model to date, optimized for enterprise and edge deployment.

API v1.1 2025-05

API update: Introduced support for fine-tuning Mistral 7B and Mixtral 8x22B models. Added streaming response option.

Mistral Large v1.1 2025-02

Mistral Large updated with enhanced multilingual capabilities and improved code generation for Python and JavaScript.

Mistral 7B v1.1 2024-08

Mistral 7B updated with improved instruction following and reduced hallucination rates.

API v1.0 2024-05

API update: Added support for function calling and improved rate limits.

Mixtral 8x22B v0.1 2024-04-10

Release of Mixtral 8x22B, a larger and more capable Mixture-of-Experts model with 141 billion total parameters (39 billion active). Significant performance gains across various benchmarks. Retired on 2025-03-30, replaced by Mistral Small 3.2.

Mistral Large v0.1 2024-02

Commercial release of Mistral Large, Mistral AI's flagship model. Superior performance in complex reasoning and coding tasks.

Mixtral 8x7B v0.1 2023-12

Release of Mixtral 8x7B, a Sparse Mixture-of-Experts model with 47 billion parameters. Improved performance over Mistral 7B.

API v0.1 2023-12

Launched API access to Mistral 7B. Initial pricing tiers available.

v0.1 2023-09

Initial release of Mistral 7B, a 7 billion parameter language model. Open-weight, Apache 2.0 license.

Tool Pros and Cons

Pros

  • High performance, small size
  • Open-weight options
  • Strong text & code
  • Fast, efficient inference
  • Good multilingual support

Cons

  • Flagship models under the MRL require a commercial license for production use
  • Potential for bias
  • Hosted features depend on API availability

Pricing (2026) – Mistral AI

Last updated: 23.01.2026

Free

$0 / free
  • Your personal AI assistant for life and work. Get started with our highest-performing models
  • Chat. Search. Learn. Code. Create
  • Access to Mistral's SOTA AI models
  • Save and recall up to 500 memories
  • Group chats into projects
  • Full access to Connectors directory

Pro

$14.99 / month
  • Unlock enhanced productivity with extended AI and agentic capabilities
  • Students $5.99/mo
  • More messages and web searches
  • 30x more extended thinking
  • 5x more Deep Research reports
  • Up to 15GB of document storage
  • Unlimited projects
  • Chat support

Team

$24.99 / user/month
  • Empower your team with a secure, collaborative, AI-powered workspace
  • Up to 200 flash answers /user/day
  • Up to 30GB of storage /user
  • Domain name verification
  • Data export

Enterprise

Custom pricing
  • Audit logs
  • SAML SSO
  • White label

Mistral Large 3

$0.5 / 1M tokens
  • Open-weight, general-purpose, flagship multimodal and multilingual model
  • Text-to-text, Multimodal
  • Output (/M tokens) $1.5

Mistral Medium 3

$0.4 / 1M tokens
  • State-of-the-art performance. Simplified enterprise deployments. Cost-efficient
  • Text-to-text, Multimodal, Agentic
  • Output (/M tokens) $2

Magistral Medium

$2 / 1M tokens
  • Thinking model excelling in domain-specific, transparent, and multilingual reasoning
  • Reasoning, Text-to-text
  • Output (/M tokens) $5

Ministral 3 - 3B

$0.1 / 1M tokens
  • Best-in-class frontier AI to the edge
  • Text-to-text, Agentic, Lightweight
  • Output (/M tokens) $0.1

Ministral 3 - 8B

$0.15 / 1M tokens
  • Best-in-class frontier AI to the edge
  • Text-to-text, Agentic, Lightweight
  • Output (/M tokens) $0.15

Ministral 3 - 14B

$0.2 / 1M tokens
  • Best-in-class frontier AI to the edge
  • Text-to-text, Agentic, Lightweight
  • Output (/M tokens) $0.2

Devstral 2

$0 / free
  • Enhanced model for advanced coding agents
  • Coding, Text-to-text, Agentic
  • Output (/M tokens) free

Codestral API

$0.3 / 1M tokens
  • Lightweight, fast, and proficient in over 80 programming languages
  • Coding, Text-to-text
  • Output (/M tokens) $0.9

Codestral Fine-Tuning

$0.2 / 1M tokens
  • Lightweight, fast, and proficient in over 80 programming languages (a worked cost example follows this entry)
  • Coding, Text-to-text
  • Training Cost (/M tokens) $3
  • Storage Cost $2 / month per model
  • Input (/M tokens) $0.2
  • Output (/M tokens) $0.6
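Fine-tuning entries combine three charges: a one-off training fee per million training tokens, a monthly storage fee per hosted model, and the usual input/output inference rates. The sketch below works through that arithmetic using the Codestral fine-tuning figures listed above; re-verify the rates against the live price list before budgeting.

```python
# Worked fine-tuning cost estimate using the Codestral figures listed above:
# $3 / 1M training tokens, $2 / month storage, $0.2 input / $0.6 output per 1M tokens.
TRAIN_RATE, STORAGE_PER_MONTH, IN_RATE, OUT_RATE = 3.0, 2.0, 0.2, 0.6

def finetune_cost(train_tokens, months_hosted, in_tokens, out_tokens):
    """One-off training fee plus storage and inference for the given usage."""
    training = train_tokens / 1e6 * TRAIN_RATE
    storage = months_hosted * STORAGE_PER_MONTH
    inference = in_tokens / 1e6 * IN_RATE + out_tokens / 1e6 * OUT_RATE
    return training + storage + inference

# Example: 50M training tokens, hosted 3 months, 200M input / 40M output tokens served.
print(f"${finetune_cost(50e6, 3, 200e6, 40e6):.2f}")  # 150 + 6 + (40 + 24) = $220.00
```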

Document AI & OCR

$2 / 1000 pages
  • Introducing the world's best document understanding API
  • OCR, Multimodal, Text-to-text

Voxtral Mini Transcribe

$0.002 / min of audio input
  • State-of-the-art transcription model
  • Voice, Text-to-text

Mistral Small 3.2 API

$0.1 / 1M tokens
  • SOTA. Multimodal. Multilingual. Apache 2.0
  • Multimodal, Lightweight, Text-to-text, Agentic
  • Output (/M tokens) $0.3

Mistral Small 3.2 Fine-Tuning

$0.1 / 1M tokens
  • SOTA. Multimodal. Multilingual. Apache 2.0
  • Multimodal, Lightweight, Text-to-text, Agentic
  • Output (/M tokens) $0.3
  • Training Cost (/M tokens) $4
  • Storage Cost $2 / month per model
  • Input (/M tokens) $0.1
  • Output (/M tokens) $0.3

Mistral Small Creative

$0.1 / 1M tokens
  • A fine-tuned small model for creative writing, roleplay, and chat—trained on curated data
  • Multimodal, Lightweight, Text-to-text
  • Output (/M tokens) $0.3

Magistral Small

$0.5 / 1M tokens
  • Thinking model excelling in domain-specific, transparent, and multilingual reasoning
  • Reasoning, Text-to-text, Lightweight
  • Output (/M tokens) $1.5

Devstral Small 2

$0 / free
  • The best open-source model for coding agents
  • Coding, Agentic, Text-to-text, Lightweight
  • Output (/M tokens) Free

Voxtral Small

$0.004 / min / 1M tokens
  • State-of-the-art performance on speech and audio understanding
  • Lightweight, Voice, Text-to-text
  • Output (/M tokens) $0.3

Voxtral Mini

$0.001 / min / 1M tokens
  • Low-latency speech recognition for edge and devices
  • Lightweight, Voice, Text-to-text
  • Output (/M tokens) $0.04

Classifier API model 8B

$0.1 / 1M tokens
  • Fine-tune Ministral 8B for classification tasks, like moderation, sentiment analysis, fraud detection, and more
  • Classifier APIs
  • Training Cost (/M tokens) $1
  • Storage Cost $2 / month per model
  • Input (/M tokens) $0.1
  • Output (/M tokens) $0.1

Classifier API model 3B

$0.04 / 1M tokens
  • Fine-tune Ministral 3B for classification tasks, like moderation, sentiment analysis, fraud detection, and more
  • Classifier APIs
  • Training Cost (/M tokens) $1
  • Storage Cost $2 / month per model
  • Input (/M tokens) $0.04
  • Output (/M tokens) $0.04

Mistral Moderation 24.11

$0.1 / 1M tokens
  • A classifier service for text content moderation
  • Classifier APIs

Codestral Embed

$0.15 / 1M tokens
  • State-of-the-art embedding model for code
  • Coding, Embedding

Mistral Embed

$0.1 / 1M tokens
  • State-of-the-art model for extracting semantic representations of text (a usage sketch follows this entry)
  • Text-to-text, Embedding
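Both embedding models are served through the embeddings endpoint; the sketch below requests vectors for two snippets and compares them with cosine similarity. The endpoint path, request fields, and the "mistral-embed" identifier reflect the published API at the time of writing and should be confirmed against the current reference.

```python
# Minimal embeddings call plus cosine similarity. The endpoint path, request
# fields, and the "mistral-embed" identifier should be checked against the
# current API reference before use.
import os
import numpy as np
import requests

resp = requests.post(
    "https://api.mistral.ai/v1/embeddings",
    headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
    json={
        "model": "mistral-embed",
        "input": ["def add(a, b): return a + b", "a function that sums two numbers"],
    },
    timeout=60,
)
resp.raise_for_status()
vectors = [np.array(item["embedding"]) for item in resp.json()["data"]]

cosine = vectors[0] @ vectors[1] / (np.linalg.norm(vectors[0]) * np.linalg.norm(vectors[1]))
print(f"cosine similarity: {cosine:.3f}")
```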

Agent API

$0 / token
  • Enhances AI with built-in tools for code execution, web search, image generation, persistent memory, and agentic orchestration
  • Tools
  • Pricing: underlying model cost per 1M tokens plus per-call tool fees (a worked estimate follows this entry)
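Because the Agent API bills the underlying model's tokens plus a flat fee per tool invocation, a per-conversation estimate is just the sum of the two. The sketch below combines illustrative token rates with the per-call tool prices listed further down this page (e.g., $30 per 1,000 web searches is $0.03 per call); verify both against the live price list.

```python
# Rough per-conversation estimate for the Agent API pricing model:
# underlying model tokens (per 1M) plus a flat fee per tool call.
# Rates are illustrative copies of figures elsewhere on this page.
TOOL_FEES = {"web_search": 30 / 1000, "code_execution": 30 / 1000}   # USD per call

def agent_cost(in_tokens, out_tokens, in_rate, out_rate, tool_calls):
    """tool_calls maps tool name -> number of invocations in the conversation."""
    token_cost = in_tokens / 1e6 * in_rate + out_tokens / 1e6 * out_rate
    tool_cost = sum(TOOL_FEES[name] * n for name, n in tool_calls.items())
    return token_cost + tool_cost

# Example: 100k input / 20k output tokens on a $0.5 / $1.5 model, 3 searches, 1 code run.
print(f"${agent_cost(100_000, 20_000, 0.5, 1.5, {'web_search': 3, 'code_execution': 1}):.3f}")
```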

Libraries

$1 / 1M tokens
  • Upload and manage documents, enabling agents to access your external data
  • Tools
  • OCR: $3 / 1000 pages
  • Indexing: $1 / 1M tokens
  • Call: $0.01 / call

Code execution

$30 / 1000 calls
  • Execute and interpret code snippets within the chat interface
  • Tools

Web search

$30 / 1000 calls
  • Enhance your work, research, and learning with web search, complete with citations for accurate and up-to-date information
  • Tools

Images

$100 / 1000 images
  • Generate images based on user prompts and preferences
  • Tools

Premium news

$50 / 1000 calls
  • Access to news articles from integrated, verified news providers for enhanced information retrieval
  • Tools

Data capture

$0.04 / 1M tokens
  • Easily record and access API call data for debugging and continuous optimization
  • Tools

Pixtral Large

$2 / 1M tokens
  • Vision-capable large model with frontier reasoning capabilities
  • Multimodal, Text-to-text
  • Output (/M tokens) $6

Pixtral 12B API

$0.15 / 1M tokens
  • Vision-capable small model
  • Lightweight, Multimodal, Text-to-text
  • Output (/M tokens) $0.15

Pixtral 12B Fine-Tuning

$0.15 / 1M tokens
  • Vision-capable small model
  • Lightweight, Multimodal, Text-to-text
  • Training Cost (/M tokens) $2
  • Storage Cost $2 / month per model
  • Input (/M tokens) $0.15
  • Output (/M tokens) $0.15

Mistral NeMo API

$0.15 / 1M tokens
  • A 12B general-purpose model built in collaboration with NVIDIA, with strong multilingual support
  • Text-to-text, Lightweight
  • Output (/M tokens) $0.15

Mistral NeMo Fine-Tuning

$0.15 / 1M tokens
  • A 12B general-purpose model built in collaboration with NVIDIA, with strong multilingual support
  • Text-to-text, Lightweight
  • Training Cost (/M tokens) $1
  • Storage Cost $2 / month per model
  • Input (/M tokens) $0.15
  • Output (/M tokens) $0.15

Mistral 7B

$0.25 / 1M tokens
  • A 7B transformer model, fast-deployed and easily customisable
  • Text-to-text, Lightweight
  • Output (/M tokens) $0.25

Mixtral 8x7B

$0.7 / 1M tokens
  • A 7B sparse Mixture-of-Experts (SMoE). Uses 12.9B active parameters out of 45B total
  • Text-to-text
  • Output (/M tokens) $0.7

Mixtral 8x22B

$2 / 1M tokens
  • A 22B sparse Mixture-of-Experts (SMoE). Uses only 39B active parameters out of 141B total
  • Text-to-text
  • Output (/M tokens) $6