JetBrains Mellum
Integrations
- IntelliJ IDEA
- PyCharm
- WebStorm
- Rider
- GoLand
Pricing Details
- Access is included in JetBrains AI Pro and Enterprise subscriptions.
- Enterprise plans support local model management and custom policy enforcement.
Features
- Sparse Mixture-of-Experts (MoE) Architecture
- PSI-Graph Native Context Awareness
- Local-Only Zero-Egress Inference
- Next-Edit Suggestion Predictive Logic
- NPU/NVIDIA Tensor Core Optimized Kernels
- Multi-Language Style Consistency
Description
JetBrains Mellum: The PSI-Integrated Local Intelligence
As of January 13, 2026, Mellum has solidified its role as an industry standard for secure, on-device developer assistance. The 2026 architecture uses a Sparse Mixture-of-Experts (MoE) design that retains the reasoning capability of a 12B-parameter model while activating only 2B parameters per task (e.g., Java vs. Kotlin refactoring), avoiding thermal throttling on modern developer laptops. The core technical advantage remains native integration with the Program Structure Interface (PSI), which enables compiler-grade symbol resolution that outperforms standard token-based context.
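To make the PSI claim concrete, here is a minimal sketch of how a plugin can walk the Program Structure Interface to assemble resolved-symbol context around the caret. The PsiTreeUtil and Java-PSI calls are real IntelliJ Platform SDK APIs, but the prompt-assembly logic is purely illustrative, not Mellum's actual pipeline, and the snippet assumes it runs inside an IDE plugin with the Java PSI module available.

```kotlin
import com.intellij.psi.PsiElement
import com.intellij.psi.PsiMethod
import com.intellij.psi.util.PsiTreeUtil

// Illustrative only: collects structural context around the caret the way a
// PSI-aware completion engine could, using resolved symbols instead of
// guessing from nearby tokens.
fun gatherPsiContext(caret: PsiElement): String {
    val method = PsiTreeUtil.getParentOfType(caret, PsiMethod::class.java)
        ?: return caret.containingFile.text.take(2000) // fallback: raw file prefix
    val cls = method.containingClass
    return buildString {
        appendLine("// class: ${cls?.qualifiedName}")
        // Sibling method signatures give the model resolved names, not guesses.
        cls?.methods?.forEach { m ->
            appendLine("// sibling: ${m.name}${m.parameterList.text}")
        }
        appendLine(method.text) // full body of the enclosing method
    }
}
```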
Core Architecture & Real-Time Performance
Mellum is optimized to keep developers in flow, prioritizing UI responsiveness and predictive accuracy.
- Sparse MoE Switching: Dynamically swaps specialized experts based on the file extension and project manifest, reducing VRAM usage by 65% compared with dense 7B models (see the routing sketch after this list).
- Next-Edit Suggestion (NES): A proprietary predictive layer that anticipates the next logical structural change (e.g., adding a try-catch block or implementing an interface) before the user starts typing.
- Hardware-Specific Kernels: Optimized for the Apple M4/M5 Neural Engine and NVIDIA RTX 50-series Tensor Cores, achieving sub-100ms response times for multiline completions.
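The expert-switching behavior can be pictured with a toy router. The expert names, parameter counts, and extension-based selection rule below are assumptions for illustration only; a real MoE router is a learned gating network over token representations, not a lookup table.

```kotlin
// Toy sketch of sparse MoE routing by file type (illustrative; names and
// sizes are hypothetical).
data class Expert(val name: String, val activeParams: Long)

val experts = mapOf(               // hypothetical expert pool
    "kt" to Expert("kotlin-refactor", 2_000_000_000L),
    "java" to Expert("java-refactor", 2_000_000_000L),
    "py" to Expert("python-general", 2_000_000_000L),
)

fun route(fileExtension: String): Expert =
    experts[fileExtension] ?: experts.getValue("java") // default expert

fun main() {
    val active = route("kt")
    // Only the routed expert's weights are resident, which is how a 12B MoE
    // can run with roughly 2B parameters active per task.
    println("Activating ${active.name}: ${active.activeParams / 1e9}B params")
}
```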
Sovereignty & Compliance Layer
The model architecture is a key component of the 'Air-Gapped AI' strategy for highly regulated sectors.
- Zero-Egress Execution: Inference is strictly local; the model needs no active internet connection to provide structural suggestions, and no user code is used for global re-training (a loopback-only call is sketched after this list).
- Encrypted Weight Storage: Model weights (Safetensors) are cryptographically bound to the IDE license, preventing extraction or unauthorized fine-tuning.
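As a concrete check of the zero-egress claim, the sketch below requests a completion from a locally hosted runtime over loopback only. It assumes a Mellum build is served by Ollama on its default port under the tag mellum; adjust the tag to whatever your local install reports.

```kotlin
import java.net.URI
import java.net.http.HttpClient
import java.net.http.HttpRequest
import java.net.http.HttpResponse

// Sketch: completion against a loopback-only runtime. Assumes Ollama is
// serving a local Mellum build; no traffic leaves 127.0.0.1.
fun main() {
    val body = """{"model": "mellum", "prompt": "fun fibonacci(n: Int): Int {", "stream": false}"""
    val request = HttpRequest.newBuilder()
        .uri(URI.create("http://127.0.0.1:11434/api/generate")) // Ollama default port
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString(body))
        .build()
    val response = HttpClient.newHttpClient()
        .send(request, HttpResponse.BodyHandlers.ofString())
    println(response.body()) // JSON with the completion in the "response" field
}
```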
Evaluation Guidance
Technical architects should prioritize the following validation steps:
- Symbol Resolution Depth: Benchmark Mellum's ability to suggest private internal methods in complex monorepos to verify the effectiveness of the PSI-graph grounding.
- NPU/GPU Thermal Profiling: Measure power consumption during one-hour coding sessions on target hardware (e.g., MacBook Pro M4) to assess the impact of the MoE switcher on battery life.
- CAR Metrics: Analyze the Completion Acceptance Rate (CAR) across languages to determine where the model should fall back to high-parameter cloud models (a minimal CAR computation is sketched after this list).
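A minimal way to compute CAR from acceptance logs is sketched below. The event schema is hypothetical, standing in for whatever telemetry your IDE plugin actually records; CAR is simply accepted suggestions divided by suggestions shown.

```kotlin
// Hypothetical telemetry record: one row per suggestion shown.
data class CompletionEvent(val language: String, val accepted: Boolean)

// Completion Acceptance Rate per language: accepted / shown.
fun carByLanguage(events: List<CompletionEvent>): Map<String, Double> =
    events.groupBy { it.language }
        .mapValues { (_, evs) -> evs.count { it.accepted }.toDouble() / evs.size }

fun main() {
    val log = listOf(
        CompletionEvent("Kotlin", true), CompletionEvent("Kotlin", false),
        CompletionEvent("Rust", false), CompletionEvent("Rust", false),
    )
    // Languages with low CAR are candidates for cloud-model fallback.
    carByLanguage(log).forEach { (lang, car) -> println("$lang: CAR=${"%.2f".format(car)}") }
}
```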
Release History
- Ongoing improvements to framework-specific completions, type-inference support, and enterprise-friendly deployment paths (on-prem / air-gapped). JetBrains documents Mellum usage inside AI Assistant and IDE Services.
- Research and production paper published describing production-grade in-IDE context handling, the training pipeline, and evaluation results; industrial deployment notes and telemetry results shared.
- Local-first developer tooling: a VS Code extension and community integrations enable Mellum-based local completions (Mellum-all, via Ollama), with emphasis on privacy (local inference) and IDE parity.
- Documentation and SDK appear (official and community tooling). Mellum becomes available for local deployment via common runtimes (Ollama, llama.cpp) and vendor tooling; JetBrains publishes a model card and benchmarks.
- Mellum-family models (Mellum-4b) open-sourced and published on Hugging Face. Model optimized for code completion: ~4B parameters, long-context support (8192 tokens reported in the model card), permissive Apache-2.0 license.
- Expanded in-IDE experiments: improved multi-file context handling and support for additional languages in preview builds.
- Public introduction and first alpha preview announced by JetBrains. Initial focus: code completion in JetBrains IDEs and integration with AI Assistant.
Tool Pros and Cons
Pros
- Fast & accurate
- Code completion focused
- Seamless IDE integration
- Open-source
- Real-time suggestions
Cons
- Completion only
- IDE setup required
- Limited support