On September 8, 2025, Fujitsu announced a breakthrough technology that significantly "lightens" large language models (LLMs) without a substantial loss of accuracy. The technology, based on the company's proprietary Takane LLM, combines two key methods: 1-bit quantization and specialized AI "distillation." The result is a 94% reduction in memory consumption and a 3x increase in inference speed, while retaining 89% of the original model's accuracy. In practice, this means that an LLM that previously required a cluster of four high-performance GPUs can now run efficiently on a single low-cost GPU. This achievement paves the way for deploying complex "agentic" AI systems on edge devices such as smartphones, industrial controllers, and automotive computers, offering low latency, strong privacy, and a radical reduction in energy consumption.
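Fujitsu has not published implementation details, but the memory figure is consistent with generic 1-bit weight quantization: replacing 16-bit weights with a sign per weight plus a per-tensor scale cuts storage by roughly 94%. The sketch below is a hypothetical illustration of that general idea in Python, not Fujitsu's actual method or code.

```python
import numpy as np

# Hypothetical illustration of 1-bit weight quantization (not Fujitsu's published code):
# each weight is reduced to its sign, with one per-tensor scale preserving the
# average magnitude.

def quantize_1bit(weights: np.ndarray):
    scale = float(np.mean(np.abs(weights)))                  # per-tensor scaling factor
    signs = np.where(weights >= 0, 1, -1).astype(np.int8)    # 1 bit of information per weight
    return signs, scale

def dequantize_1bit(signs: np.ndarray, scale: float) -> np.ndarray:
    # Reconstruct an approximate weight matrix for inference.
    return signs.astype(np.float32) * scale

# A 4096 x 4096 layer in FP16 needs ~32 MB; the same layer stored as bit-packed
# signs plus one scale needs ~2 MB, roughly the 94% reduction cited above.
# (The int8 array here is for readability only; real kernels pack 8 signs per byte.)
w = np.random.randn(4096, 4096).astype(np.float32)
signs, scale = quantize_1bit(w)
w_hat = dequantize_1bit(signs, scale)
print("mean absolute reconstruction error:", float(np.mean(np.abs(w - w_hat))))
```

The reconstruction error this naive scheme introduces is exactly what the second ingredient, distillation, is meant to recover: a smaller or quantized "student" model is trained to reproduce the outputs of the full-precision Takane "teacher," which is how the announcement's 89% accuracy retention figure becomes plausible.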
Fujitsu Unveils AI Reconstruction Technology Enabling LLMs to Run on a Single GPU
