
December 4, 2025

The NPU in your phone keeps improving—why isn’t that making AI better?

Almost every technological innovation of the past several years has been laser-focused on one thing: generative AI. Many of these supposedly revolutionary systems run on big, expensive servers in a data center somewhere, yet chipmakers keep crowing about the power of the neural processing units (NPUs) they have brought to consumer devices. Every few months, it's the same story: the new NPU is 30 or 40 percent faster than the last one. That's supposed to enable something important, but no one ever gets around to explaining what.


TL;DR

  • Chipmakers are pushing NPUs hard in consumer devices, but the most capable generative AI still runs in the cloud.
  • NPUs are specialized processors within Systems-on-a-Chip (SoCs), building on the architecture of Digital Signal Processors (DSPs) for parallel computing.
  • Cloud-based AI models, such as full versions of Gemini and ChatGPT, have significantly larger context windows and parameter counts than on-device models.
  • Running AI models on edge devices forces significant compromises in model size and numeric precision due to hardware limits; the memory-footprint sketch after this list shows why.
  • On-device AI processing offers potential privacy benefits by keeping personal data local, away from cloud data centers.
  • Cloud AI services are vulnerable to internet outages and third-party data breaches, while edge AI offers greater reliability.
  • Despite advancements in NPU technology, the trend leans towards cloud dominance for AI processing due to its resource advantages.
  • A hybrid on-device/cloud approach is seen as necessary, but seamless hand-offs can obscure where data is actually processed (see the routing sketch below).
  • Manufacturers are investing in increased RAM capacity on devices to better support on-device AI capabilities.
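
To make the size-and-precision trade-off concrete, here is a minimal Python sketch (not from the article) of the memory arithmetic behind on-device compromises. The parameter counts and the 8 GB phone budget are illustrative assumptions, not measurements:

```python
# A minimal sketch (not from the article) of the memory arithmetic behind
# on-device model compromises. Parameter counts and the 8 GB phone budget
# are illustrative assumptions.

PRECISION_BYTES = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}

def weight_footprint_gb(params_billions: float, precision: str) -> float:
    """Approximate RAM for the weights alone (ignores KV cache, activations)."""
    # params_billions * 1e9 params * bytes-per-param / 1e9 bytes-per-GB
    return params_billions * PRECISION_BYTES[precision]

PHONE_RAM_GB = 8  # assumed budget, shared with the OS and every other app

for params in (3, 8, 70):  # small on-device models vs. a cloud-class model
    for prec in ("fp16", "int8", "int4"):
        gb = weight_footprint_gb(params, prec)
        verdict = "fits" if gb < PHONE_RAM_GB else "too big"
        print(f"{params:>3}B params @ {prec}: {gb:6.1f} GB -> {verdict}")
```

At 16-bit precision, even an 8-billion-parameter model needs roughly 16 GB for its weights alone, which is why on-device models are both smaller and aggressively quantized, and why manufacturers are adding RAM.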
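
And here is a hypothetical sketch of the hybrid routing the list describes: try the on-device model when a request fits, fall back to the cloud when it doesn't. Every name here (route, LOCAL_CONTEXT_LIMIT, the backend strings) is invented for illustration and does not correspond to any vendor's real API:

```python
# Hypothetical hybrid dispatcher; all names are invented for illustration.

LOCAL_CONTEXT_LIMIT = 4_096  # assumed token window of the on-device model

def route(prompt_tokens: int, sensitive: bool, online: bool) -> str:
    """Pick a backend for a single request under the stated assumptions."""
    if prompt_tokens <= LOCAL_CONTEXT_LIMIT:
        return "npu"    # small enough to stay on the device
    if online and not sensitive:
        return "cloud"  # too large for local hardware; ship to a data center
    return "reject"     # too large, but offline or too private to send out
```

The privacy concern shows up in the middle branch: once a request silently falls through to the cloud, the user no longer knows where their data went.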

Continue reading the original article
