Making powerful chips -- the kinds of chips that power cloud-hosted AI models like GPT-4 -- requires a big capital investment. There’s a reason why so many AI chip startups have to raise boatloads of cash and still often struggle. Getting a state-of-the-art chip manufactured can cost tens if not hundreds of millions of dollars split between design tools, IP, mask sets, and the wafers themselves. This often leaves little room for failure. Stumble once or twice, and you’re toast.
On the other hand, there are many, many devices that don’t require state-of-the-art chips. Most chips that get manufactured are built on older process nodes, like 65nm, 28nm, and 14nm, and for good reason. These mature-node designs are more affordable, and they can also consume less power: larger transistors leak less current when they’re supposed to be off. Mature-node chips are often found in edge devices, like thermostats, heart rate monitors, and industrial monitoring equipment. This is in contrast to the cloud, the catch-all term for powerful datacenters filled with cutting-edge chips.
There are many, many companies trying to use mature process nodes to build power-efficient, affordable AI chips for edge devices, but you may not have heard of many of them. These edge AI chip companies -- Syntiant, Perceive, Axelera -- aren’t nearly as well known as their cloud counterparts -- Cerebras, SambaNova, Groq. Why? Shouldn’t these edge AI chip companies be having a much easier time, with lower costs and simpler chips?
The answer is a bit surprising: even though edge AI chips may be cheaper to build, it turns out that there are unique challenges that make them harder to sell.
Models keep growing and changing
When you sell an edge device, it usually stays in the field for years. Think of a security camera, or a piece of industrial monitoring equipment, or a smart home speaker. But in the world of AI, years move fast. For example, we went from GPT-2 to GPT-4 in 4 years, with the parameter count growing from 1.5 billion to a reported figure of over 1 trillion. Another example: convolutional neural networks, once the state of the art in computer vision, are being replaced by Vision Transformers. If you shipped an edge device in 2019 that was designed to perform computer vision tasks using small convolutional neural networks, your products are getting left in the dust.
This challenge is exacerbated by the fact that most edge AI chips get efficiency through specialization. They don’t support any arbitrary workload; instead, by focusing on a small number of key models, they can be more efficient than a general-purpose processor. But if models keep changing, these specialized edge AI chips will keep going out of date quickly. And even though they’re less costly to develop than cloud AI chips, having to redesign your chips over and over to support new models isn’t cheap!
Syntiant, one of the few commercially successful startups in the edge AI space, offers six chips. Three of them only support fully connected neural networks, and even their more powerful chips still have limited support for different neural network architectures. It’s unclear how evolving neural network architectures will affect their sales, or whether they plan to launch new chips to keep up.
But that’s not the only challenge if you’re trying to take a shot at building edge AI chips. You also have to contend with everybody else who’s aiming for the same market.
Lots of competition
I’m not the first person to notice that edge AI chips are faster, easier, and more affordable to design and build than their cloud counterparts. A lot of semiconductor founders noticed the same thing and started companies building these sorts of chips. And when I say a lot, I mean a lot. The Linley Group’s 2021 report on deep learning processors includes over 60 vendors, with a significant majority focused on edge devices.
As Mike Demler said in ZDNet, “It's hard to come up with something that's truly different. [...] I see these presentations, 'world's first,' or, 'world's best,' and I say, yeah, no, we've seen dozens.” And that’s not even mentioning Nvidia’s Jetson or Google’s Edge TPU! There are a good number of companies out there gunning for this market, which might only have room for one or two victors in each vertical.
But the competition gets even tougher -- oftentimes, edge AI chips are also indirectly competing with cloud AI solutions.
Competing with the cloud
If you’re building a hardware product -- for example, a smart home speaker -- and you want to add AI features to it, you have two routes you could take to make that happen. You could either run the machine learning model on the device itself, using an AI chip, or you could send all of the data to the cloud and run the model there.
There are pros and cons to each method. If you’re running the model on the device itself, its AI features will work even if the device is offline. If the data you’re processing is private, keeping it on the device helps keep users’ data secure. And if you care about your device responding as quickly as possible, processing data on the device eliminates the delay of sending that data over the internet.
But processing that data in the cloud lets you use significantly more powerful models, which translates to more accurate responses in our hypothetical smart home speaker. It’s also easier to update your models if they’re hosted in the cloud rather than on separate individual devices scattered across the world.
Usually, device makers go with a hybrid approach, where more lightweight machine learning features (like voice recognition) happen on the device, and more complex features (like responding to a user query) happen on the cloud. Unfortunately for companies trying to make better edge AI chips, the complex features are the main reasons these products are valuable!
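To make that split concrete, here’s a minimal sketch of the hybrid pattern in Python. It assumes a hypothetical smart speaker; the function names and endpoint are made up for illustration, not drawn from any particular vendor’s SDK.

```python
# A minimal sketch of the hybrid edge/cloud pattern for a hypothetical smart
# speaker. Every name here (run_wakeword_model, CLOUD_ENDPOINT, handle_audio)
# is illustrative -- not any real vendor's API.
import json
import urllib.request
from typing import Optional

CLOUD_ENDPOINT = "https://example.com/assistant/query"  # placeholder URL


def run_wakeword_model(audio_frame: bytes) -> bool:
    """Lightweight, always-on inference (e.g. a small keyword-spotting
    network) that would run on the edge AI chip itself."""
    # A real product would call the chip vendor's runtime here.
    raise NotImplementedError


def send_query_to_cloud(audio_clip: bytes) -> str:
    """Heavyweight inference (speech-to-text plus a large model) handled by
    cloud servers, where the model can be updated without touching devices."""
    request = urllib.request.Request(
        CLOUD_ENDPOINT,
        data=json.dumps({"audio": audio_clip.hex()}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["reply"]


def handle_audio(audio_frame: bytes, recorded_query: bytes) -> Optional[str]:
    # The cheap, private, low-latency check stays on the device...
    if run_wakeword_model(audio_frame):
        # ...while the valuable, compute-hungry feature lives in the cloud.
        return send_query_to_cloud(recorded_query)
    return None
```

Notice where the value sits in this sketch: the on-device model is essentially a gatekeeper, while the feature users actually pay for comes back from the cloud.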
Edge AI chips definitely have a home in products where uploading data to the cloud raises significant privacy concerns, like medical devices. They fit well into devices that need ultra-low latency, like hearing aids. And they make sense for devices that might have poor network connections. But for most devices, the most valuable AI features are best delivered by models hosted in the cloud.
No killer apps
Let’s keep discussing smart home speakers. For most people, that’s the edge AI device they’re most likely to have come across. But sales of smart home devices are stagnating, and the whole category is increasingly written off as a failure. Sure, there are other use cases for edge AI in industrial monitoring, medical systems, and security systems. But none of those markets are particularly huge, nor do they desperately need an AI solution.
On the other hand, let’s look at cloud-hosted AI platforms. Deep-learning-based recommender models have been used for years by companies like Meta and Google, helping them rake in hundreds of billions of dollars by targeting ads with AI. Those models were enough of a killer app on their own to spawn an entire crop of cloud AI chip companies between 2015 and 2020. Now, generative AI is adding a second killer app for cloud AI chips, with ChatGPT becoming the fastest-growing consumer application in history.
The Verdict
At first glance, edge AI chips seem like a better business than cloud AI chips: edge chips, manufactured on mature nodes, are cheaper to make, and they have lower performance requirements. But in practice, edge chips struggle to support up-to-date models, and they have to compete against both a crowded field of edge AI chip startups and cloud-based solutions. And there’s one thing cloud AI chips have that makes them so much more attractive to both entrepreneurs and large companies: a massive, clear, and growing market.