Doubao, the Assistant That Runs Your Phone for You

An AI agent that opens your apps, compares prices and buys for you: the icon grid is receding. What you gain, and what you quietly hand over.

In January 2026, a quiet phone sold out in hours. The nubia M153, built by ZTE, was priced at 3,499 yuan, about 470 euros, and released in a run of thirty thousand units reserved for developers. Its distinguishing feature was neither its screen nor its camera, but the software running it: Doubao, the conversational agent from ByteDance, TikTok's parent company, built in at the operating-system level. On this device, you no longer tap on icons. You speak, and the agent opens the apps for you.

This Chinese prototype is not an isolated case. On June 16, Qualcomm chief executive Cristiano Amon said that AI agents would eventually replace apps as the phone's primary interface, and revealed that his company was working on more than forty new devices designed around the idea. He called 2026 "the year of agents." Behind the phrase lies a precise promise: that the grid of icons, the checkerboard we scroll through dozens of times a day, will give way to a single agent that understands what you want and does it.

From the icon grid to intention

The principle breaks with fifteen years of habit. Since the iPhone, using a phone has meant choosing an app, opening it, navigating its menus, then starting over somewhere else. The agent reverses the order: you state an intention, and it handles the chain of actions. A Doubao demonstration showed the assistant telling a story from a picture, erasing pedestrians from a photo, comparing the price of the same product across several shopping apps, then buying it once the user agreed.

ByteDance describes its product as "an operating-system-level collaboration" between its model and the manufacturer. The nuance matters: the agent does not live inside one app among others; it sits beneath them all, able to drive every one. Booking a trip, placing an order, downloading a batch of files, tracking a parcel across different platforms: all of it happens by voice, without opening the home screen even once.

The big players are moving the same way. Google has unveiled Gemini Spark, an assistant that draws on your emails and messages to carry out long tasks. OpenAI is preparing a device for the second half of 2026. Qualcomm already supplies the chips for glasses, rings and camera-equipped earbuds: form factors with no screen at all, and therefore no apps to touch. On those, the agent becomes the only possible interface.

The execution work handed back

What this shift displaces is the work of execution. Comparing ten prices, filling in a booking form, copying an address from one app to another: these gestures take little effort but many minutes, repeated every day. An agent that strings them together unsupervised hands the user back the most mechanical part of their digital life.

The gain is not only time but attention. The modern phone is built to hold you: every app wants its opening, its notification, its scroll. An agent acting quietly short-circuits that machinery. You make a request, you get a result, you close it. For anyone who would rather not live inside their screen, it is a form of withdrawal: the machine works, you stay outside.

For some users, the promise goes beyond comfort. An elderly person struggling with menus, someone with poor eyesight, a user uneasy with technology: a spoken intention is a far simpler door than the icon grid. Where the app demands that you learn how it works, the agent asks only that you say what you want.

What you are really delegating

The price of that delegation still has to be measured. To act for you, the agent must see everything: your messages, your emails, your accounts, your buying habits. Gemini Spark says as much, drawing "on your Google account." Where an app walled off its data, the agent gathers it all in order to function. Comfort is paid for in visibility: a single entity now knows the whole of what you do on the device.

Then there is the question of choice. When the agent compares ten prices and keeps one, on what criterion does it decide? When it buys "once you agree," how far does the agreement reach? Whoever controls the agent controls the first decision, the one that steers all the rest. An icon grid leaves the user free to choose a path; an agent offers one already drawn, and you rarely discover the ones it did not take.

Finally, dependence changes in kind. When an app crashes, you open another. An agent that becomes the only door to the phone concentrates the risk: its outage, its error, its misreading of an instruction carry consequences you do not always see coming, because you no longer follow each step. Buying the wrong ticket, writing to the wrong contact, paying twice: delegated autonomy assumes a trust that nothing, for now, guarantees.

The real battle

What is opening up, then, is not the race for a better assistant, but the battle over the interface itself. Whoever owns the agent owns the layer everything else passes through, and will reduce apps to invisible suppliers. ByteDance, Google, OpenAI and Qualcomm are not fighting over one more market: they are fighting over the place that the home screen has held since 2007.

For the user, the real question is not whether the agent can drive the phone; the demonstrations show it already can. It is deciding what you agree to stop doing yourself, and to whom you hand that power. The phone without apps promises to give back time and simplicity; in return it asks you to let a machine, and the company behind it, decide for you. The market will settle this quickly. The thing to weigh, beforehand, is what you are putting on the scale.