The AI Arms Race: What Actually Matters for Your Business

TL;DR

In the last week alone: OpenAI released a model that beats humans at office work, Anthropic accidentally leaked an always-on AI agent, Apple confirmed Siri is getting a brain transplant from Google, and OpenAI raised $122 billion. The headlines are exhausting. Most of it doesn't matter to your business yet. Three things do: AI can now reliably do multi-step work, AI assistants are becoming how customers find you, and the cost of waiting is going up. Everything else is noise.

This Week Was Absurd

I keep a running list of AI announcements that are relevant to my clients. Most weeks it's one or two items. This week I stopped counting at nine.

OpenAI released GPT-5.4, which scored above human baseline on a benchmark simulating actual desktop productivity tasks. Anthropic's source code leaked through an npm packaging error, revealing an unreleased always-on AI agent called KAIROS that runs continuously in the background. Apple confirmed that iOS 26.4 will ship Siri rebuilt on Google's Gemini AI — affecting 2.2 billion devices. OpenAI raised $122 billion. Anthropic is approaching $19 billion in annual revenue. Google expanded its AI-powered search to over 200 countries. Alibaba launched an enterprise AI agent platform specifically for small businesses.

If you tried to act on every one of these announcements, you'd never get any actual work done. So I'm going to do what I do for clients: separate what matters now from what matters later from what doesn't matter at all.

What Matters Now

AI can do real multi-step work, not just answer questions. This is the shift that's been building for a year and just became undeniable. GPT-5.4 scoring 75% on OSWorld-V — above the human baseline of 72.4% — isn't just a benchmark number. It means the model can open applications, navigate between systems, extract data, make decisions, and complete tasks. Autonomously.

I've been building systems like this manually for clients all year. A property management client has a workflow agent that processes tenant applications across four different systems. A jewellery business has one that syncs inventory changes to three sales channels in real time. These used to be custom projects that took weeks. The models are getting good enough that the setup time is shrinking fast.

If you've been thinking “AI can't handle anything more complex than writing an email,” that stopped being true about six months ago. The question now is which specific multi-step process in your business is worth automating first.

AI assistants are becoming a customer acquisition channel. The Apple-Gemini deal is the one that keeps me up at night — in a good way, professionally speaking. When 2.2 billion devices get an AI assistant that can see your screen, understand context, and complete transactions, the way customers discover businesses fundamentally changes.

“Find me a plumber available tomorrow” stops being a Google search and becomes a Siri request. And Siri doesn't show ten blue links. It shows one recommendation. Maybe two. If your business isn't structured for AI assistants to find, understand, and interact with you, you're not in the running.

This isn't theoretical. I've already had the conversation with a salon client who wasn't showing up in Siri results while a competitor was. The fix was straightforward (structured data on their website, a complete Apple Maps listing, online booking connected), but the businesses that do this now get a head start that compounds.
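For a sense of what "structured data" can look like in practice, here is a minimal sketch that generates a schema.org JSON-LD snippet for a salon. I'm assuming JSON-LD is the format in question, and the business name, address, hours and URLs are all made-up placeholders. Which fields a given assistant actually reads is not something this sketch can promise.

```python
import json


def build_salon_jsonld(name, url, telephone, street, city, postal_code,
                       country, opening_hours, booking_url):
    """Return a <script> tag holding schema.org JSON-LD for a salon.

    opening_hours is a list of (days, opens, closes) tuples, for example
    (["Monday", "Tuesday"], "09:00", "17:30").
    """
    data = {
        "@context": "https://schema.org",
        "@type": "HairSalon",  # a schema.org LocalBusiness subtype
        "name": name,
        "url": url,
        "telephone": telephone,
        "address": {
            "@type": "PostalAddress",
            "streetAddress": street,
            "addressLocality": city,
            "postalCode": postal_code,
            "addressCountry": country,
        },
        "openingHoursSpecification": [
            {
                "@type": "OpeningHoursSpecification",
                "dayOfWeek": days,
                "opens": opens,
                "closes": closes,
            }
            for days, opens, closes in opening_hours
        ],
        # Points machines at the online booking page.
        "potentialAction": {
            "@type": "ReserveAction",
            "target": {"@type": "EntryPoint", "urlTemplate": booking_url},
        },
    }
    body = json.dumps(data, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Placeholder values for illustration only.
snippet = build_salon_jsonld(
    name="Example Salon",
    url="https://example.com",
    telephone="+44 20 0000 0000",
    street="1 Example Street",
    city="London",
    postal_code="N1 0AA",
    country="GB",
    opening_hours=[(["Tuesday", "Wednesday", "Thursday"], "09:00", "17:30")],
    booking_url="https://example.com/book",
)
print(snippet)
```

The printed tag would be pasted into the page's HTML. The design point is that hours, address and booking link live in one machine-readable place instead of being scattered through page copy.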

The parallel to remember: In 2010, businesses that understood SEO early got years of free traffic while competitors paid for ads. We're at the same inflection point with AI-powered discovery. The businesses that structure their information for AI assistants now will be the default recommendations when everyone else catches up. First-mover advantage here is real and durable.

What Matters Later

Always-on AI agents. The KAIROS leak from Anthropic is fascinating — an AI agent that runs 24/7, watches your projects, takes autonomous action, and consolidates its own memory overnight. I've essentially built a version of this for my own consultancy: an operations system that monitors client deployments around the clock and alerts me when something needs attention. So I know this pattern works.

But for most SMEs? Not yet. The tooling isn't ready for non-technical users. The trust model isn't established: do you really want an AI sending emails on your behalf while you sleep, before you've tested it extensively? And the use cases that genuinely need 24/7 autonomous operation are narrower than the hype suggests.

Keep an eye on KAIROS. When Anthropic launches it publicly (likely by summer), it'll be worth evaluating. But don't feel behind because you haven't got a persistent AI agent running yet. You're not.

Multi-agent systems. The idea of multiple specialist AI agents coordinating (one handling finance, one handling customer queries, one managing operations) is coming. I'm experimenting with this for my own work. But for a 10- or 20-person business, a single well-configured agent or workflow automation will deliver 90% of the value at 10% of the complexity. Multi-agent architectures are a 2027 conversation for most SMEs.

What Doesn't Matter (Yet)

The funding numbers. OpenAI raised $122 billion. Anthropic is approaching $19 billion in annual revenue. These are staggering numbers and they're completely irrelevant to your business decisions. The fact that these companies are well-funded means the tools will keep improving and the prices will keep dropping. Great. But it doesn't change what you should do this quarter.

Model comparisons. GPT-5.4 vs Claude Opus 4.6 vs Gemini 3.1 Pro — I test all of them, and the honest answer is they're converging. For most business applications, the difference between them matters less than how well you set up the system around them. A well-designed workflow on a “worse” model will outperform a poorly designed one on the “best” model every time. Pick one, build with it, switch later if needed. Don't agonise.
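One way to make "switch later if needed" cheap is to keep the model behind a small interface, so the workflow code never mentions a vendor. The sketch below is an illustration under my own assumptions: TextModel, the two stub classes and summarise_enquiry are hypothetical names, and the stubs are offline stand-ins, not real vendor clients.

```python
from typing import Protocol


class TextModel(Protocol):
    """Anything with a complete(prompt) -> str method can be plugged in."""

    def complete(self, prompt: str) -> str: ...


class FirstSentenceStub:
    """Offline stand-in for one vendor: returns the first sentence of the input."""

    def complete(self, prompt: str) -> str:
        body = prompt.split("\n\n", 1)[-1]  # text after the instruction line
        return body.split(".")[0].strip() + "."


class FirstWordsStub:
    """Offline stand-in for another vendor: returns the first eight words."""

    def complete(self, prompt: str) -> str:
        body = prompt.split("\n\n", 1)[-1]
        return " ".join(body.split()[:8])


def summarise_enquiry(model: TextModel, enquiry: str, max_chars: int = 80) -> str:
    """Workflow step: build the prompt, call the model, cap the output length."""
    prompt = "Summarise this customer enquiry in one sentence.\n\n" + enquiry
    summary = model.complete(prompt).strip()
    return summary[:max_chars]


enquiry = "My boiler stopped working last night. Can someone come tomorrow morning?"
for model in (FirstSentenceStub(), FirstWordsStub()):
    # Same workflow code, different model behind it.
    print(type(model).__name__, "->", summarise_enquiry(model, enquiry))
```

Swapping vendors then means writing one new class with a complete method. The prompt building, the length cap and everything downstream stay untouched, which is where most of the design effort goes anyway.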

The hype around specific model features. “1 million token context windows!” “6 trillion parameters!” These are marketing numbers. What matters is: can the tool do the specific job you need done? I've solved real client problems with relatively simple prompts and basic API calls. The most powerful model in the world is useless if you haven't identified the right process to automate.

The Cost of Waiting

A year ago, I'd tell cautious clients that there was no rush. “The tools are still maturing. Wait six months and you'll have better options.” I don't say that anymore.

The tools are mature enough. The cost-benefit is clear enough. And critically, the competitive dynamics are shifting. If your competitor automates their tenant application processing and you don't, they're faster. If your competitor's business is structured for AI discovery and yours isn't, they get the referral. These advantages compound.

I'm not saying panic. I'm saying the window where “we'll get to AI eventually” was a reasonable position is closing. Three things you should be doing right now:

1. Pick one repetitive multi-step process and automate it. Invoice processing, data entry, application handling, booking management: whichever one eats the most hours. Use a workflow automation tool like n8n or Make, connect it to an AI for the parts that need understanding (reading documents, classifying emails, extracting data), and get it running. This is a week-long project, not a six-month initiative.
2. Make your business findable by AI. Add structured data to your website. Complete your Apple Maps and Google Business profiles. Make sure your services, hours, prices, and booking options are machine-readable, not just human-readable. This is a one-day job for most small businesses.
3. Try the tools yourself. Use Claude or ChatGPT for actual work tasks for a week. Not toy examples. Real tasks: drafting proposals, analysing data, processing documents. You'll learn more about what AI can and can't do in five days of real use than in a year of reading articles about it. Including this one.
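To make the first item concrete, here is a rough sketch of the shape an invoice workflow tends to take: an extraction step (where the AI would sit), a validation check, and a routing decision. Everything in it is an assumption for illustration. The regex extractor is a stand-in for a model call, and the field names, queue names and the 1000 review threshold are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Invoice:
    supplier: Optional[str]
    total: Optional[float]


def regex_extractor(text: str) -> Invoice:
    """Stand-in for the AI step: pull supplier and total out of raw text."""
    supplier = re.search(r"From:\s*(.+)", text)
    total = re.search(r"Total:\s*([\d,]+\.\d{2})", text)
    return Invoice(
        supplier=supplier.group(1).strip() if supplier else None,
        total=float(total.group(1).replace(",", "")) if total else None,
    )


def route_invoice(text: str,
                  extract: Callable[[str], Invoice],
                  review_above: float = 1000.0) -> str:
    """Decide where an invoice goes after extraction."""
    invoice = extract(text)
    # Never trust a partial extraction: a person looks at it instead.
    if invoice.supplier is None or invoice.total is None:
        return "human_review"
    # Large amounts also go to a person (threshold is an assumed policy).
    if invoice.total > review_above:
        return "human_review"
    return "auto_approve"


samples = [
    "From: Acme Ltd\nTotal: 240.00",
    "From: Acme Ltd\nTotal: 2,400.00",
    "Hello, please see attached.",
]
for sample in samples:
    print(repr(sample), "->", route_invoice(sample, regex_extractor))
```

In a tool like n8n or Make, each function would roughly correspond to one node in the flow. The part worth copying is the fallback: anything the extraction step can't fully read goes to a person rather than being guessed at.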

My Honest Take

I've been doing AI consulting for small businesses for a while now, and this is the first time the gap between “what's possible” and “what SMEs are actually doing” has felt genuinely uncomfortable. The tools are there. The costs are reasonable. The use cases are proven. But most small businesses haven't started.

The ones that have? They're already seeing the results. One client's team got back the equivalent of a full working day per week. Another's customer enquiry response time dropped from hours to minutes. A third stopped losing sales to stock discrepancies across their platforms.

None of these required cutting-edge AI or six-figure budgets. They required someone to look at the business, identify the right process, and connect the right tools. That's it.

The arms race between OpenAI, Anthropic, Google, and the rest will continue. Models will keep getting better. Prices will keep dropping. New capabilities will keep emerging. But your business doesn't need to track every development. It needs to pick the three things that matter right now and execute on them. The rest is spectacle.