AI in the Enterprise: What Most Companies Get Wrong


ai, enterprise, leadership, strategy, engineering-management

Names and details have been changed. The situations are real. The people are not the point.

The hype problem

Most enterprise AI initiatives fail because companies start with the technology instead of the problem. They build chatbots before fixing their data, announce strategies before understanding their bottlenecks, frame AI as a job replacer instead of a task augmenter. After leading AI adoption inside a platform engineering team, here’s what actually works and what doesn’t.

I was in a meeting last autumn. Twelve people around a table, a slide deck on screen with “AI-Powered Transformation” in bold across the top. The presenter, someone senior from strategy, was walking through a roadmap that had the company deploying a customer-facing AI assistant within the quarter. Bullet points about “leveraging LLMs” and “intelligent automation.” The room nodded along. I looked at the engineering manager next to me. She had her laptop open to our data quality dashboard. Thirty percent of our customer records had missing or inconsistent fields. The data pipeline that would feed this assistant had been flagged for three months as unreliable. Nobody in the room mentioned either of those things. The gap between the slide and the reality was so wide you could have parked a bus in it. I sat there thinking: this is where initiatives go to die. Not in the code. In the conference room, two months before anyone writes a line of code.

Every board deck in 2026 has an AI slide. Every product roadmap has an AI feature. Every job description mentions AI experience.

Most of it is performance. To be fair, some of it isn’t. I’ve seen a handful of teams skip the theatre entirely and ship something useful within weeks. They exist. They’re just quiet about it.

I lead a platform engineering team. I see how AI adoption actually plays out inside a company, not on stage at a conference. The gap between what leadership announces and what teams deliver is probably the most interesting engineering problem I’ve seen in years. Not because of the technology. Because of the humans. Pipelines don’t have opinions. People do.

The three lies companies tell themselves

“We need an AI strategy”

No. You need a strategy. AI is a tool that might support it.

Starting with “how do we use AI?” is like starting with “how do we use databases?” The question is backwards. Moving cities taught me that every system looks obvious from the inside. AI strategy is the same. Start with the actual problems. What decisions do we make repeatedly that would benefit from better data? What processes burn time without adding value? Where do our people spend hours on work a machine could do in seconds?

The companies that get AI right don’t have an AI strategy. They have a clear understanding of their bottlenecks, and AI happens to solve some of them.

“Let’s start with a chatbot”

The default first move. Slap a chatbot on the website, on internal docs, on customer support. The enterprise equivalent of “hello world.”

Chatbots are visible. Leadership can demo them. And in fairness, at scale (airlines, banks, telecom) some chatbot implementations genuinely work and save millions of hours of human time. But they’re also the hardest AI application to get right. Natural language is ambiguous, context windows are limited, hallucinations are a liability. You’re starting with the most complex use case because it looks impressive, not because it delivers value.

Start boring. Classification, extraction, summarisation of structured data. The tasks nobody sees. We ran one of those last year. A model that classified incoming support tickets by urgency and routed them to the right team. No chatbot, no interface anyone would demo at an all-hands. Just a classifier sitting in the pipeline. It cut average first-response time by 40% across a team of 15. Nobody put that on a keynote slide. But the support team noticed within the first week. One of the agents sent a message in Slack: “I don’t know what changed but my queue actually makes sense now.” That felt better than any demo. That’s the kind of win that compounds.

“AI will replace jobs”

AI doesn’t replace jobs. It replaces tasks. The distinction matters more than most leadership teams want to admit.

A customer support agent who spends 60% of their time writing templated responses and 40% handling complex escalations doesn’t get replaced by AI. The 60% gets automated. The agent now handles more complex work, faster, with context generated by AI.

The companies that frame AI as a replacement tool create fear. Fear kills adoption. Every time. The companies that frame it as an augmentation tool create curiosity. Curiosity drives the experimentation that actually works.

What does an engineering team actually need for AI?

I’ve watched AI initiatives fail for the same reasons, repeatedly.

Bad data foundations. You can’t build AI on data you don’t trust. If your team spends a week cleaning data for every AI experiment, you don’t have an AI problem. You have a data engineering problem. Fix that first.
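As a rough illustration of what "data you trust" can mean in practice, here is a minimal sketch that measures how many records are missing a required field. The field names and sample records are hypothetical; the real schema is not given here.

```python
from typing import Iterable

# Hypothetical required fields, chosen only for illustration.
REQUIRED_FIELDS = ("email", "country", "plan")


def incomplete_share(records: Iterable[dict], required=REQUIRED_FIELDS) -> float:
    """Return the fraction of records with at least one missing or blank required field."""
    total = 0
    incomplete = 0
    for record in records:
        total += 1
        # A field counts as missing if it is absent, None, or only whitespace.
        if any(not str(record.get(name) or "").strip() for name in required):
            incomplete += 1
    return incomplete / total if total else 0.0


sample = [
    {"email": "a@example.com", "country": "DE", "plan": "pro"},
    {"email": "", "country": "FR", "plan": "free"},   # blank email
    {"email": "c@example.com", "plan": "pro"},        # country missing
    {"email": "d@example.com", "country": "US", "plan": "team"},
]
print(f"{incomplete_share(sample):.0%} of records are incomplete")
```

A number like this, tracked over time, is cheaper to produce than any model and says whether the experiment is worth starting.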

No feedback loop. A model that ships without a feedback mechanism is a guess that never improves. You need to measure what the model does, collect signals on whether it was useful, retrain or adjust. The pipeline matters more than the model. Always.
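One possible shape for that feedback mechanism, sketched under assumptions: every prediction is logged next to the label a human finally settled on, so agreement can be measured and corrections collected. The class name, ticket ids and labels are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FeedbackLog:
    """Stores each prediction next to the label a human finally settled on."""
    entries: list = field(default_factory=list)

    def record(self, item_id: str, predicted: str, final: Optional[str] = None) -> None:
        # final stays None until a human confirms or corrects the prediction.
        self.entries.append({"id": item_id, "predicted": predicted, "final": final})

    def agreement_rate(self) -> Optional[float]:
        """Share of reviewed predictions that the human left unchanged."""
        reviewed = [e for e in self.entries if e["final"] is not None]
        if not reviewed:
            return None
        kept = sum(1 for e in reviewed if e["predicted"] == e["final"])
        return kept / len(reviewed)

    def corrections(self) -> list:
        """Reviewed items where the human overrode the model: candidates for retraining."""
        return [e for e in self.entries if e["final"] not in (None, e["predicted"])]


log = FeedbackLog()
log.record("T-1", "urgent", "urgent")   # confirmed
log.record("T-2", "normal", "urgent")   # corrected by a human
log.record("T-3", "normal")             # not reviewed yet
log.record("T-4", "low", "low")         # confirmed
print(log.agreement_rate(), [e["id"] for e in log.corrections()])
```

The model itself never appears in this sketch, which is the point: the loop can exist before the model is any good.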

Over-engineering the first iteration. The first version should be embarrassingly simple. A rules-based system that handles 70% of cases is more valuable tomorrow than a fine-tuned model that handles 95% in six months. Ship the simple version. Learn. Iterate.
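To make "embarrassingly simple" concrete, here is a sketch of a rules-based first version for ticket triage. The keywords and queue names are invented for illustration; real rules would come from reading actual tickets.

```python
# Hypothetical keyword rules, not taken from any real support system.
URGENT_KEYWORDS = ("outage", "data loss", "cannot log in")
BILLING_KEYWORDS = ("invoice", "refund", "charge")


def triage(ticket_text: str) -> tuple:
    """Return (queue, urgency) from keyword rules, or ("unrouted", "unknown") if none match."""
    text = ticket_text.lower()
    if any(keyword in text for keyword in URGENT_KEYWORDS):
        return ("platform", "urgent")
    if any(keyword in text for keyword in BILLING_KEYWORDS):
        return ("billing", "normal")
    # No rule matched: leave it for a human, and count these to see what the rules miss.
    return ("unrouted", "unknown")


print(triage("Production OUTAGE since 9am"))
print(triage("Please refund my last invoice"))
print(triage("How do I change my avatar?"))
```

The share of tickets that come back as "unrouted" is a useful first measurement: it shows how far the rules get before a model is worth the effort.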

Ignoring the human interface. The best AI system in the world is useless if people don’t trust it. Show the confidence score, explain the reasoning, make it easy to override. Trust is built one correct prediction at a time.
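A minimal sketch of those three ideas together (confidence shown, reasoning shown, override always possible). The `Suggestion` type and its fields are assumptions made for this example, not an existing API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Suggestion:
    label: str                      # what the model proposes
    confidence: float               # model score in [0, 1], shown to the user
    reason: str                     # short human-readable explanation
    override: Optional[str] = None  # set when a person disagrees

    @property
    def final_label(self) -> str:
        # A human override always wins over the model's proposal.
        return self.override if self.override is not None else self.label

    def display(self) -> str:
        line = f"Suggested: {self.label} ({self.confidence:.0%} confident). Why: {self.reason}"
        if self.override is not None:
            line += f" Overridden to: {self.override}"
        return line


s = Suggestion("urgent", 0.82, "mentions an outage on a paying account")
print(s.display())
s.override = "normal"   # the agent disagrees; one assignment is all it takes
print(s.final_label)
```

Keeping the model's original label next to the override also preserves the disagreement, which is exactly the signal a feedback loop needs.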

How do you decide where AI fits?

Before asking “should we use AI for this?”, run it through three filters.

Volume. Is this task performed hundreds or thousands of times? AI at low volume is overhead, not optimisation.

Consistency. Is the expected output predictable? If a human would give the same answer 90% of the time, a model can learn that pattern. If the answer depends heavily on context that changes every time, AI will struggle.

Tolerance for error. What happens when the model is wrong? If a wrong classification means a support ticket goes to the wrong queue, you’ll live. If a wrong classification means a patient gets the wrong treatment, you won’t. Know your error budget before you write any code.

AI Readiness Check

Four questions. Honest answers only.

1. How structured is your company’s data?
2. How well-defined are your repetitive processes?
3. How does your team react to new tools?
4. What is your primary goal with AI?

| Filter | Question to ask | Good fit | Poor fit |
| --- | --- | --- | --- |
| Volume | How often is this task performed? | Hundreds or thousands of times | One-off or rare tasks |
| Consistency | Is the expected output predictable? | Same answer 90%+ of the time | Highly context-dependent |
| Tolerance for error | What happens when the model is wrong? | Low-cost misclassification | Patient safety, legal risk |
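The three filters can also be written down as a small check. This is a sketch under assumptions: the 90% consistency bar comes from the text above, while the threshold of 100 runs, the per-month unit and the example tasks are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    runs_per_month: int       # volume (the per-month unit is an assumption)
    same_answer_rate: float   # consistency: share of cases where humans would agree
    error_is_costly: bool     # tolerance for error: True for safety or legal risk


def failed_filters(task: Task, min_runs: int = 100, min_consistency: float = 0.9) -> list:
    """Return the names of the filters a task fails; an empty list suggests a good fit."""
    failed = []
    if task.runs_per_month < min_runs:
        failed.append("volume")
    if task.same_answer_rate < min_consistency:
        failed.append("consistency")
    if task.error_is_costly:
        failed.append("tolerance for error")
    return failed


routing = Task("ticket routing", 4000, 0.93, False)
treatment = Task("treatment selection", 4000, 0.93, True)
board_memo = Task("board memo drafting", 4, 0.4, False)
for task in (routing, treatment, board_memo):
    print(task.name, failed_filters(task))
```

Returning the failed filters rather than a single yes or no keeps the conversation on why a task is a poor fit.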

The leadership angle

The hardest part of AI adoption isn’t technical. It’s cultural.

As an engineering leader, my job isn’t to build AI systems. It’s to create the conditions where AI adoption can succeed.

Protecting experimentation time. If the team is at 100% capacity on feature work, there’s no room to explore AI. You need slack. I protect 20% of sprint capacity for technical exploration, and AI experiments live there.

Translating between business and engineering. This is where most initiatives die. I had a conversation last quarter that stuck with me. The business said “we want AI-powered insights on customer behaviour.” The engineer who heard that started scoping a recommendation engine, embeddings, a whole ML pipeline. The actual need, when I sat down with the product manager and asked her to show me the decision she wanted to make differently, was “sort this list of accounts by likelihood of churn so I can call the risky ones first.” That’s not a recommendation engine. That’s a ranked query with a simple model behind it. Shipped in two weeks. The “AI-powered insights” version would have taken three months and delivered the same outcome. Translation is the job.
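For a sense of how small "a ranked query with a simple model behind it" can be, here is a sketch. It is not the system described above: the signals, weights and account names are hypothetical, and in practice the weights would be fitted to past churn data rather than set by hand.

```python
import math

# Hypothetical hand-set weights, for illustration only.
WEIGHTS = {"days_since_login": 0.05, "open_tickets": 0.4, "seats_dropped": 0.8}
BIAS = -3.0


def churn_likelihood(account: dict) -> float:
    """Logistic score in (0, 1) from a weighted sum of the account's signals."""
    z = BIAS + sum(weight * account.get(name, 0) for name, weight in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


def rank_by_churn(accounts: list) -> list:
    """Return accounts sorted from most to least likely to churn."""
    return sorted(accounts, key=churn_likelihood, reverse=True)


accounts = [
    {"name": "Acme", "days_since_login": 2, "open_tickets": 0, "seats_dropped": 0},
    {"name": "Globex", "days_since_login": 45, "open_tickets": 3, "seats_dropped": 2},
    {"name": "Initech", "days_since_login": 20, "open_tickets": 1, "seats_dropped": 0},
]
for account in rank_by_churn(accounts):
    print(f"{account['name']}: {churn_likelihood(account):.2f}")
```

The output is a sorted list, which is all the product manager asked for: who to call first.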

Setting honest expectations. A model that’s 80% accurate isn’t a failure. It’s a starting point. I’m still figuring out the best way to communicate this to non-technical stakeholders. But if leadership expects 99% accuracy on day one, your team is set up to disappoint. Frame the conversation correctly before the first line of code gets written.

Making it safe to fail. The first three AI experiments might not work. That’s normal. If failure means blame, people stop experimenting. If failure means learning, people push further. This isn’t unique to AI. The same dynamic plays out in agile adoption. But AI amplifies it because the uncertainty is higher.

What I’m personally betting on

I don’t do predictions. Predictions are cheap. But I can tell you where I’m putting my own time.

I’m using Claude Code for both generation and review. It writes boilerplate, components, tests. But the real value is the review layer on top. You generate fast, then you review hard. The code that ships has to meet the same standard whether a human or a machine wrote it. That discipline is what separates useful AI adoption from expensive autocomplete.

I’m also watching what OpenAI, Anthropic and Google DeepMind are doing with structured data extraction. Every company I’ve worked at has contracts, invoices, support tickets and meeting notes rotting in unstructured formats. The teams that can turn those into queryable data will move faster than everyone else.

And internal knowledge systems. Not chatbots. Real systems that surface the right document at the right moment. Tools like Notion and Glean are getting close. The institutional memory that lives in the heads of people who might leave next quarter… that’s the real risk nobody talks about in board meetings.

I might be wrong about all three. But I’d rather bet on boring infrastructure than shiny demos.

Where this leaves us

Fix your data. Pick one boring problem. Ship a simple solution. Measure the result. Iterate.

That’s it. That’s the whole strategy. Everything else is theatre.