
Published: 4 March 2026 · Deployment · 9 min read

How to Scope an AI Project So It Ships in 30 Days

Scope creep, unclear success criteria and the wrong problem statement are the three reasons AI projects miss their dates. Here is how Forward Deployed AI Engineers prevent all three — and why the difference is almost entirely about where they sit, not what they know.


The scoping meeting started at nine. By eleven, the whiteboard was covered. Someone had added an integration with a legacy system that nobody had mentioned before. Someone else had suggested that, while they were at it, they should probably also automate the report that goes out every Friday. The original objective — a narrowly defined tool to reduce time spent on a specific manual triage task — had expanded into a platform. The timeline, which had been thirty days, was being quietly revised toward six months.

Nobody in that room was acting in bad faith. Each addition seemed reasonable in isolation. But collectively, they had just killed the project. Not that morning — it would take another three months for that to become apparent. The project would drift, miss milestones, run into integration problems nobody had planned for, and eventually join the very large pile of AI initiatives that promised a great deal and delivered nothing.

Gartner predicted in July 2024 that at least 30 per cent of generative AI projects would be abandoned after proof of concept by the end of 2025, with unclear business value identified as a primary cause alongside poor data quality and escalating costs. McKinsey’s 2025 State of AI survey found that while 88 per cent of organisations now use AI in at least one business function, nearly two-thirds remain stuck in experiment or pilot mode — and only around six per cent qualify as what McKinsey terms “high performers”, generating more than five per cent EBIT impact from AI.

The gap between those two groups is not a technology gap. It is a scoping gap. But here is what the research does not say plainly enough: it is also a proximity gap. The organisations generating real AI value are not just better at writing project briefs. They have people embedded closely enough in the actual work to know, from direct observation, what the problem actually is — before a single line of code is written.

That is what a Forward Deployed AI Engineer does. And it is why the FDAE model produces different scoping outcomes from conventional delivery approaches, regardless of how disciplined those approaches are on paper.

“It is more important for your first few AI projects to succeed than to be the most valuable AI projects.” — Andrew Ng, AI Transformation Playbook, Landing AI (2018)

What a Forward Deployed AI Engineer actually is

Not a consultant. Not a vendor. Something else entirely.

The term comes from the Palantir model, refined and extended by Anthropic and a small number of serious enterprise AI firms over the past several years. A Forward Deployed AI Engineer — an FDAE — is not a remote developer who takes a specification and builds to it. They are not a consultant who runs workshops and produces a slide deck. They embed inside the organisation. They sit with the people who will use the system. They observe the actual workflow, not the documented version of it. They are present for the conversations that happen after the formal meetings end.

The distinction matters because the three failure modes that kill AI projects at the scoping stage — the wrong problem statement, scope creep, and undefined success criteria — are all fundamentally information problems. They occur because the people building the system do not have accurate enough information about what the system needs to do, what constraints it operates within, and what success looks like in practice. An FDAE, by being physically present inside the organisation from the start, has access to information that no amount of documentation or stakeholder interviews can reliably surface.

This is not a soft claim about culture or communication style. It is a claim about the quality of the inputs to the scoping process. Better inputs produce better problem statements. Better problem statements produce more constrained scope. More constrained scope produces systems that ship on time and get used. The chain is direct. The mechanism is proximity.

Failure mode one — and why FDAEs solve it differently

The wrong problem statement is an observation failure, not a thinking failure.

The most expensive mistake in AI project scoping is starting with the wrong problem. It is surprisingly easy to do. An organisation identifies that something is slow, expensive, or error-prone. Someone proposes AI as the solution. A project is initiated. What nobody pauses to check is whether the thing being automated is actually the bottleneck — or whether AI, even if perfectly implemented, would make any material difference to the outcome that matters.

BCG’s research on AI value creation draws a sharp distinction between companies that start from “use cases” and those that start from “value pools.” The use-case approach asks: what can we use AI to do? The value-pool approach asks: where do we actually lose time, money, or quality — and is AI the right intervention for that specific loss? These questions sound similar. They lead to very different projects.

A conventional AI delivery team approaches this problem through discovery workshops, stakeholder interviews, and requirements documentation. These are useful inputs. But they share a structural weakness: they capture what people say the problem is, not what the problem demonstrably is. People are not always good witnesses to their own workflows. They describe the visible pain points, not necessarily the ones with the largest downstream impact. They articulate the problem in terms of the solutions they have already imagined, which constrains the solution space before the problem has been properly diagnosed.

An FDAE embedded in the organisation for two weeks before any scoping document is written sees something different. They see where the coordinator pauses and opens a second window. They see which step in the process produces the most rework. They notice the workaround that everyone uses but nobody mentioned in the workshop because it has become invisible through familiarity. They hear the conversation between colleagues that reveals the real bottleneck — the one that is three steps removed from the task that was going to be automated.

Andrew Ng’s AI Transformation Playbook makes the point that AI projects should begin with a clearly defined and measurable objective confirmed against technical feasibility before kickoff. What the Playbook does not resolve — because it is a strategic framework rather than an operational model — is how you arrive at a clearly defined objective when the people most familiar with the problem are not in a position to articulate it accurately. The FDAE model is an answer to that question. You get the right problem statement by watching the problem, not by asking about it.

Failure mode two — and why FDAEs solve it differently

Scope creep is a boundary problem. FDAEs hold the boundary because they own the outcome.

Scope creep is not a new problem. The Standish Group’s CHAOS research has tracked it in software projects since the 1990s, and their findings consistently show that overrun is the norm rather than the exception in enterprise technology delivery. But scope creep in AI projects has a particular quality that makes it more destructive than in conventional development.

In a standard software project, adding a feature adds roughly proportional complexity. In an AI project, expanding scope often changes the nature of the data required. And data problems do not scale linearly. Gartner’s February 2025 analysis found that 63 per cent of organisations either lack the data management practices required for AI or are unsure whether they have them — and predicted that through 2026, organisations will abandon 60 per cent of AI projects that lack AI-ready data. This is the mechanism through which scope creep specifically kills AI projects. Each addition to scope introduces new data requirements. Those requirements reveal new gaps. Those gaps require new infrastructure work. By the time anyone notices, the thirty-day delivery estimate has become a twelve-month programme and the business case has dissolved.

The conventional response to scope creep is a stronger project manager, a more rigorous change control process, and a more detailed scoping document. These are not useless. But they address the symptom rather than the cause. Scope additions persist because the people proposing them do not have a clear enough picture of what they cost — and because the people managing the project are not close enough to the underlying reality to assess those costs accurately in the moment.

An FDAE holds the boundary differently because they are accountable to a live system, not to a document. They are not managing a specification — they are building toward something that will run in production in thirty days and that they will still be responsible for after it ships. That accountability changes how scope additions are evaluated. When the Friday report gets suggested in week three, an FDAE embedded in the workflow does not reach for the change control form. They ask a more direct question: does this make the original objective easier to achieve, or does it compete with it? And because they have watched the workflow, they can answer that question accurately rather than on the basis of what sounds reasonable in a meeting room.

This is not about being obstructive. It is about being genuinely informed. The most useful thing an FDAE can say to a scope addition is not “that is out of scope” but “here is exactly what that would cost and what it would push out, and here is whether I think it is worth it.” That kind of response is only possible from someone who has been close enough to the problem to have real opinions about it.

“Among high performers, practices such as embedding AI into business processes and tracking KPIs for AI solutions are strongly correlated with achieving significant value.” — McKinsey & Company, The State of AI, 2025

Failure mode three — and why FDAEs solve it differently

Success criteria defined after the build are not success criteria. FDAEs instrument from day one.

McKinsey’s 2025 State of AI report is explicit about what separates high-performing AI organisations from the rest: they define and track KPIs for AI solutions, and they embed those solutions into business processes rather than running them alongside existing workflows. McKinsey found that high performers are nearly three times as likely to fundamentally redesign their workflows around AI — and that outcome measurement is one of the strongest individual correlates of achieving significant AI value. Yet only 39 per cent of organisations in the survey report any EBIT impact attributable to AI. The implication is not that AI is failing to deliver value in most organisations. It is that most organisations have not set up the measurement infrastructure to know whether it is.

The reason measurement gets deferred is structural. In a conventional project model, the team responsible for defining success criteria is not the same team that will ultimately measure them. Success criteria get written into a project initiation document by someone who may never interact with the delivered system. The people who will use the system are not involved in defining what success looks like. The result is criteria that are technically present but practically useless — too abstract to instrument, too disconnected from operational reality to be meaningful to the people whose work the system is supposed to improve.

An FDAE embedded in the organisation defines success criteria in the opposite direction. They start from what they have observed: the twelve-minute triage task, the coordinator who cross-references three systems before making a decision, the error rate on the weekly reconciliation. These are observable, measurable realities. The success criteria that emerge from this starting point are specific enough to instrument, grounded enough in the actual workflow to be meaningful to end users, and direct enough to support a genuine before-and-after comparison.

More importantly, an FDAE instruments those metrics from the start rather than at the end. The baseline measurement — what does the current process actually cost, in time and error rate, before the AI exists — gets taken in the first week of the engagement, not as a retrospective after the system has shipped. This is not a minor operational detail. It is the difference between a delivered system that can demonstrate its value and one that cannot. And it is the difference between a project that gets funded for a second phase and one that winds down by default because nobody can tell a clear story about what it achieved.
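To make that concrete, the sketch below shows one way week-one baseline instrumentation can look in Python: coordinators' task timings logged to a simple file and summarised into the numbers the delivered system will later be judged against. It is illustrative only; the field names, the CSV format and the rework flag are hypothetical, not a prescribed tooling choice.

```python
# A hypothetical sketch of week-one baseline instrumentation. The field
# names, CSV format and rework flag are illustrative assumptions.
import csv
import statistics
from dataclasses import dataclass
from datetime import datetime

@dataclass
class TriageEvent:
    referral_id: str
    started_at: datetime
    finished_at: datetime
    required_rework: bool  # e.g. sent back for re-categorisation

    @property
    def minutes(self) -> float:
        return (self.finished_at - self.started_at).total_seconds() / 60

def load_events(path: str) -> list[TriageEvent]:
    """Read manually logged triage timings from a simple CSV."""
    with open(path, newline="") as f:
        return [
            TriageEvent(
                referral_id=row["referral_id"],
                started_at=datetime.fromisoformat(row["started_at"]),
                finished_at=datetime.fromisoformat(row["finished_at"]),
                required_rework=row["required_rework"] == "yes",
            )
            for row in csv.DictReader(f)
        ]

def baseline_summary(events: list[TriageEvent]) -> dict:
    """The before-AI numbers the delivered system will be judged against."""
    durations = [e.minutes for e in events]
    return {
        "tasks_observed": len(events),
        "median_minutes": statistics.median(durations),
        "p90_minutes": statistics.quantiles(durations, n=10)[8],
        "rework_rate": sum(e.required_rework for e in events) / len(events),
    }
```

Nothing about this is sophisticated, and that is the point: the baseline does not require AI infrastructure, only the discipline to record the current process while observing it.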

The discomfort that typically surrounds baseline measurement — because it sometimes reveals that estimated savings were based on assumptions rather than observation — is exactly the discomfort that proximity resolves. An FDAE who has watched the workflow does not need to estimate. They have seen it.

The same mechanism, summarised across each area of the scoping process:

  • Problem statement: observed, not elicited. FDAEs watch the workflow before they scope the solution. The problem statement comes from direct observation, not from what stakeholders say in a workshop — which is reliably different.
  • Scope control: accountable to a live system. FDAEs hold scope because they own the outcome, not a document. They evaluate additions against what they have built and what they know about the data, not against a specification written three weeks ago.
  • Success criteria: instrumented from day one. The baseline gets measured in week one. Success criteria are specific enough to track, grounded in what was actually observed, and instrumented before the build begins — not written into a document nobody reads after kickoff.
  • Governance: built in, not bolted on. Because FDAEs are present during the compliance and governance conversations — not handed requirements from them — the system is designed to pass review rather than rebuilt to do so.

What the scoping session produces

Five outputs, not fifty. Each one earned, not assumed.

A forward deployed scoping process produces five things before any engineering begins. They are not unique to the FDAE model — any rigorous scoping process would produce them. What is different is how they are arrived at, and therefore how reliable they are.

The first is a problem statement written in operational terms. Not “improve the efficiency of the referral process” but “reduce the time a coordinator spends categorising each incoming referral from an average of twelve minutes to under three.” That specificity is what makes scope additions visible as either aligned with the objective or irrelevant to it. An FDAE can write this statement because they have timed the task, watched it vary across different coordinators, and understood which step in the process consumes the most time. A team working from a brief cannot do this with the same confidence.
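One way to see why that specificity matters: a problem statement written at this level can be captured as a small, testable artifact rather than a paragraph in a brief. The sketch below is hypothetical Python, reusing the twelve-minute and three-minute figures from the example above; the structure and names are assumptions, not a standard.

```python
# Illustrative only: an operational problem statement expressed as a testable
# artifact. The 12-to-3-minute figures come from the example in the text;
# the Objective type and its fields are hypothetical.
from dataclasses import dataclass

@dataclass(frozen=True)
class Objective:
    description: str
    metric: str
    baseline: float  # measured in week one, not estimated
    target: float

    def met(self, observed: float) -> bool:
        return observed <= self.target

referral_triage = Objective(
    description="Reduce coordinator time to categorise an incoming referral",
    metric="median minutes per referral",
    baseline=12.0,
    target=3.0,
)

# A proposed scope addition either moves this number or it does not.
# "Also automate the Friday report" does not, so it is a separate objective.
```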

The second is a confirmed data inventory. Not theoretical — not “we have records in the CRM” — but confirmed by someone who has looked at the actual data, understood its structure, and assessed its quality against what the AI component requires. Gartner’s February 2025 finding that 60 per cent of AI projects are at risk from data-readiness failures is a finding about projects where this confirmation never happened. An FDAE embedded in the organisation can get to the actual data in a way that a remote team working from documentation cannot.
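What "confirmed" looks like in practice is unglamorous: someone has loaded the actual data and checked it against what the build needs. A first-pass sketch, assuming a tabular export and the referral-triage example above; every field name and threshold here is hypothetical, and the real check would be shaped by what the AI component actually requires.

```python
# Illustrative only: a first-pass readiness check against assumed field names.
# The real check is shaped by what the AI component actually requires.
import pandas as pd

REQUIRED_FIELDS = ["referral_id", "received_at", "free_text", "category"]

def readiness_blockers(path: str, min_rows: int = 1000) -> list[str]:
    """Return a list of blockers; empty means no obvious blockers found."""
    df = pd.read_csv(path)
    blockers = []
    missing = [c for c in REQUIRED_FIELDS if c not in df.columns]
    if missing:
        blockers.append(f"missing fields: {missing}")
    if len(df) < min_rows:
        blockers.append(f"only {len(df)} rows; wanted at least {min_rows}")
    for col in REQUIRED_FIELDS:
        if col in df.columns and df[col].isna().mean() > 0.05:
            blockers.append(f"{col} is {df[col].isna().mean():.0%} null")
    # 'category' is what a supervised component would learn from; if it is
    # sparse or inconsistently applied, the build plan changes before day one.
    return blockers
```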

The third is a success metric with a baseline. Taken in week one, not estimated afterward. The McKinsey finding that tracking KPIs is one of the strongest correlates of AI high performance is a finding about organisations that measure outcomes rather than presuming them. The FDAE model makes this natural rather than uncomfortable, because measurement is part of the observation process rather than a separate administrative step.

The fourth is a human oversight model. Who reviews the AI’s outputs before they become decisions? What happens when the system produces something unexpected? An FDAE embedded in the team understands the governance requirements not from a compliance brief but from the conversations they have been part of since week one. In regulated environments this is a clinical or legal requirement. In all environments it is a trust requirement — and a system that end users do not trust will not be used, regardless of its technical quality.
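The oversight model itself can be simple. One common shape, sketched below under assumed names and an assumed confidence threshold, routes low-confidence outputs to a human reviewer rather than letting them become decisions; nothing here is specific to any particular framework.

```python
# A sketch of one common oversight shape: low-confidence outputs go to a
# human reviewer instead of becoming decisions. The threshold, the names
# and the confidence score itself are assumptions, not a standard.
from dataclasses import dataclass

@dataclass
class Suggestion:
    referral_id: str
    category: str
    confidence: float  # the model's own score, calibrated during the pilot

REVIEW_THRESHOLD = 0.85  # agreed with the team, then revisited against outcomes

def route(suggestion: Suggestion) -> str:
    """Decide whether a suggestion is applied or queued for human review."""
    if suggestion.confidence >= REVIEW_THRESHOLD:
        return "auto-apply"          # still logged and fully auditable
    return "human-review-queue"      # a coordinator confirms or corrects it
```

The design question is not the code; it is where the threshold sits, who staffs the queue, and how the threshold moves as trust in the system is earned.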

The fifth is an explicit out-of-scope list. The things you are not building, agreed and signed off by stakeholders. This is the most reliably omitted output of a conventional scoping session. It is also the most valuable. Writing it requires having an opinion about what is and is not necessary to achieve the specific objective — and that opinion can only be held with confidence by someone who has spent enough time inside the problem to know what matters and what is adjacent noise.

The 30-day constraint

The deadline is a diagnostic. FDAEs use it to test the scoping, not the engineering.

The 30-day target is a forcing function, not a delivery guarantee. Its purpose is to surface scoping problems before they become build problems. If a project cannot be shaped to produce something live and useful within thirty days, that is information — and it is far cheaper to receive that information before the build starts than three months into it.

When a forward deployed team applies the 30-day constraint to a proposed scope, one of three things happens. The scope is tight, the data is ready and the success criteria are clear, in which case thirty days is achievable and the constraint has confirmed the scoping is sound. Or the scope is too broad, in which case the constraint makes that visible and the team divides the problem into a sequence of smaller objectives, each of which can be delivered independently. Or the data is not ready, in which case the constraint exposes that before it surfaces as a build failure, and the first thirty days are invested in addressing the data rather than building an AI that will fail when it encounters it.

Andrew Ng’s AI Transformation Playbook argues that for most organisations, early AI projects should prioritise success and momentum over maximum ambition. The value of a successful early deployment is not just the output of that specific system. It is the organisational confidence, the refined understanding of how AI delivery works in this particular environment, and the evidence base that unlocks investment in the genuinely transformative initiatives that follow. A modest system that ships in thirty days and is used every day by the people it was built for is worth more than an ambitious platform that never reaches production.

An FDAE is the mechanism by which that momentum becomes achievable. Not because they are faster engineers than the alternative — they are not necessarily — but because the quality of the information they work from produces fewer false starts, fewer late-stage scope changes, and fewer systems built for a problem that turned out not to be the right one.

The condition that makes the rest possible

Proximity is the prerequisite. Everything else follows from it.

There is one condition without which even a well-scoped project will stall: someone has to own the outcome, and they have to be close enough to it to know when it is at risk. McKinsey’s 2025 State of AI report finds that high-performing organisations tend to have senior leaders who demonstrate strong ownership and commitment to AI initiatives. That finding is important. But it describes the organisational condition, not the delivery condition. The delivery condition is that the people building the system are accountable to the people who will use it — daily, directly, without a layer of project management in between.

This is what the FDAE model provides that conventional delivery approaches do not. The engineer who spent two weeks watching coordinators work before writing a scoping document is accountable to those coordinators in a way that a remote team is not. When the system ships and something does not work the way it should, they are the person in the room. When a coordinator finds a workaround that reveals a gap in the design, the FDAE hears about it immediately rather than in the next sprint review. When the data quality issue that everyone had assumed would be fine turns out not to be fine, they encounter it in the first week rather than the fourth.

This accountability is what produces scope discipline in practice. The scoping document is not what holds a project together. The relationship between the person building the system and the people who will use it is what holds a project together. That relationship is only possible through proximity — and proximity is only possible if the engineers are embedded inside the organisation from the start, not parachuted in at milestones.

The organisations generating measurable AI value in 2026 — the six per cent McKinsey identifies as high performers — did not start by attempting enterprise-wide transformation. They solved a specific problem, measured the result, and used that success as the foundation for the next initiative. The FDAE model is the delivery mechanism that makes that sequence of specific, successful, measurable projects possible. It is not a methodology. It is a posture. And the posture is this: be close enough to the problem that you cannot misunderstand it, and accountable enough to the outcome that you cannot pretend you have solved something you have not.

Sources referenced in this article
  • Gartner (July 2024). Gartner Predicts 30% of Generative AI Projects Will Be Abandoned After Proof of Concept By End of 2025. Gartner Newsroom. Analyst: Rita Sallam, Distinguished VP Analyst.
  • Gartner (February 2025). Lack of AI-Ready Data Puts AI Projects at Risk. Gartner Newsroom. Analyst: Roxane Edjlali, Senior Director Analyst.
  • McKinsey & Company (2025). The State of AI in 2025: Agents, Innovation, and Transformation. QuantumBlack, AI by McKinsey. Authors: Alex Singla, Alexander Sukharevsky, Bryce Hall, Lareina Yee, Michael Chui.
  • BCG Henderson Institute. AI at Scale: Moving from Use Cases to Value Pools. Boston Consulting Group. Research on enterprise AI value creation patterns.
  • Ng, Andrew (2018). AI Transformation Playbook: How to Lead Your Company into the AI Era. Landing AI. Available free at landing.ai.
  • Standish Group (various editions). CHAOS Report. Ongoing research into software project delivery outcomes across the enterprise technology industry.

Work with us

Ready to scope something that actually ships?

Our Forward Deployed AI Engineers embed inside your team, observe the actual problem, and produce a scoped 30-day delivery plan before any build begins — with a confirmed data inventory, a measured baseline, and success criteria that mean something. If you are in the NHS, local government, a university or an enterprise, let’s talk.
