The Junior Consultant Was Always the Real Product. Agentic AI Just Made That Business Model Impossible to Ignore.

March 30, 2026 · 5 min read

Key Takeaways

McKinsey's Lilli platform can perform roughly 80% of junior analyst research and slide-generation work, while graduate consulting job postings dropped 44% year-on-year by 2024 — the displacement is structural, not cyclical.
The leverage model's profitability (target gross margin above 50%) depended on billing junior labor at partner rates; AI now produces that same output at near-zero marginal cost, but client-facing pricing has barely moved.
PwC's $1.5 billion Agent OS investment and Accenture's AI Refinery are permanent infrastructure replacements for junior-layer functions, with PwC explicitly positioning agents 'directly in front of clients' to cut out consultant intermediaries.
Engagements are completing 30-40% faster with AI, but fees are largely unchanged — the efficiency is trapped inside firm P&Ls until clients gain enough AI sophistication to demand pricing concessions.
AI-native boutiques like Monevate and Unity Advisory (which raised $300M in private capital) are proving the post-leverage model works; incumbent firms face internal partnership incentives that resist the restructuring they need most.

The consulting pyramid was never really about insight. It was about arbitrage. Partners sell credibility and relationships. Junior consultants deliver the actual work: data pulls, market sizing, financial modeling, slide assembly. Firms bill all of it at blended rates that reflect the partner's brand, not the analyst's output. McKinsey's Lilli platform, now used by 72% of the firm's 45,000 employees, can perform roughly 80% of that junior analyst work. The pyramid's economic foundation is not merely under threat. It is already cracking.

The Leverage Model's Open Secret: Junior Consultants Were Always the Margin Engine

The economic structure of consulting is straightforward. Firms target gross margins above 50% and net margins above 20%, achieved through high leverage ratios: a typical engagement pairs one partner with six to eight junior staff, with the partner billing at a premium rate while analysts generate output at a fraction of that cost. The difference flows to the partnership as profit. Junior consultants are, in accounting terms, the cost of goods sold dressed up as intellectual capital.
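To make the arithmetic concrete, here is a minimal sketch of the leverage math. Every number in it is a hypothetical placeholder (hourly bill rates, loaded costs, and a one-to-six staffing ratio picked from the range above), not data from any firm, and the names Role, gross_margin, and junior_profit_share are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class Role:
    name: str
    headcount: int
    bill_rate: float    # hourly rate charged to the client (hypothetical)
    loaded_cost: float  # hourly fully loaded cost to the firm (hypothetical)
    junior: bool = False


def hourly_revenue(team):
    """Total billed per team-hour across all roles."""
    return sum(r.headcount * r.bill_rate for r in team)


def hourly_profit(team):
    """Billed rate minus loaded cost, summed across all roles."""
    return sum(r.headcount * (r.bill_rate - r.loaded_cost) for r in team)


def gross_margin(team):
    """Gross profit as a fraction of revenue."""
    return hourly_profit(team) / hourly_revenue(team)


def junior_profit_share(team):
    """Fraction of gross profit generated by the junior layer."""
    junior_profit = hourly_profit([r for r in team if r.junior])
    return junior_profit / hourly_profit(team)


# Assumed staffing: one partner, six analysts. All rates are made up.
team = [
    Role("partner", 1, 1000.0, 400.0),
    Role("analyst", 6, 300.0, 100.0, junior=True),
]

print(f"gross margin: {gross_margin(team):.1%}")
print(f"junior share of gross profit: {junior_profit_share(team):.1%}")
```

Under these assumed figures the team clears a gross margin of about 64%, and roughly two thirds of the gross profit comes from the junior layer. Change the rates and the split moves, but the structure of the calculation stays the same.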

This was never a secret to anyone inside the industry. The traditional pyramid organizes staff into finders (partners securing relationships), minders (managers coordinating delivery), and grinders (analysts performing technical tasks). Grinders exist to scale partner time without scaling partner headcount, and their billing rates sit well above their loaded cost. The efficiency of that arbitrage determines partner compensation as much as any client outcome does. What changed in 2025 and 2026 is that the cost-of-goods now comes preinstalled in AI infrastructure, not in analyst cohorts.

What McKinsey's Lilli Platform Actually Displaced

Lilli processes approximately 500,000 queries per month and saves users 30% of research and synthesis time. The roughly 80% coverage of junior analyst tasks cited above amounts to a near-total substitution of the grinder function at the information-gathering layer. McKinsey is no longer incurring the salary, benefits, training, and supervision costs associated with producing that output at anything like its previous scale.

The hiring signal arrived before any official announcement. Graduate job postings in consulting dropped 44% year-on-year by 2024. KPMG UK cut its graduate intake by 29%, Deloitte UK by 18%, EY by 11%. Starting salaries at MBB have been frozen for a third consecutive year. McKinsey is considering cuts of up to 10% across certain functions, potentially eliminating several thousand roles over the next 18 to 24 months. Two senior Big Four executives estimated that UK graduate recruitment would fall by roughly half in the coming year.

This is not a demand-side hiring dip. Consulting revenue has held up. The firms have simply discovered that delivering the same scope requires fewer bodies when those bodies were performing tasks that a well-tuned AI system handles faster, cheaper, and without a return flight to bill.

Accenture's AI Refinery and PwC's Agent OS Are Infrastructure Decisions, Not Experiments

When a major firm builds proprietary AI infrastructure rather than subscribing to a third-party tool, it is making a permanent organizational bet. Accenture's AI Refinery platform deploys pre-configured industry agents encoded with business workflows, allowing firms to rapidly build and scale multi-agent networks across client engagements. Its December 2025 collaboration with OpenAI extends ChatGPT Enterprise access to tens of thousands of Accenture professionals. These are infrastructure replacements for junior-layer functions, not tools bolted onto an unchanged delivery model.

PwC's investment is more explicit still. The firm has committed nearly $1.5 billion to next-generation AI capabilities, including its Agent OS platform, which now orchestrates over 250 deployed agents across internal workflows. PwC built over 120 additional agents in collaboration with Google Cloud, covering end-to-end business transformation processes. Agent OS claims up to 95% reduction in manual effort when deployed across full workflows rather than individual tasks.

The strategic intent is where the real signal sits. PwC is, according to The New Stack, "putting its AI agents directly in front of clients and cutting out the traditional back-and-forth with consultants as intermediaries." That is not a productivity upgrade. That is a structural removal of the consultant from a portion of the value chain, with the firm billing directly for agent output. The leverage model does not disappear in that scenario — it gets reassigned to software rather than headcount.

The Billing Question No Partnership Wants to Answer

Engagements that once took ten weeks now complete in six to seven. The reduction in delivery time, and with it delivery cost, runs 30-40% according to ConsultingQuest's analysis of AI's impact on consulting economics. Client-facing pricing has not moved to reflect this. The efficiency is, in ConsultingQuest's framing, "largely trapped inside the firms' P&Ls."
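A rough sketch of what "trapped" means in margin terms follows. It assumes delivery cost scales linearly with engagement duration, which is a simplification, and the fee and cost figures are hypothetical round numbers chosen to start from a 50% gross margin. The function names are illustrative.

```python
def time_reduction(old_weeks, new_weeks):
    """Fractional reduction in engagement duration."""
    return (old_weeks - new_weeks) / old_weeks


def margin_after_speedup(fee, old_cost, old_weeks, new_weeks):
    """Gross margin when the fee is unchanged and delivery cost
    shrinks in proportion to duration (a simplifying assumption)."""
    new_cost = old_cost * new_weeks / old_weeks
    return (fee - new_cost) / fee


# Ten weeks down to six or seven, as described above.
slow_case = time_reduction(10, 7)
fast_case = time_reduction(10, 6)

# Hypothetical engagement: 1,000,000 fee, 500,000 delivery cost (50% margin).
fee = 1_000_000
old_cost = 500_000
old_margin = (fee - old_cost) / fee
new_margin = margin_after_speedup(fee, old_cost, 10, 6.5)

print(f"time reduction: {slow_case:.0%} to {fast_case:.0%}")
print(f"gross margin: {old_margin:.1%} -> {new_margin:.1%}")
```

With those assumed inputs, a ten-week engagement that finishes in six and a half weeks at an unchanged fee lifts gross margin from 50% to 67.5%. None of that gain reaches the client unless the fee moves.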

Partners characterize this as earned efficiency: payback for years of procurement-driven fee compression. That framing holds right up until procurement teams start running their own AI tools to estimate how long a given scope should take. Gartner projects that 40% of enterprise applications will feature task-specific AI agents by the end of 2026, up from less than 5% in 2025. Sophisticated buyers will increasingly understand what AI-delivered work costs and what it should be priced at. When that happens, the opacity protecting current margins disappears, and the margins compress rapidly.

The deeper problem is that consulting's core currency is credibility, and pricing that cannot survive scrutiny eventually becomes a credibility liability. As ConsultingQuest's analysis argues directly: "If consultants want a share of the upside, clients deserve a share of the efficiency."

McKinsey has moved roughly 25% of its global fees to outcome-based contracts. Outcome-based pricing removes the utilization-rate mechanism entirely. When a firm cannot bill by the hour for junior staff because there are no junior staff, it must find another unit of value to monetize — and the discipline of defining that unit is an exercise the partnership structure systematically resists.

What the Post-Leverage Consulting Firm Actually Looks Like

The firms closest to a workable answer are AI-native boutiques that never built the pyramid in the first place. HBR's 2025 structural analysis profiles Monevate, which delivers pricing strategy without an analyst layer, and SIB, which deploys AI agents for cost reduction work and brings in human experts only at decision points. Unity Advisory, which raised $300 million in private capital, operates as conflict-free and AI-native by design, eliminating the classic leveraged structure entirely.

These firms cannot yet match the brand power or relationship depth of MBB. Their unit economics, however, are fundamentally superior at every level below partner. They do not carry the overhead of cohort training, the supervision cost of junior staff, or the cultural resistance of a partner compensation structure built on headcount management.

The incumbents face a structural inertia problem that is genuinely difficult to solve from the inside. Partner compensation is tied to utilization rates and team size. Promotion structures reward managing large junior cohorts. Every incentive in the traditional partnership model actively resists the optimization that AI makes possible. Firms that move will face partner revolts before they face client complaints, which is why the restructuring will be slower than the technology warrants and faster than partnerships prefer.

The firms that survive the next decade will be those that restructure their economics before clients, armed with their own agentic AI capabilities and a clearer view of delivery costs, do the restructuring for them.

Frequently Asked Questions

Will agentic AI fully eliminate the junior consultant role?

Full elimination is unlikely in the near term, but severe reduction is already underway. McKinsey's Lilli performs roughly 80% of junior analyst research and slide-generation tasks, and graduate consulting job postings dropped 44% year-on-year by 2024. The roles that survive will shift toward AI facilitation and output validation rather than primary analysis.

How are consulting firms currently pricing AI-delivered work?

Most major firms have not adjusted client-facing pricing to reflect AI-driven efficiency gains. ConsultingQuest's analysis finds engagements completing 30-40% faster with fees unchanged, meaning the productivity gain is retained as margin rather than passed to clients. Approximately 25% of McKinsey's global fees are now tied to outcome-based contracts, which signals an early shift away from pure time-and-materials billing.

What is PwC's Agent OS and why does it matter for the consulting model?

Agent OS is PwC's proprietary platform for orchestrating networks of AI agents across enterprise workflows, backed by a nearly $1.5 billion investment and over 250 deployed agents internally. Its strategic significance lies in PwC's stated intent to place agents directly in front of clients, removing consultant intermediaries from portions of the delivery chain and billing for agent output rather than billable hours.

Are AI-native boutiques a real competitive threat to MBB and Big Four firms?

At current scale, the brand and relationship gaps are wide. But boutiques like Unity Advisory (with $300 million in private capital) and Monevate are proving that the post-leverage delivery model is viable and financially superior at the unit level. As clients gain AI sophistication and procurement teams question traditional fee structures, the boutiques' cost advantage will become increasingly legible to buyers.

What happens to consulting partner compensation as the leverage model breaks down?

Partner compensation tied to utilization rates and team headcount faces structural pressure as junior cohorts shrink and AI handles their prior output. The move toward outcome-based and subscription-style contracts, which McKinsey and others are piloting, requires a fundamentally different compensation calculus — one that rewards client impact and relationship depth rather than the management of billable pyramid layers.
