What "Level 5 Architecture" Actually Means
Two frameworks share the "Level 5 architecture" label. The first is OpenAI's five-level AI capability framework, where Level 5 means AI that can run an entire organization. The second is the academic agent-autonomy framework, where Level 5 means fully autonomous multi-agent systems. Both matter for this argument, and today's most ambitious deployments operate at Levels 3–4 of either, not Level 5.
OpenAI's five levels were shared with employees on July 9, 2024, according to Bloomberg's reporting on the internal classification system [1]:
| Level | OpenAI Capability Framework | Agent Autonomy Framework |
|---|---|---|
| 1 | Chatbots — conversational AI | Human-led, AI assists discrete tasks |
| 2 | Reasoners — human-level problem solving | AI executes with human approval per step |
| 3 | Agents — can take actions on behalf of users | AI executes multi-step plans, human reviews output |
| 4 | Innovators — can aid in invention | AI plans and executes, human supervises domain |
| 5 | Organizations — can do work of an entire organization | Fully autonomous multi-agent systems |
OpenAI executives told staff the company was at Level 1 at the time, on the cusp of reaching Level 2 [1]. The AI Insider's parallel breakdown [2] confirms the same five-tier sequence: Chatbots, Reasoners, Agents, Innovators, Organizations. An academic preprint [3] formalizes the second framework, describing Level 5 as "fully autonomous multi-agent systems" with examples like Cognition's Devin and GitHub Copilot Workspace operating in research and early production.
Devin is the most concrete reference point for what "Level 5-adjacent" looks like in production. Cognition's own annual review [4] is candid about what the system actually does at scale. "Devin replaces tasks, not roles, acting as a force multiplier," the company writes, with the caveat that "architectural decisions and high-level product logic still require human oversight." IBM's coverage [5] confirms the production deployments: Goldman Sachs, Santander, Nubank, and engineering teams at thousands of companies, all operating on the same hybrid model.
The most advanced architectures imaginable, deployed at the most demanding firms in the world, still pair AI with senior human judgment. Level 5 is the thought experiment. The reality is more interesting— and the labor data on who AI is actually replacing reveals what that ceiling means for staffing.
The Data Nobody Is Talking About — AI Is Seniority-Biased
Stanford's primary research shows AI is seniority-biased: it substitutes for entry-level workers while complementing the tacit knowledge of senior professionals. Workers ages 22–25 in AI-exposed occupations have experienced a 16% relative employment decline. Older workers in the same roles have stayed stable or grown.
The Stanford Digital Economy Lab paper [6], published November 2025 and titled "Canaries in the Coal Mine," uses high-frequency administrative payroll data from ADP to track what's actually happening across the U.S. labor market. The paper's headline finding: early-career workers (ages 22–25) in the most AI-exposed occupations experienced a 16% relative decline in employment after controlling for firm-level shocks, while employment for more experienced workers in the same occupations remained stable or grew [6].
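To make "relative decline" concrete, here is a minimal sketch of the arithmetic, assuming two employment indices normalized to 100 at a baseline date. The index values and the function name `relative_change` are illustrative assumptions, not figures from the paper, and the paper's own estimate also controls for firm-level shocks, which this toy calculation does not.

```python
def relative_change(young_start, young_end, older_start, older_end):
    """Change in young-worker employment relative to older workers.

    Compares the two groups' growth ratios, so -0.16 means the young group
    ended 16% below where it would be had it tracked the older group.
    """
    young_growth = young_end / young_start
    older_growth = older_end / older_start
    return young_growth / older_growth - 1.0


# Hypothetical indices (baseline = 100); NOT data from the Stanford paper.
change = relative_change(young_start=100.0, young_end=88.2,
                         older_start=100.0, older_end=105.0)
print(f"{change:.1%}")
```

In this made-up case the young group's own employment fell 11.8%, but because the older group grew 5%, the relative gap is 16%. A relative decline measures divergence between groups, not an absolute drop.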
A companion finding from the working paper [7] sharpens it. The declines are concentrated in occupations where AI is more likely to "automate, rather than augment, human labor." Where AI augments, the declines don't appear. Software engineers ages 22–25 have seen employment fall nearly 20% since late 2022, according to Stanford postdoctoral researcher Bharat Chandar [8].
The mechanism has a name in the research literature: seniority-biased technological change [9]. It's distinct from the skill-biased technological change economists tracked through the 1990s and 2000s. Tacit knowledge, the kind learned on the job and hard to verbalize, is what AI can't replicate, and it's what experienced workers carry.
Three mechanisms drive the bias:
- Codifiability. Tasks that can be written down clearly are easiest to automate. Junior work tends to live there.
- Tacit knowledge. Pattern recognition and contextual judgment built through years of execution don't transfer into prompts.
- Situational judgment under stakes. When the cost of being wrong is real, AI assists but doesn't decide.
The contrasting view deserves a fair hearing. Microsoft AI chief Mustafa Suleyman has predicted human-level performance on most professional tasks within 12–18 months [10], and he's not alone in that forecast. Stanford economist Erik Brynjolfsson draws a different conclusion from the labor data: "I don't want to stop automation. What I'd like to do is increase augmentation and restore a little bit of a balance so that we do them equally" [11]. The trajectory of AI capability is real. So is the gap between capability and reliable production. Both are true.
Why Mid-Level Is the Unexpected Moat
Mid-level professionals are not the next wave of AI displacement. They're the bottleneck that determines whether AI deployment actually works— the layer that translates expert judgment into agent guardrails, validates outputs against stakes, and absorbs the edge cases no model handles cleanly.
McKinsey's State of Organizations 2026 [12] frames the shift directly. "Experts will move from being the ones everyone queues for to being the ones who encode judgment, translating tacit mastery into rules, thresholds, data, and training regimes that agents can run under supervision." Managers move from supervising tasks to orchestrating hybrid systems. Up to 30% of hours can be automated through AI agents, McKinsey projects, but the architecture of that automation runs through human judgment at every consequential edge.
Harvard Business Review's research on the "last mile" problem [13] explains what blocks AI deployments from making it into production. "The AI 'last mile' is not blocked by technology but by unresolved questions of operating models, governance, and human identity." The biggest bottleneck is organizational design, not the model. And the people who absorb the complexity of operating models, governance, and edge-case judgment are mid-level professionals.
Klarna is the canonical case. The fintech replaced roughly 700 customer service workers with AI in 2024. By 2025–2026, the company had reversed course [14], rebuilding human capacity into a hybrid model after AI failed on multi-step disputes, emotional escalation, and edge cases. AI handled the routine queries fine. The work that broke it, exactly the work mid-levels do, wouldn't compress into prompts.
Mid-level work is three things AI doesn't reliably do:
- Orchestrate. Coordinate workflows across teams, tools, and stakeholders with shifting priorities.
- Validate. Catch the model's plausible-but-wrong output against the real consequences of being wrong.
- Escalate. Know when something has crossed from a problem they can solve into a problem the senior partner needs to see.
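The validate and escalate steps above can be sketched as a small routing function. This is a minimal illustration, assuming a firm has written down a few senior-supplied rules; the `Guardrail` class, the field names, and every threshold are hypothetical and not taken from any cited source.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Guardrail:
    name: str
    triggered: Callable[[dict], bool]  # True when the draft trips this rule
    action: str                        # "review" (mid-level) or "escalate" (senior)


# Hypothetical rules and thresholds, for illustration only.
GUARDRAILS = [
    Guardrail("large cost impact", lambda d: d.get("cost_impact_usd", 0) > 25_000, "escalate"),
    Guardrail("life-safety scope", lambda d: d.get("life_safety", False), "escalate"),
    Guardrail("schedule slip", lambda d: d.get("slip_days", 0) > 5, "review"),
]


def route(draft: dict) -> tuple:
    """Return (decision, names of rules that fired) for one AI-produced draft."""
    fired = [g for g in GUARDRAILS if g.triggered(draft)]
    if any(g.action == "escalate" for g in fired):
        decision = "escalate"      # senior judgment needed
    elif fired:
        decision = "review"        # mid-level validates before release
    else:
        decision = "proceed"       # routine output, spot-check later
    return decision, [g.name for g in fired]


print(route({"cost_impact_usd": 4_000, "slip_days": 9}))
```

The rules live in data rather than in someone's head, which is the point: a mid-level operator can read, question, and adjust them as edge cases surface.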
A federal grant writing consultant I worked with put it cleanly. "It doesn't replace a grant writer," he told me. "The magic is when you've got someone with deep content expertise. And you pair that with AI. Neither one of those things, I think, are as strong alone." After a decade of writing federal grants, he didn't get replaced by AI. He got amplified by it: the formulaic narrative drafting and opportunity summarization moved to custom tools he built, freeing him to do the judgment work AI couldn't touch. Anthropic's own usage data [15] reflects the same pattern: augmentation is growing on Claude.ai, with senior workers extending what they can do rather than getting replaced.
The Talent Pipeline Death Spiral
The compounding risk is structural. If firms stop hiring entry-level workers because AI handles their tasks, the pipeline of future mid-level professionals collapses in 5 years and the senior layer collapses in 10. The roles AI complements are filled by people whose career path AI is closing.
The mechanics are simple and unforgiving:
- Year 1–3: Junior hiring drops because AI does codifiable junior work.
- Year 4–7: The mid-level layer thins because the people who would have grown into it never entered.
- Year 8–10: The senior layer cracks because the mid-levels who would have grown into it never accumulated the reps.
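The timeline above can be sketched as a toy cohort model. This is a rough illustration under stated assumptions, not a forecast: the tenure bands (junior = years 0–2, mid = 3–6, senior = 7+), the 10% annual attrition, the 25-year career length, and the hiring numbers are all hypothetical.

```python
def simulate_pipeline(years, hires_before, hires_after, attrition=0.10,
                      max_tenure=25):
    """Toy cohort model: headcount by tenure band after a junior-hiring cut.

    Bands are assumptions: junior = tenure 0-2, mid = 3-6, senior = 7+.
    """
    # Start from the steady state implied by the old hiring rate.
    cohorts = [hires_before * (1 - attrition) ** t for t in range(max_tenure)]
    history = []
    for year in range(1, years + 1):
        # Survivors age one year; the oldest cohort retires; new hires enter.
        cohorts = [hires_after] + [c * (1 - attrition) for c in cohorts[:-1]]
        history.append({
            "year": year,
            "junior": round(sum(cohorts[0:3]), 1),
            "mid": round(sum(cohorts[3:7]), 1),
            "senior": round(sum(cohorts[7:]), 1),
        })
    return history


# Hypothetical firm: 10 junior hires per year, cut to 2.
for row in simulate_pipeline(years=10, hires_before=10, hires_after=2):
    print(row)
```

Under these assumptions the junior band shrinks immediately, the mid band holds through year 3 and starts thinning in year 4, and the senior band holds through year 7 and starts thinning in year 8. The delay is what makes the cut look free at first.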
Stanford's data indicates the pipeline is already breaking [6]. Early-career employment in the most AI-exposed occupations has fallen 16% in relative terms. CNBC [16] frames it bluntly: "AI is not just ending entry-level jobs. It's the end of the career ladder as we know it." Wharton's research on the same dynamic [17] reinforces the point: AI is dismantling the steps that historically allowed people to build careers from execution into judgment.
"AI is dismantling the steps that historically allowed non-graduates to build white-collar careers." (Knowledge at Wharton [17])
The asymmetric risk is what should keep founders up. Firms that stop hiring juniors win short-term margin and lose 5-year capability. Markets clearing 5–10 years from now doesn't help your 2027 hiring problem. This is a fractional AI vs fractional CTO conversation worth having now, before the pipeline math forces it.
What AEC Firms Specifically Should Do
AEC firms face the convergence acutely. 92% of firms report difficulty filling roles, only 27% of AEC professionals use AI, and nearly half of early adopters have reclaimed 500–1,000 hours on critical tasks. The window is right now, and the move is to augment existing mid-levels, accelerate apprenticeship through AI-paired learning, and encode senior judgment into agent guardrails rather than wait for tools to mature.
The numbers come from ASCE's December 2025 industry survey [18][19]. Only 27% of AEC professionals currently use AI in their operations. Of the early adopters, 95% use AI frequently across the building lifecycle, and nearly half have reclaimed 500–1,000 hours on critical tasks like scheduling, planning, and document analysis. Meanwhile, 92% of firms reported difficulty filling craft and salaried positions, and 84% plan to increase technology investment in 2026.
Bluebeam's 2025 report on the same industry [20] names what's actually slowing adoption. "The biggest barriers to AEC technology adoption in 2026 aren't cost," the report concludes. "They're complexity, culture and connection." That's an important distinction. Cost is a finance problem. Complexity, culture, and connection are leadership problems, and they're the ones AI implementation actually requires.
A useful three-part framework for AEC founders:
- Augment, don't replace. Pair mid-level architects, project managers, and senior PMs with AI agents that handle the codifiable work— spec parsing, document summarization, scheduling drafts, narrative writing. The senior judgment stays human. The repetitive execution moves to AI.
- Redesign apprenticeship. Use AI-paired learning to compress the apprenticeship curve from years to months. Junior staff working alongside AI agents get senior judgment encoded into the systems they use daily. This is how you preserve the pipeline while still capturing the productivity gain.
- Encode judgment. Capture senior expertise into rules, training data, and guardrails the agents run under. This is the work McKinsey describes when experts move from being "the ones everyone queues for" to being "the ones who encode judgment" [12]. It's also how a firm builds durable IP that doesn't walk out the door when a senior PM retires.
This is what augmenting a mid-level operator looks like in practice. A fractional COO I work with supports five companies in 30 hours a week— producing 50 pages of brand-authentic content in an hour where it used to take weeks. She isn't doing the same job faster. She's doing a job that didn't exist at her level before, because AI handles the volume work while her judgment stays where the stakes are. For AEC firms, the analog is your senior PM running three projects instead of two— once they've built the AI-paired workflows that make that possible— with AI handling the document churn that used to consume the third slot.
The choice between AI consultant vs in-house capacity is downstream of this question. Get the architecture right first, then decide who builds it.
The Architecture Decision Facing Founders
Stop treating this as a hiring question. The real question is architectural: how should you wire AI and humans together now, given that the wiring determines your firm's capability surface in 2030? The decisions you make about which work AI does, which work humans do, and where the handoffs sit are decisions about what your firm can do five years from now.
Three architecture decisions every founder is making right now, whether they know it or not:
- Which work AI does. Codifiable, repeatable, low-stakes-of-being-wrong tasks. Document summarization, scheduling drafts, formulaic narrative writing, opportunity scoring.
- Which work humans do. Judgment-bearing, stake-laden, context-heavy work. Stakeholder relationships, design intent, regulatory judgment, project recovery.
- Where the handoffs sit. Orchestration (who initiates what), validation (who catches errors), escalation (who decides when to break the system).
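One way to make these three decisions explicit is to tag each task with the two properties the list turns on, codifiability and stakes, and derive the owner from them. The sketch below is a simplification under that assumption; the `Task` and `Owner` names, the boolean scoring, and the "permit application draft" example are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Owner(Enum):
    AI = "AI drafts, human spot-checks"
    HYBRID = "AI drafts, human validates every output"
    HUMAN = "human owns, AI assists on request"


@dataclass(frozen=True)
class Task:
    name: str
    codifiable: bool   # can the steps be written down clearly?
    high_stakes: bool  # is being wrong costly?


def assign(task: Task) -> Owner:
    """Map a task to an owner; the hybrid case is where the handoff sits."""
    if task.codifiable and not task.high_stakes:
        return Owner.AI
    if task.codifiable and task.high_stakes:
        return Owner.HYBRID
    return Owner.HUMAN


tasks = [
    Task("document summarization", codifiable=True, high_stakes=False),
    Task("permit application draft", codifiable=True, high_stakes=True),  # hypothetical
    Task("regulatory judgment", codifiable=False, high_stakes=True),
]
for t in tasks:
    print(f"{t.name}: {assign(t).value}")
```

Real tasks rarely reduce to two booleans, but writing the mapping down forces the conversation about where validation and escalation live.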
Anthropic's data [15] makes the stakes concrete. About 49% of jobs have seen at least a quarter of their tasks performed using Claude, with coding accounting for 35% of conversations on the consumer side. The architecture is being made. The only question is whether it's being made deliberately.
This is where I think about AI as intellectual augmentation rather than artificial intelligence. Same letters, different orientation. AI amplifies human capability— the architecture question is which capability. How you wire AI and humans together today determines your firm's capability surface in 2030. And the firms that get this right won't be the ones that automated the most. They'll be the ones that encoded the most judgment.
If mapping this architecture for your firm feels heavy, that's exactly what I do. Helping founder-led firms architect AI implementation is the work of Dan Cumberland Labs: peer-to-peer AI strategy services built around your firm's specific judgment, not a packaged playbook. The patterns are knowable. The execution is bespoke. And building AI culture, the complexity, culture, and connection work Bluebeam named, is the part most engagements underweight.
The Real Level 5 Architecture
The Level 5 architecture worth building isn't the AI's. It's your firm's— the deliberate wiring of AI and human judgment that determines what your team can do in 2030. The most advanced AI imaginable still won't hire your next mid-level. It also won't decide which work belongs with humans, which belongs with agents, and where the handoffs live.
That's still your call. Make it deliberately.
The question to bring to your next leadership meeting: which work in our firm is currently load-bearing on mid-level judgment, and how are we protecting that pipeline for 2030?
The data says AI substitutes for entry-level execution and complements senior judgment. The mid-level layer between them is what turns model capability into firm capability. And the call belongs to the founder.
FAQ
What is Level 5 architecture in AI?
Level 5 architecture refers to AI capable of doing the work of an entire organization (the highest tier of OpenAI's five-level capability framework) or, in agent-autonomy frameworks, fully autonomous multi-agent systems. OpenAI introduced the framework in July 2024 [1], and an academic preprint [3] formalized the parallel agent-autonomy framework. Today's most ambitious deployments operate at Levels 3–4 of either.
What is OpenAI's 5-level AI framework?
OpenAI's five-level framework defines progress toward AGI: Level 1 (Chatbots), Level 2 (Reasoners), Level 3 (Agents), Level 4 (Innovators), Level 5 (Organizations). The framework was shared with employees in July 2024 according to Bloomberg's reporting [1]. The AI Insider's breakdown [2] confirms the same sequence.
What level of AI are we currently at?
OpenAI internally stated in July 2024 it was at Level 1, on the cusp of Level 2 (Reasoners) [1]. Frontier deployments like Cognition's Devin operate at Levels 3–4 of agent autonomy frameworks but still require senior human oversight on architectural decisions, according to Cognition's own annual review [4].
Will AI replace mid-level managers?
Routine-coordination middle management roles are at risk, but judgment-bearing mid-level professionals are becoming more valuable as they orchestrate AI agents and carry the tacit knowledge AI cannot replicate. McKinsey's State of Organizations 2026 [12] describes the shift as one where managers move from supervising tasks to orchestrating hybrid systems.
Why is AI more likely to replace entry-level than senior workers?
AI automates tasks that can be explicitly described, while senior workers carry tacit knowledge (judgment, pattern recognition, contextual awareness) that AI cannot replicate. Researchers term this "seniority-biased technological change" [9]. Workers ages 22–25 in AI-exposed occupations have seen a 16% relative employment decline; older workers in the same roles have stayed stable or grown [6].
What is the "last mile problem" in AI?
The "last mile" is the gap between AI capability and reliable production deployment, blocked not by technology but by operating models, governance, and human oversight requirements. Harvard Business Review [13] frames it as an organizational design problem rather than a model problem, which is why mid-level professionals carrying operational context become the load-bearing layer for AI deployments to actually work.
What is the AI talent pipeline death spiral?
If firms stop hiring entry-level workers because AI handles their tasks, the pipeline of future mid-level and senior workers collapses over 5–10 years, leading to severe shortages of judgment-bearing professionals. Stanford's data [6] shows the pipeline is already breaking, and Wharton's analysis [17] calls it the dismantling of the steps that historically built white-collar careers.
How are AEC firms adopting AI?
Only 27% of AEC professionals currently use AI [18], but nearly half of early adopters report reclaiming 500–1,000 hours on critical tasks. 92% of AEC firms report difficulty filling roles [19]. The biggest barriers are complexity, culture, and connection, not cost, according to Bluebeam's 2025 industry report [20].
References
1. Bloomberg News, "OpenAI Sets Levels to Track Progress Toward Superintelligent AI" (2024). https://www.bloomberg.com/news/articles/2024-07-11/openai-sets-levels-to-track-progress-toward-superintelligent-ai
2. The AI Insider, "What Are OpenAI's Five Levels of AI – And Where Are We Now?" (2024). https://theaiinsider.tech/2024/07/12/what-are-openais-five-levels-of-ai-and-where-are-we-now/
3. arXiv academic preprint, "Levels of Autonomy for AI Agents" (2025). https://arxiv.org/abs/2506.12469
4. Cognition AI, "Devin's 2025 Performance Review: Learnings From 18 Months of Agents At Work" (2026). https://cognition.ai/blog/devin-annual-performance-review-2025
5. IBM, "Meet Devin the AI Software Engineer, Employee #1 in Goldman Sachs' 'Hybrid Workforce'" (2026). https://www.ibm.com/think/news/goldman-sachs-first-ai-employee-devin
6. Stanford Digital Economy Lab, "Canaries in the Coal Mine? Six Facts about the Recent Employment Effects of Artificial Intelligence" (2025). https://digitaleconomy.stanford.edu/publication/canaries-in-the-coal-mine-six-facts-about-the-recent-employment-effects-of-artificial-intelligence/
7. Stanford Digital Economy Lab, "Canaries in the Coal Mine?" working paper PDF (2025). https://digitaleconomy.stanford.edu/wp-content/uploads/2025/11/CanariesintheCoalMine_Nov25.pdf
8. Stanford Digital Economy Lab, "Canaries in the Coal Mine?" working paper PDF (2025). https://digitaleconomy.stanford.edu/wp-content/uploads/2025/11/CanariesintheCoalMine_Nov25.pdf
9. Hosseini and Lichtinger, "Generative AI as Seniority-Biased Technological Change: Evidence from U.S. Résumé and Job Posting Data" (2025). https://www.alejandrobarros.com/wp-content/uploads/2025/11/ssrn-5425555.pdf
10. Fortune, "Microsoft AI Chief Gives It 18 Months: For All White-Collar Work to Be Automated by AI" (2026). https://fortune.com/2026/02/13/when-will-ai-kill-white-collar-office-jobs-18-months-microsoft-mustafa-suleyman/
11. Fortune, "First-of-its-kind Stanford study says AI is starting to have a 'significant and disproportionate impact' on entry-level workers in the U.S." (2025). https://fortune.com/2025/08/26/stanford-ai-entry-level-jobs-gen-z-erik-brynjolfsson/
12. McKinsey & Company, "The State of Organizations 2026: Three Tectonic Forces That Are Reshaping Organizations" (2026). https://www.mckinsey.com/capabilities/people-and-organizational-performance/our-insights/the-state-of-organizations
13. Harvard Business Review, "The 'Last Mile' Problem Slowing AI Transformation" (2026). https://hbr.org/2026/03/the-last-mile-problem-slowing-ai-transformation
14. DigitalApplied, "Klarna Reverses AI Layoffs: Why Replacing 700 Failed" (2026). https://www.digitalapplied.com/blog/klarna-reverses-ai-layoffs-replacing-700-workers-backfired
15. Anthropic, "Anthropic Economic Index Report: Learning Curves" (2026). https://www.anthropic.com/research/economic-index-march-2026-report
16. CNBC, "AI is not just ending entry-level jobs. It's the end of the career ladder as we know it" (2025). https://www.cnbc.com/2025/09/07/ai-entry-level-jobs-hiring-careers.html
17. Knowledge at Wharton, "Is AI Pushing Us to Break the Talent Pipeline?" (2026). https://knowledge.wharton.upenn.edu/article/is-ai-pushing-us-to-break-the-talent-pipeline/
18. ASCE (American Society of Civil Engineers), "Architecture, Engineering, Construction Sector Slow to Adopt AI, Survey Shows" (2025). https://www.asce.org/publications-and-news/civil-engineering-source/article/2025/12/18/architecture-engineering-construction-sector-slow-to-adapt-ai-survey-shows
19. ASCE / Bluebeam survey data, "Architecture, Engineering, Construction Sector Slow to Adopt AI, Survey Shows" (2025). https://www.asce.org/publications-and-news/civil-engineering-source/article/2025/12/18/architecture-engineering-construction-sector-slow-to-adapt-ai-survey-shows
20. Bluebeam, "New Bluebeam Report Shows Early AI Adopters in AEC Seeing Significant ROI Despite Uneven Adoption" (2025). https://press.bluebeam.com/2025/10/new-bluebeam-report-shows-early-ai-adopters-in-aec-seeing-significant-roi-despite-uneven-adoption/