
The Shifting Landscape of Software Engineering: From Creation to Supervision in the Age of AI

The rapid integration of Artificial Intelligence into software development workflows is fundamentally altering the role of the professional software engineer, moving their focus from direct code creation towards a more supervisory capacity. This profound shift, according to research and expert analysis, necessitates a redefinition of skills, practices, and ultimately, the very essence of the profession.

The Emergence of the "Middle Loop"

Annie Vella’s research, which surveyed 158 professional software engineers on their use of AI, identified a significant transformation in how engineers allocate their time and effort. Her central question framed the stakes directly: "Are AI tools shifting where engineers actually spend their time and effort? Because if they are, they’re implicitly shifting what skills we practice and, ultimately, the definition of the role itself."

Vella observed a discernible movement away from creation-oriented tasks towards verification. However, this new form of verification differs from traditional code review and testing. She has proposed a new term for this evolving role: "supervisory engineering work." This encompasses the efforts required to effectively direct AI systems, meticulously evaluate their outputs, and diligently correct any errors or inaccuracies.

The traditional software development process is often conceptualized through "inner and outer loops." The inner loop involves the immediate act of coding, testing, and debugging. The outer loop encompasses the broader development lifecycle, including committing code, undergoing reviews, continuous integration and continuous deployment (CI/CD) pipelines, deployment, and system observation. Vella posits that supervisory engineering work occupies a novel space between these established loops, forming what she terms the "middle loop."

"What if supervisory engineering work lives in a new loop between these two loops?" Vella asks. "AI is increasingly automating the inner loop – the code generation, the build-test cycle, the debugging. But someone still has to direct that work, evaluate the output, and correct what’s wrong. That feels like a new loop, the middle loop, a layer where engineers supervise AI doing what they used to do by hand."

Vella’s research concluded in April 2025, before the most recent advances in AI models tailored for software development, but those improvements appear only to have accelerated the shift towards supervisory engineering. This transformation, while not signaling "the end of programming," represents a significant evolution in what it means to be a programmer.

Navigating Uncertainty and Redefining Value

The current environment is characterized by a palpable sense of uncertainty among software engineers. Skills honed over years of dedicated practice are being commoditized, leading to anxiety about career trajectories. Popular narratives often present a dichotomy: either AI will replace jobs entirely, or engineers must "move upstream" into architecture and "higher value" roles. Neither narrative offers practical guidance for the day-to-day realities faced by engineers.

"A lot of software engineers right now are feeling genuine uncertainty about the future of their careers," Vella notes. "What they trained to do, what they spent years upskilling in, is shifting – and in many ways, being commoditised. The narratives don’t help: either AI is coming for your job, or you should just ‘move upstream’ into architecture and ‘higher value’ work. Neither tells you what to actually do on Monday morning."

Vella’s framework of supervisory engineering work and the middle loop offers a grounded perspective on the evolving nature of the profession, providing a descriptive language for the changes engineers are actively experiencing. This underscores the continued relevance and necessity of engineering expertise, albeit in a transformed capacity.

Frameworks for Agentic Engineering

The increasing capabilities of AI in coding have outpaced human proficiency in wielding these tools effectively. This disparity is evident when high scores on AI coding benchmarks do not translate into tangible productivity gains for engineering teams. Bassim Eledath, in his work on "8 levels of Agentic Engineering," highlights this gap.

"AI’s coding ability is outpacing our ability to wield it effectively," Eledath observes. "That’s why all the SWE-bench score maxing isn’t syncing with the productivity metrics engineering leadership actually cares about. When Anthropic’s team ships a product like Cowork in 10 days and another team can’t move past a broken POC using the same models, the difference is that one team has closed the gap between capability and practice and the other hasn’t."

Eledath’s model proposes that this gap is closed incrementally, through eight distinct levels of development: the central idea is a tiered progression in how teams adopt AI and build proficiency with it, each level building on the practices of the one before.

This focus on multi-level frameworks is not unique. Earlier this year, Steve Yegge proposed his own eight-level model in "Welcome to Gas Town." The recurrence of an eight-level structure suggests a shared perception of a phased evolution in how humans and AI interact within the engineering domain. These "Maturity Models," as Martin Fowler describes them, offer useful lenses for understanding the varied approaches to LLM usage and highlight the divergent paths engineers are taking.

Rethinking Code Generation: The Constraint of Replacement

Chad Fowler argues for a fundamental shift in how we conceptualize code generation. In an era where AI can produce code rapidly and efficiently, the primary constraint is no longer the creation of code itself, but its safe and reliable replacement.

"In a world where code can be generated quickly and cheaply, the real constraint has shifted," Fowler states. "The problem is no longer producing code. The problem is replacing it safely."

He further elaborates that "regenerative software does not work if the unit of generation is an application. Regeneration only works if the unit of generation is a component that compiles into a system architecture." This implies a need for architectural designs that facilitate the modular replacement of components, a long-standing goal in software architecture that remains critically important in the context of agentic engineering. Fowler outlines several architectural constraints that simplify this process, emphasizing the value of designing systems as networks of interchangeable parts.
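Fowler's constraint can be made concrete with a small sketch. All names below are illustrative, not drawn from his text: the idea is that callers depend only on a stable interface, a behavioural contract test guards that interface, and any regenerated implementation must pass the contract before it is swapped in.

```python
from typing import Protocol


class RateLimiter(Protocol):
    """Stable contract. The interface is fixed; the implementation behind
    it is the unit of regeneration and can be thrown away and rebuilt."""

    def allow(self, key: str) -> bool: ...


class FixedWindowLimiter:
    """One implementation, perhaps hand-written, perhaps AI-generated."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.counts: dict[str, int] = {}

    def allow(self, key: str) -> bool:
        self.counts[key] = self.counts.get(key, 0) + 1
        return self.counts[key] <= self.limit


def passes_contract(limiter: RateLimiter) -> bool:
    """Behavioural check a regenerated replacement must satisfy before
    it is allowed to replace the old component in the running system."""
    fresh_key_allowed = limiter.allow("user-1")
    # Exhaust the budget, then confirm the limiter refuses further calls.
    for _ in range(1000):
        limiter.allow("user-1")
    over_budget_refused = not limiter.allow("user-1")
    return fresh_key_allowed and over_budget_refused
```

Because callers see only `RateLimiter`, an agent can regenerate `FixedWindowLimiter` wholesale; the contract test, rather than a line-by-line diff, decides whether the replacement is safe.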

The Pedagogy of AI Integration

The challenges of integrating AI extend beyond the technical to the pedagogical. Mike Masnick’s summary of Dadland Maye’s article on AI detection in academic settings offers a cautionary tale: institutions turned AI detection tools on student writing, with troubling results.

The consequence of such measures was a perverse incentive: "We are teaching an entire generation of students that the goal of writing is to sound sufficiently unremarkable! Not to express an original thought, develop an argument, find your voice, or communicate with clarity and power—but to produce text bland enough that a statistical model doesn’t flag it," Maye observed.

A more constructive outcome emerged when Maye ceased mandating AI usage disclosures. This shift redirected the focus from a problem of detection to one of effective utilization. Students began proactively seeking guidance on how to best leverage AI tools for their work, asking questions like how to prompt for research without direct copying or how to discern when an AI-generated summary deviates from its source material. These interactions fostered pedagogical discussions and highlighted the potential for AI to become a subject of instruction rather than a mere disclosure issue.

The imperative now is to educate individuals on how to employ AI tools to enhance their professional output. However, the nascent stage of these technologies means a scarcity of individuals with extensive experience in their optimal application. For seasoned professionals, this period represents a fascinating societal reaction to emerging technology, yet it offers little immediate solace to those charting their future careers.

The Evolving Role of Code Review

Ankit Jain proposes a radical re-evaluation of code review practices, suggesting that humans may not only be relieved of writing code but also of reviewing it. He points to the existing inefficiencies in human-led code review processes, even before the widespread adoption of AI.

"Humans already couldn’t keep up with code review when humans wrote code at human speed," Jain argues. "Every engineering org I’ve talked to has the same dirty secret: PRs sitting for days, rubber-stamp approvals, and reviewers skimming 500-line diffs because they have their own work to do."

Jain advocates a transition to layered evaluation filters, in which AI systems take on increasingly sophisticated roles in assessing code: a move toward automated, scalable, and potentially more consistent validation.
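One way to picture such layered filters, sketched here with deliberately toy checks rather than anything from Jain's proposal: each layer is a cheap pass over the diff, findings accumulate, and a human is pulled in only when a layer flags something.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Finding:
    layer: str
    message: str


# Each layer is a toy predicate over the diff text. A real pipeline would
# run formatters, static analysis, test suites, and LLM reviewers in
# roughly cheapest-first order.
def style_layer(diff: str) -> list[Finding]:
    if " \n" in diff:
        return [Finding("style", "trailing whitespace")]
    return []


def secrets_layer(diff: str) -> list[Finding]:
    if "API_KEY=" in diff:
        return [Finding("secrets", "possible hard-coded credential")]
    return []


LAYERS: list[Callable[[str], list[Finding]]] = [style_layer, secrets_layer]


def review(diff: str) -> list[Finding]:
    """Run every layer in order and collect findings. An empty result
    means the change merges without a human; any finding escalates it."""
    findings: list[Finding] = []
    for layer in LAYERS:
        findings.extend(layer(diff))
    return findings
```

The consistency such a scheme promises comes from the layers never skimming: every diff gets the full stack of checks, and human attention is spent only on escalations.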

There remains an underlying unease for some, Birgitta among them, about the notion that "the code doesn’t matter." Well-crafted code is a precise and clear articulation of an engineer’s intent, and directly modifying it is often more efficient than explaining the desired change to an AI. A precise and understandable representation of the system remains a valuable objective, and agentic AI may prove a powerful tool for achieving it.

However, the proposition of relying on natural language for specifications, such as Behavior-Driven Development (BDD) tests, is met with skepticism. Such specifications can be verbose and ambiguous. While tests are indeed crucial for understanding system behavior, and the future of AI might emphasize testing over direct implementation, the format of these tests requires careful consideration.
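The ambiguity problem is easy to demonstrate with a made-up example: the prose spec below admits several behaviours, while a few lines of executable test pin down exactly one.

```python
# Prose spec (BDD-flavoured, and ambiguous):
#   "Given a user with a balance, when they withdraw more than they have,
#    then the withdrawal should be rejected."
# Rejected how? An exception? A returned error code? A partial withdrawal?


def withdraw(balance: int, amount: int) -> int:
    """One precise reading: refuse by raising, leaving the balance alone."""
    if amount > balance:
        raise ValueError("insufficient funds")
    return balance - amount


def test_overdraft_is_refused_by_exception() -> None:
    # The executable test commits to a single interpretation of "rejected".
    try:
        withdraw(balance=10, amount=20)
    except ValueError:
        return
    raise AssertionError("overdraft was not refused")
```

This is the trade-off the skepticism points at: the prose is readable by anyone, but only the test is unambiguous enough for an agent, or a human, to verify against.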

The New Paradigm: Serving the Agents

The evolving relationship between humans and AI in software engineering is prompting reflections on leadership and operational models. Jessica Kerr’s observation, "The new servant leadership: we serve the agents by telling what to do 9/9/6," a wry nod to the punishing "996" work schedule, encapsulates a potential future where human effort is directed towards guiding and managing AI systems, a stark contrast to traditional hierarchical structures. It points to a fundamental recalibration of roles, with human input focused on strategic direction and oversight rather than granular execution.

The implications of these shifts are far-reaching, demanding adaptability from individuals and institutions alike. As AI continues its inexorable advance, the software engineering profession stands at a critical juncture, poised to redefine its purpose and its practitioners’ indispensable contributions. The path forward lies not in resisting these changes, but in understanding, adapting, and ultimately mastering the new landscape of supervisory engineering.
