Living Software
Exploring the concept of self-improving software products and agents
"Living Software" refers to software built, operated, and monitored by AI in a continuous loop of observation, learning, and self-improvement. In the future, AI agents will be deeply integrated into our software, constantly observing, personalizing, and improving the products we interact with—continuously iterating and fixing issues as they arise.
To enable this self-improving, living software, the AI should be capable of:
- Modifying the software's codebase
- Monitoring and observing the running software, and assessing how any changes it makes to the codebase affect the software's behavior, so that it can debug and self-improve the system
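As a rough sketch, those two capabilities might look something like the following interface. Every name here is hypothetical, not an existing API:

```typescript
// Hypothetical sketch: the two core capabilities of a living-software agent.
// None of this is an existing library; it's just the shape of the idea.

interface CodeChange {
  files: Record<string, string>; // path -> new file contents
  description: string;           // why the agent made this change
}

interface RuntimeObservation {
  errors: string[];                // exceptions, failed requests, etc.
  usageEvents: string[];           // what users actually did
  metrics: Record<string, number>; // latency, conversion, ...
}

interface LivingSoftwareAgent {
  // Capability 1: modify the software's codebase
  proposeChange(goal: string): Promise<CodeChange>;

  // Capability 2: observe the running software and assess how a change
  // affected its behavior, enabling debugging and self-improvement
  observe(sinceDeploy: Date): Promise<RuntimeObservation>;
}
```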
I believe we'll see the emergence of a new category of analytics and logging software built primarily for AI to understand how software runs and how it's used. These analytics products will be more closely integrated with the source code of the products they monitor, and I believe analytics tools and source control platforms will merge over time. To incorporate learnings about the running software back into the static codebase, AI needs to understand how source code changes affect system and user behavior: it should be able to monitor the running software, study its usage patterns, and analyze how data flows through the system. These analytics can then guide the AI's decisions about which codebase modifications will enhance the product and resolve emerging issues.
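One concrete way this merger could work is for every analytics event to carry source-control metadata, so an AI can trace runtime behavior back to the commit and code that produced it. A hypothetical event shape (none of these names come from an existing product):

```typescript
// Hypothetical: an analytics event annotated with source-control metadata,
// so an AI can map runtime behavior back to specific code.
interface SourceLinkedEvent {
  name: string;              // e.g. "meal_logged" or "checkout_error"
  timestamp: number;
  userId?: string;
  properties: Record<string, unknown>;
  source: {
    commitSha: string;       // the deployed revision that emitted the event
    file: string;            // e.g. "app/meals/log.ts"
    line: number;
  };
}

// An AI agent could then answer questions like "which commit introduced
// the spike in checkout_error events?" by grouping events by commit.
function groupByCommit(events: SourceLinkedEvent[]): Map<string, SourceLinkedEvent[]> {
  const byCommit = new Map<string, SourceLinkedEvent[]>();
  for (const event of events) {
    const bucket = byCommit.get(event.source.commitSha) ?? [];
    bucket.push(event);
    byCommit.set(event.source.commitSha, bucket);
  }
  return byCommit;
}
```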
One way to think about this new type of “living software” is that codebases and analytics will become tightly integrated, enabling AI feedback and debugging loops. AI agents should be able to use these new analytics tools to improve both the software they're working on and potentially their own capabilities.
Living software is dynamic: it is self-improving, self-healing, self-adapting, self-assembling, and self-directing. Living software is a cybernetic system, which is just a fancy way of saying it's a system with a built-in feedback loop for deciding what to do next.
This lens on software can be applied both to what we traditionally think of as SaaS software, and also to the emerging category of AI Agents as software.
Humans in the loop
Until now, humans have been essential in monitoring how software is used. They've had to observe the running application, noting its errors and user behavior. Based on these observations, they then modify the source code to improve the product.
In the next phase, AI will not only assist in coding by modifying the static source, but also observe the software's running processes, and its own. It will analyze how the software is commonly used and how it operates, using this information to guide modifications and adaptations to the codebase. This move towards living, introspective software will be a radical break from our current paradigm of software.
How we build software today
Let’s think about the way we build software today, and contrast it with what’s to come.
Let's consider the process of building and operating a SaaS web application. Say we're building a nutrition app. We first pick a language and web framework: React on the front end with a framework like NextJS, a common web stack in 2024. We decide to use a hosted version control system like GitHub so that we can version and store our code somewhere we trust, and collaborate with others. This is our codebase. It's static: a bunch of text files that need to be compiled and run somewhere in order to become the running application.
In terms of the current developer workflow, we'd typically iterate on our codebase by running a local instance of the service (e.g. http://localhost:3000) and trying out this local application in our browser. We'd have a workflow where we iterate on changes and bug fixes locally, watching our terminal, and using the browser to check for changes and issues with the application.
Eventually, we would deploy the code to some server that runs our software; in our case, given it's a NextJS application, we might opt for an infrastructure provider like Vercel to host our code.
Now that our code is running on a production server, we rely on analytics, monitoring and logging tools to let us know what’s happening in our running software.
Are users hitting errors? Are some parts slow? Do our customers understand how to use the product?
In order to gather these insights, we might use a variety of tools:
- Analytics services. PostHog, Amplitude, Heap, and Fullstory help us understand how users use our product. We record and watch user sessions, and we log events corresponding to actions in the product to help us understand user journeys.
- Logging & application monitoring services. Services like Sentry and Datadog help us to glean insights into performance or code issues.
- Actually using the product - user & automated testing services. Whenever we ship a new feature, we might open up our browser and click through our web application's common user flows to make sure everything runs smoothly and new features work as expected. As our project grows in complexity, we may also automate some of these manual tests using browser automation frameworks like Playwright or Puppeteer, which let us simulate how humans interact with the common flows in our product.
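For example, a Playwright test for a common flow in our hypothetical nutrition app might look something like this; the routes, labels, and copy are made up for illustration:

```typescript
// Hypothetical Playwright test for the nutrition app's "log a meal" flow.
import { test, expect } from '@playwright/test';

test('user can log a meal', async ({ page }) => {
  await page.goto('http://localhost:3000');

  // Walk through the flow the way a user would.
  await page.getByRole('link', { name: 'Log a meal' }).click();
  await page.getByLabel('Meal name').fill('Oatmeal');
  await page.getByLabel('Calories').fill('350');
  await page.getByRole('button', { name: 'Save' }).click();

  // Verify the new entry shows up.
  await expect(page.getByText('Oatmeal')).toBeVisible();
});
```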
To summarize today’s software loop:
- Humans gather insights from today's running applications by testing the software and using monitoring, analytics, and logging services
- Humans modify the code to improve the product and respond to issues
To get to living software, we need AI that can leverage these same techniques: AI will observe and interact with the running product to see how it's used and how it works, then turn those learnings into improvements to the codebase, creating an end-to-end AI software development loop.
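Sketched in TypeScript, that end-to-end loop might look something like this. Every type and function below is hypothetical; the point is just the shape of the cycle:

```typescript
// A minimal sketch of an end-to-end AI software development loop.
// All of these declarations are hypothetical stand-ins.

type Signals = { errors: string[]; usage: string[] };
type Task = { description: string };
type Change = { files: Record<string, string> };

declare function collectRuntimeSignals(): Promise<Signals>;   // analytics, logs, errors
declare function prioritizeNextTask(s: Signals): Promise<Task>;
declare function implementChange(t: Task): Promise<Change>;   // edit the static codebase
declare function deploy(c: Change): Promise<void>;
declare function testUserFlows(c: Change): Promise<boolean>;  // e.g. drive a browser
declare function rollback(c: Change): Promise<void>;
declare function recordOutcome(t: Task, c: Change, ok: boolean): Promise<void>;

async function livingSoftwareLoop(): Promise<void> {
  while (true) {
    const signals = await collectRuntimeSignals();  // 1. observe the running product
    const task = await prioritizeNextTask(signals); // 2. decide what to work on
    const change = await implementChange(task);     // 3. modify the codebase
    await deploy(change);
    const ok = await testUserFlows(change);         // 4. verify the change helped
    if (!ok) await rollback(change);
    await recordOutcome(task, change, ok);          // 5. remember the outcome
  }
}
```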
AI Engineering Phase 1 - AI that helps you code what you want
Over the past few years, with GitHub Copilot and Cursor, we entered the first phase of AI engineering: AI that helps us implement the features and bug fixes we ask it for. Humans are still the ones picking what to work on, and watching the running software and broader market to decide what to implement next. LLMs help in the execution once the plan is set.
AI Engineering Phase 2 - AI that figures out what it should code
In the next phase of AI engineering and AI agents, the AI will also monitor the running software, and how it’s being used, in order to decide what to work on next.
I believe the next iterations of Integrated Development Environments (IDEs), version control systems like GitHub, and AI agents such as GitHub Copilot will integrate more closely with analytics, logging, browser automation, and API testing tools. Alternatively, analytics tools may be restructured for AI consumption, enabling AI agents to interact with these services and determine what to tackle next.
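As one illustration, an analytics service restructured for AI consumption might expose its queries as tool definitions an agent can call. Here is a hypothetical tool spec in the JSON-schema style that most function-calling LLM APIs use today; the tool name and fields are invented for this sketch:

```typescript
// Hypothetical tool definition exposing an analytics query to an AI agent.
const queryAnalyticsTool = {
  name: 'query_analytics',
  description:
    'Query product analytics for the running application: error rates, ' +
    'funnel drop-off, and usage counts for a named event over a time window.',
  parameters: {
    type: 'object',
    properties: {
      event: { type: 'string', description: 'Event name, e.g. "meal_logged"' },
      metric: { type: 'string', enum: ['count', 'error_rate', 'funnel_dropoff'] },
      sinceHours: { type: 'number', description: 'Look-back window in hours' },
    },
    required: ['event', 'metric', 'sinceHours'],
  },
};

// The agent calls this tool, reads the result, and uses it to decide
// which bug fix or feature to tackle next.
```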
Another framing of all this is that the boundary between the application and the AI agent will blur. We can envision new frameworks and editing experiences where the AI agent building the software and the software itself become nearly indistinguishable. Today, we have human teams that build, ship, and monitor software—they're the "agents." In the next era, we'll have autonomous organizations of AI agents deeply integrated with the software they're building, further blurring the line between builders and their creations.
For a web app, we're likely to see an AI agent that:
- Manages the static codebase and deploys changes to the production environment
- Processes data streams from analytics, logging, and other sources about the running software to determine which features and bug fixes to implement next
- Uses a browser or simulator to test the software and verify that new features work as expected—an essential part of enabling agentic debug loops in SaaS software
- Recalls past software issues and learns from history, much like a human software engineer: "We encountered this error before and fixed it by doing X," or "We tried implementing this feature using another approach but ran into issues when integrating it with that service." A sketch of what this memory might look like follows this list.
- Builds a comprehensive mental model encompassing the codebase, operational environment, context, user objectives, history, and how all these elements interrelate
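For the memory bullet above, the agent's history could be as simple as structured records of past incidents and attempted fixes, retrieved whenever a similar problem recurs. All names here are hypothetical:

```typescript
// Hypothetical memory record for an AI engineering agent: past issues,
// what was tried, and how it turned out.
interface EngineeringMemory {
  problem: string;            // "Checkout page threw a hydration error"
  attempts: {
    approach: string;         // "Moved data fetching to the server component"
    outcome: 'fixed' | 'failed' | 'partial';
    notes?: string;           // "Broke when integrating with the payments service"
  }[];
  relatedFiles: string[];     // where in the codebase this lives
  commitShas: string[];       // the changes involved
}

// Retrieval could be plain keyword search or embedding similarity; either
// way, the agent consults its memory before re-attempting a fix.
declare function recallSimilar(problem: string): Promise<EngineeringMemory[]>;
```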
“Know thyself”: What sentient software means for AI agents
Given that AI agents are themselves software, we can apply this same idea of an observational, self-referential loop: AI agents watch how they perform, then use those analytics to adapt their own software and strategies based on observed outcomes.
There's an analytics package to be built for AI agents, where the analytics can be easily streamed into the AI agent's knowledge and goal direction. The AI agent can then use the information from its past runs to adapt its own software and future strategies. AI agents could also publish and share strategies they've tried before, so that they can all learn from one another's interactions with the environment.
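A minimal version of such an agent-analytics package might simply record each run's strategy and outcome, then feed summaries of past runs back into the agent's context on the next run. A hypothetical sketch:

```typescript
// Hypothetical self-analytics for an AI agent: log each run's strategy and
// outcome, then feed a summary of past runs into the next run's context.

interface RunRecord {
  goal: string;
  strategy: string;          // what the agent tried
  succeeded: boolean;
  observations: string[];    // what it noticed along the way
}

const history: RunRecord[] = [];

function recordRun(run: RunRecord): void {
  history.push(run);
}

// Summarize past runs into context the agent can condition on next time.
// Agents could also publish these records so other agents learn from them.
function lessonsFor(goal: string): string[] {
  return history
    .filter((run) => run.goal === goal)
    .map((run) => `${run.strategy}: ${run.succeeded ? 'worked' : 'failed'}`);
}
```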
Each AI agent can note what worked and what didn't, and also modify its software and the tools it has access to in order to approach the same problem more effectively in future iterations. The idea of sentient software applied to AI agents feels like the sort of self-improvement takeoff loop that has potentially dangerous implications - especially when we think of a society of AIs learning together and exchanging insights about which strategies to try and how to self-modify their code.
Note that this process looks a lot like our own human introspection and social interaction. We try things in the real world, observe what worked and what didn't, note it for the future, and learn from others' experience. We also reflect on what capabilities and tools we should develop to be more successful in future interactions with the environment: we are introspective tool builders that constantly adapt. We modify our own approaches and behavior, and the better we are at introspecting, the better we become at suiting ourselves to our environments. Self-aware and evolutionary are two other words we could use to describe this new frontier of software.
I can't help but think back to Douglas Hofstadter's book "I Am a Strange Loop" whenever I am contemplating AI agents. The quote on the Wikipedia page is a good summary:
In the end, we are self-perceiving, self-inventing, locked-in mirages that are little miracles of self-reference. — Douglas Hofstadter, I Am a Strange Loop, p. 363
[In the book] he demonstrates how the properties of self-referential systems, demonstrated most famously in Gödel's incompleteness theorems, can be used to describe the unique properties of minds
These ideas of self-improving tool-building loops and introspective software are both really exciting and should give us some pause. At the very least, it's hard to see how what we understand to be software doesn't fundamentally change in the next decade (possibly much, much sooner if these loops come into effect).