Self-Improving Agents & Knowledge Graphs: The Experimental Flywheel
A couple of weeks ago, I wrote a LinkedIn post about the Chinese robot marathon with a funny GIF and a very short message. It outperformed every piece of content Cici wrote by 5X. That’s a great data point, but what should we do about it?
Over the last two weeks, I took Cici offline and built content manually again. I got 6X the impressions, and course sales returned to the pre-Cici baseline. Again, that’s great data, but why is my content more effective than Cici’s?
We want to improve content performance, and we have clearly found something that works. How do we learn from our success and repeat it? It all starts with the experiment. In a prior article, I explained the architecture of self-improving knowledge graphs. This article digs deeper into the implementation of the experimental flywheel.
Self-improvement comes in three phases.
What went wrong or, in this case, right? We need diagnostic capabilities. Why is my content better than Cici’s?
Knowing what went right or wrong, what should we do about it? We need prescriptive capabilities that tell us which actions are available to us.
Given multiple options for our next action, which one is most likely to deliver the biggest impact on sales within our constraints? We need predictive capabilities.
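To make the three phases concrete, here is a minimal sketch of how they might appear as interfaces in an agentic workflow. Every name here (Action, Diagnostic, Prescriptive, Predictive, next_best_action) is my own illustration, not part of Cici’s actual codebase.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Action:
    """A candidate next step, e.g., 'post short-form content with a GIF'."""
    name: str
    expected_cost: float


class Diagnostic(Protocol):
    def explain(self, outcome: dict) -> list[str]:
        """Return the factors that drove an outcome (what went right or wrong)."""
        ...


class Prescriptive(Protocol):
    def recommend(self, causes: list[str]) -> list[Action]:
        """Given those factors, return the actions available to us."""
        ...


class Predictive(Protocol):
    def score(self, action: Action) -> float:
        """Estimate the sales impact of an action under our constraints."""
        ...


def next_best_action(diagnose: Diagnostic, prescribe: Prescriptive,
                     predict: Predictive, outcome: dict) -> Action:
    """Chain the three phases into one decision."""
    causes = diagnose.explain(outcome)      # phase 1: why did this happen?
    options = prescribe.recommend(causes)   # phase 2: what can we do about it?
    return max(options, key=predict.score)  # phase 3: which option wins?
```

The point of the sketch is the chain: each phase consumes the previous phase’s output, so a weak diagnosis poisons everything downstream.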
Bad news: LLMs are terrible at all three, so we need something to take their place in the agentic workflow. Remember, this all looks feasible with human effort only because we are looking at a single small workflow running at a small scale. Ramp it up, and people cannot keep up.
We could train predictive, prescriptive, and diagnostic models, but what if we lack the data to do it? That is the reality of most businesses, especially startups. They have enough data to train descriptive models, but our diagnostic, prescriptive, and predictive questions demand something much more reliable.
More bad news: our knowledge graph is not ready to tackle the tough question of why my content is more effective than Cici’s. We need to fill in foundational parts of the graph before it can reveal the answer. But how do we find those gaps?
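One way to surface gaps, as a sketch: walk the graph and flag entities that are missing the relationships a diagnosis would need. The schema below (Post nodes with HAS_FORMAT and HAS_TOPIC edges) is an assumption for illustration, not the actual graph from the prior article.

```python
import networkx as nx

# Illustrative requirement: a Post should link to a Format and a Topic
# before the graph can explain *why* one post outperforms another.
REQUIRED_EDGE_TYPES = {"HAS_FORMAT", "HAS_TOPIC"}


def find_gaps(g: nx.MultiDiGraph) -> dict[str, set[str]]:
    """Return, per Post node, the foundational edge types it is missing."""
    gaps = {}
    for node, attrs in g.nodes(data=True):
        if attrs.get("label") != "Post":
            continue
        present = {data["type"] for _, _, data in g.out_edges(node, data=True)}
        missing = REQUIRED_EDGE_TYPES - present
        if missing:
            gaps[node] = missing
    return gaps


# A post with a format but no topic shows up as a gap to fill.
g = nx.MultiDiGraph()
g.add_node("post:marathon", label="Post")
g.add_node("format:gif", label="Format")
g.add_edge("post:marathon", "format:gif", type="HAS_FORMAT")
print(find_gaps(g))  # {'post:marathon': {'HAS_TOPIC'}}
```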
In the last article in this series, I explained how to build a simple knowledge graph from a relational database. I am assuming that you are starting with something similar.
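If you do not have that article handy, here is a toy recap of the idea under an assumed schema: rows become nodes, and foreign keys and shared values become typed edges. The table names, columns, and numbers below are placeholders, not the schema from the prior article.

```python
import sqlite3
import networkx as nx

# Assumed toy schema; your tables will differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, format TEXT);
    CREATE TABLE metrics (post_id INTEGER, impressions INTEGER);
    INSERT INTO posts VALUES (1, 'Robot marathon GIF', 'gif');
    INSERT INTO metrics VALUES (1, 50000);  -- placeholder number
""")

g = nx.MultiDiGraph()
# Each post row becomes a Post node linked to a Format node.
for post_id, title, fmt in conn.execute("SELECT id, title, format FROM posts"):
    g.add_node(f"post:{post_id}", label="Post", title=title)
    g.add_node(f"format:{fmt}", label="Format")
    g.add_edge(f"post:{post_id}", f"format:{fmt}", type="HAS_FORMAT")
# Metric rows attach performance data to the matching Post node.
for post_id, impressions in conn.execute(
        "SELECT post_id, impressions FROM metrics"):
    g.nodes[f"post:{post_id}"]["impressions"] = impressions
```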
In this article, I will cover basic experimental loops and how the math from my last article supports them. These are simple experimental cycles that I will use to explain how to implement an experimental flywheel. Their low complexity makes them the best place to build our new capabilities.
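As a preview, here is a minimal sketch of what one turn of such a loop might look like: record a hypothesis, run a low-complexity experiment, and write the measured result back into the graph so the next diagnosis starts from richer data. The function, node labels, and numbers are illustrative assumptions, not Cici’s implementation.

```python
import networkx as nx


def run_flywheel_turn(g: nx.MultiDiGraph, hypothesis: str,
                      run_experiment, baseline: float) -> bool:
    """One turn of a minimal experimental loop (a sketch, not Cici's code)."""
    g.add_node(hypothesis, label="Hypothesis", status="running")
    observed = run_experiment()   # e.g., impressions from one post
    lift = observed / baseline    # simple effect size vs. the baseline
    g.nodes[hypothesis]["status"] = "supported" if lift > 1.0 else "rejected"
    # Write the outcome back into the graph for the next diagnostic pass.
    g.add_node(f"result:{hypothesis}", label="Result", lift=lift)
    g.add_edge(hypothesis, f"result:{hypothesis}", type="OBSERVED")
    return lift > 1.0


# Usage: the marathon-GIF anecdote framed as one loop iteration,
# with placeholder numbers standing in for real impression counts.
g = nx.MultiDiGraph()
run_flywheel_turn(g, "short posts with funny GIFs outperform long-form",
                  run_experiment=lambda: 50_000, baseline=10_000)
```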


