
From Self-Healing Code to Generative Agents


After writing the previous article, this topic has gained even more popularity, with a series of related papers and applications such as Self-Healing Code, BabyGPT, AutoGPT, OpenGPT, and Generative Agents.

I have to summarize it again...

[toc]

Self-Healing Code#

Just as many admired organisms in nature have the ability to self-heal... even before the era of powerful artificial intelligence, writing code that can heal itself at runtime has always been a pursuit of software developers.

One simple design is graceful degradation. For example, given how frequently OpenAI's services are unstable, the system can call a local LLM as a fallback when it detects that the OpenAI API is down. This is one way of building graceful degradation, similar to Safe Mode in Windows.
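As a concrete illustration, here is a minimal sketch of such a fallback. The `local_llm_complete()` helper is hypothetical, and the OpenAI call uses the pre-1.0 `openai` Python client:

```python
# A minimal sketch of graceful degradation, assuming a hypothetical
# local_llm_complete() backed by a locally hosted model.
import openai  # pre-1.0 openai client style

def local_llm_complete(prompt: str) -> str:
    """Hypothetical local fallback; wire up llama.cpp or similar here."""
    raise NotImplementedError

def complete(prompt: str) -> str:
    try:
        resp = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": prompt}],
        )
        return resp["choices"][0]["message"]["content"]
    except Exception:
        # The OpenAI API is down or erroring: degrade to the local model
        return local_llm_complete(prompt)
```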

Not only in engineering: when evaluating the quality of an algorithm, we adopt much the same criteria... it should need fewer corner cases, handle more situations, have better complexity, require less background knowledge, and be written more elegantly...

(There may be many algorithms that win on one or more of the criteria above, but it seems hard to find one that is strictly dominated by a superior alternative on all of them (although that is common in greedy construction problems)... Maybe Shell sort is an example?... Anyway, I think it is only interesting to analyze... no one should use it in practice...)

But of course, this is far from enough. As a loser in algorithm competitions, the most interesting capability of ChatGPT to me was that it could solve all kinds of algorithm problems. For this reason, I ran many tests on platforms with short problem statements, such as AtCoder and Project Euler.

But occasionally, the code generated by ChatGPT fails to compile... However, with a little guidance, or even just being told that the code did not compile (without being told where the error occurred), ChatGPT can often still finish debugging it (much like rubber duck debugging...).

(But aren't we humans the same? Everyone adds some notes to their ACM templates, and the last line reads "think twice before submitting"... The principle is the same: no new information is provided... yet accuracy still improves...)

Still, when people first saw this tweet, everyone was shocked that LLMs could automatically debug business-level code! And with the emergence of various advanced prompting techniques, using external tools has become very easy, so building an Agent with self-healing code is no longer a secret...


  • In particular, we demonstrate that Self-Debugging can teach large language models to perform rubber duck debugging. That is, without any feedback on code correctness or error messages, the model can identify its mistakes by explaining its generated code in natural language.

Although rubber duck debugging is great, having the compiler's help is obviously better. So there is a very brute-force approach to Self-Debugging: generate a piece of code, hand it to the compiler, and debug based on the feedback until there are no errors left. In fact, this is how we humans debug in the first place.
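A sketch of this loop, assuming a `chat(prompt) -> str` helper that keeps conversation state (this is my illustration, not the paper's exact setup):

```python
# Compiler-in-the-loop Self-Debugging: generate code, compile it, feed
# the errors back, repeat. chat() is an assumed stateful LLM helper.
import os
import subprocess
import tempfile
from typing import Optional

def compile_cpp(source: str) -> Optional[str]:
    """Return compiler errors as a string, or None if compilation succeeds."""
    with tempfile.NamedTemporaryFile("w", suffix=".cpp", delete=False) as f:
        f.write(source)
        path = f.name
    result = subprocess.run(
        ["g++", "-std=c++17", path, "-o", path + ".out"],
        capture_output=True, text=True,
    )
    os.unlink(path)
    return None if result.returncode == 0 else result.stderr

def self_debug(task: str, chat, max_rounds: int = 5) -> str:
    code = chat(f"Write a C++ program that solves:\n{task}")
    for _ in range(max_rounds):
        errors = compile_cpp(code)
        if errors is None:
            return code  # compiles cleanly; run tests next if available
        code = chat(f"The compiler reported:\n{errors}\nPlease fix the code.")
    return code
```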

Combined with CI/CD, we can easily build a GitHub Bot that helps a repository debug itself automatically during the deployment phase.
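A deployment step could look roughly like the sketch below; the branch name, commit message, and `chat()` helper are all made up for illustration:

```python
# A hedged sketch of a CI self-debugging step: run the test suite and,
# on failure, ask an LLM for a patch and push it to a bot branch.
import subprocess

def run_tests():
    """Return None if the suite passes, otherwise the failure output."""
    r = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
    return None if r.returncode == 0 else r.stdout + r.stderr

def ci_self_heal(chat) -> None:
    failures = run_tests()
    if failures is None:
        return  # green build, nothing to do
    patch = chat(f"These tests failed:\n{failures}\nReply with a unified diff.")
    subprocess.run(["git", "checkout", "-b", "bot/self-heal"], check=True)
    subprocess.run(["git", "apply", "-"], input=patch, text=True, check=True)
    subprocess.run(["git", "commit", "-am", "bot: attempt self-heal"], check=True)
    subprocess.run(["git", "push", "origin", "bot/self-heal"], check=True)
```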

Now that we have achieved Self-Debugging and Self-Debugging with CI/CD, it seems that as long as we run the same loop at runtime, we can achieve true Self-Healing Code!

(But I haven't found a more suitable example yet...)
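Still, as a toy illustration of the idea (definitely not a production pattern; the `chat()` helper is assumed, and running `exec()` on model output is dangerous):

```python
# A toy sketch of runtime self-healing: when a function raises, ask an
# LLM for a fixed version and hot-swap it. Demo only.
import inspect
import traceback

def self_healing(chat):
    def wrap(fn):
        def inner(*args, **kwargs):
            try:
                return fn(*args, **kwargs)
            except Exception:
                source = inspect.getsource(fn)
                fixed = chat(
                    f"This function crashed:\n{source}\n"
                    f"Traceback:\n{traceback.format_exc()}\n"
                    "Reply with a corrected definition only."
                )
                scope = {}
                exec(fixed, scope)  # trust-the-model hazard, demo only
                return scope[fn.__name__](*args, **kwargs)
        return inner
    return wrap
```

Usage would be decorating a fragile function with `@self_healing(chat)`, so a crash triggers one round of regeneration instead of taking the process down.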

Autonomous Agents#

As mentioned in the previous article, after AI starts using tools, the next milestone is autonomy. This may be the first step for AI to break free from carbon-based life and move toward freedom... The word "autonomous" should be familiar to everyone: in the previous cycle, what interested me most were the composability of DeFi and the autonomy of DAOs (Decentralized Autonomous Organizations).

However, in this cycle, composability has already been taken over by ChatGPT (even Uniswap has joined in with its wallet), and now even autonomy is being taken over as well...

And all we need to do is follow almost the same approach as the Self-Debugging above, combine it with more external tools, and let the Agent hold continuous multi-turn conversations on its own, and we get Autonomous Agents. Currently, AutoGPT seems to have the greatest influence.
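The core loop can be sketched in a few lines; the JSON schema, tool names, and `chat()` helper here are my illustration, not AutoGPT's actual interface:

```python
# A minimal sketch of an AutoGPT-style autonomy loop: the model replies
# in strict JSON, the loop executes the chosen tool, and the result is
# fed back as the next observation.
import json

TOOLS = {
    "search": lambda q: f"(stubbed search results for {q!r})",
}

def agent_loop(goal: str, chat, max_steps: int = 10) -> None:
    observation = f"Your goal: {goal}. Reply in strict JSON."
    for _ in range(max_steps):
        reply = json.loads(chat(observation))
        if reply["tool"] == "finish":
            print(reply["args"][0])  # final answer
            break
        result = TOOLS[reply["tool"]](*reply["args"])
        observation = f"Tool returned: {result}"  # feed the result back
```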

Considering that the cost of using this thing is currently high, we can explore more examples shared by others.

However, after reading their code, it turns out the core is still the same as HuggingGPT's: a hard-coded toolkit, prompt templates, and the advanced-prompt-engineering trick of demanding that "the output must be in a strict JSON format"...
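The flavor of prompt these projects rely on looks roughly like this (paraphrased for illustration, not any project's literal prompt):

```python
# A paraphrased example of the hard-coded prompt-template style; not
# AutoGPT's or HuggingGPT's literal prompt.
SYSTEM_PROMPT = """\
You are an autonomous agent. You may use these tools:
- search(query): search the web
- finish(answer): end the task with a final answer

Respond with ONLY a JSON object, no prose:
{"thought": "<your reasoning>", "tool": "<tool name>", "args": ["<arg>", ...]}
"""
```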

So here we have a new open problem: can we use an LLM's emergent capabilities at runtime to let the Agent discover and learn to use new tools?

Generative Agents#

Since AIs can use tools and call each other, and even run and iterate in an unsupervised environment, another interesting research direction has emerged: letting AIs communicate with each other and solve harder problems through collaboration, perhaps even achieving something like human communities or the socialization of certain animals. However, since the real environment is too complex, it might be better to run such experiments in sandbox environments?
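As a toy sketch, two agents talking to each other in such a sandbox can be as simple as the loop below; the `chat_as(role, message)` helper and the role names are assumptions:

```python
# Two role-playing agents conversing in a sandbox. chat_as(role, message)
# is an assumed helper that keeps a separate conversation per role.
def dialogue(chat_as, rounds: int = 4) -> None:
    message = "Let's design a simple task scheduler together."
    for i in range(rounds):
        speaker = "engineer" if i % 2 == 0 else "reviewer"
        message = chat_as(speaker, message)  # each reply feeds the other agent
        print(f"{speaker}: {message}")
```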

ChatArena#

CAMEL#

TBD
