I completely agree. This is why a large part of alignment research should be, and sometimes is, predicting new capabilities work.
Aligning AGI successfully requires having a practical alignment approach that applies to the type of AGI that people actually develop. Coming up with new types of AGI that are easier to align is pointless unless you can convince people to actually develop that type of AGI fast enough to make it relevant, or convince everyone to not develop AGI we don’t know how to align. So far, people haven’t succeeded at doing that sort of convincing, and I haven’t even really seen alignment people try to do that practical work.
This is why I’ve switched my focus to language model agents like AutoGPT. They seem like the most likely path to AGI at this point.
AutoGPT is an interesting bare-bones version, but I think the strongest agents will likely be more complex than that.
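To make "bare-bones" concrete, here is a minimal sketch of the kind of loop AutoGPT-style agents run: the model proposes an action, a harness executes it, and the observation is fed back into the context. All names here are illustrative, and `call_model` is a stub standing in for a real LLM API call; a real agent would also have memory, planning, and tool-selection layers on top of this.

```python
def call_model(context):
    # Stub standing in for a real LLM API call. This toy policy just
    # acts once and then finishes as soon as it has seen an observation.
    if any(line.startswith("observation:") for line in context):
        return "finish: done"
    return "act: look around"

def run_tool(action):
    # Stub tool executor; a real agent would dispatch to search,
    # code execution, file access, etc.
    return f"observation: result of '{action}'"

def agent_loop(goal, max_steps=5):
    # The core agent loop: model proposes, harness executes,
    # observation is appended to the growing context.
    context = [f"goal: {goal}"]
    for _ in range(max_steps):
        decision = call_model(context)
        if decision.startswith("finish:"):
            return decision, context
        context.append(decision)
        context.append(run_tool(decision))
    return "finish: step limit", context

result, trace = agent_loop("test the loop")
```

The more complex architectures discussed below mostly replace the stubs: richer tool dispatch, external memory instead of a flat context list, and a planner that decides which subgoal the loop is currently serving.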
Palantir has developed techniques for letting models interact with the rest of an enterprise's data. Understanding how models interact with what Palantir calls an ontology is likely valuable.
That post I linked includes a bunch of thinking about how language model agents will be built into more complex cognitive architectures, and thereby made more capable.