ARC works on theoretical alignment, but I think it’s reasonably clear how ARC’s work fits in with the current LLM paradigm (just apply our methods to LLMs!). It’s just an ambitious foundational project that has a very different risk/reward profile from more incremental work. More incremental work naturally looks more and more exciting over time as AI improves (since the real-world problems get closer and closer to your long-run concerns).
There was an assumption 5 years ago that AGI requires fundamental insights into intelligence.
I think a lot of people who have been working on theoretical alignment have long believed that LLMs could scale to AGI. Here’s me from 10 years ago:
It now seems possible that we could build “prosaic” AGI, which can replicate human behavior but doesn’t involve qualitatively new ideas about “how intelligence works:”
It’s plausible that a large neural network can replicate “fast” human cognition, and that by coupling it to simple computational mechanisms — short and long-term memory, attention, etc. — we could obtain a human-level computational architecture.
It’s plausible that a variant of RL can train this architecture to actually implement human-level cognition. This would likely involve some combination of ingredients like model-based RL, imitation learning, or hierarchical RL.
[...]
I think that prosaic AGI should probably be the largest focus of current research on alignment.
I believe that all the issues you raised have been fixed; let us know if that’s not the case.