Right, and history sides with Paul. The earliest steam engines were missing key insights and so operated slowly, used their energy very inefficiently, and were limited in what they could do. The first steam engines were used as pumps, and it took a while before they were powerful enough to even move their own weight (locomotion). Each successive invention, from Savery to Newcomen to Watt, dramatically improved the efficiency of the engine, and over time engines could do more and more things, from pumping to locomotion to machining to flight. It wasn’t just one sudden innovation and now we have an engine that can do all the things, including even lifting itself against the pull of Earth’s gravity. It took time, and progress on smooth metrics, before we had extremely powerful and useful engines that powered the industrial revolution. That’s why the industrial revolution(s) took hundreds of years. It wasn’t one sudden insight that made it all click.
I think Gary Marcus wanted AI research to uncover lots of interesting rules like “in English, you make verbs past tense by adding -ed, except …” because he wants to know what the rules are, and because engineering following psycholinguistic research is much more appealing to him than the other way around. Machine learning (without interpretability) doesn’t give us any tools to learn what the rules are.
I think you’re 100% right. Most (>>80%) of the bets I see on Long Bets, or predictions on Metaculus, are underspecified to the point where a human mediator would have to make a judgement call that could be considered unfair to someone. I don’t expect that to change no matter how much work I do, unless I make bets on specific statistics from well known sources, e.g. the stock market, or the CIA World Factbook.
There are possible futures where prediction (3) is obvious. For example, if someone predicted that 50% of trips would be self driving in 2021 (many people did predict that 5 years ago), we can easily prove them wrong without having to debate whether Tesla is L2 or L5 and whether that matters. Teslas are not 50% of the cars on the road, nor are Waymos, so you can easily see that most trips in 2021 are not self driving by any definition. I think there are also future worlds where 95% of cars and trips are L5, most cars can legally autonomously drive anywhere without any humans inside, etc, and in that world there isn’t much to debate about unless you’re really petty. So we could make bets hoping that things will be that obvious, but I don’t think either of us wants to do the work to avoid this kind of ambiguity.
I’m happy to consider my bets as paid in Bayes points without any need for future adjudication. So, for all the Bayes points, I’d love to hear what your equivalent predictions are for 2026.
For what it’s worth, here’s my revised (3): Greater than 10% of cars on the road will either be legally capable of L4/L5 driving, or be legally L2/L3 with disengagements uncommon, less than once in a typical trip. (Meaning, if you watch a video from the AI DRIVR YouTube channel, there’s less than one disengagement per 20 minutes of driving time.)
Finally a definition of The Singularity that actually involves a mathematical singularity! Thank you.
Apologies for my ignorance, does EA mean Effective Altruist?
It seems to me that Eliezer’s model of AGI is a bit like an engine, where if any important part is missing, the entire engine doesn’t move. You can move a broken steam locomotive as fast as you can push it, maybe 1km/h. The moment you insert the missing part, the steam locomotive accelerates up to 100km/h. Paul is asking “when does the locomotive move at 20km/h” and Eliezer says “when the locomotive is already at full steam and accelerating to 100km/h.” There’s no point where the locomotive is moving at 20km/h and not accelerating, because humans can’t push it that fast, and once the engine is working, it’s already accelerating to a much faster speed.
In Paul’s model, there IS such a thing as 95% AGI, and it’s 80% or 20% or 2% as powerful on some metric we can measure, whereas in Eliezer’s model there’s no such thing as 95% AGI. The 95% AGI is like a steam engine that’s missing its pistons, or some critical valve, and so it doesn’t provide any motive power at all. It can move as fast as humans can push it, but it doesn’t provide any power of its own.
Excuse my ignorance, what does a hyperbolic function look like? If an exponential is f(x) = r^x, what is f(x) for a hyperbolic function?
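(A sketch of the contrast I’m asking about, assuming the common convention that “hyperbolic growth” means something like f(x) = C / (T − x), which blows up at the finite point x = T, whereas an exponential r^x stays finite for every finite x:)

```python
# Exponential vs. hyperbolic growth. The hyperbolic form below is one
# common convention (an assumption on my part, not from this thread):
# f(x) = C / (T - x), which diverges at the finite "singularity" x = T.

def exponential(x, r=2.0):
    return r ** x

def hyperbolic(x, C=1.0, T=10.0):
    # Grows without bound as x approaches T from below.
    return C / (T - x)

for x in [0, 5, 9, 9.9, 9.99]:
    print(f"x={x:5}: exp={exponential(x):12.2f}  hyp={hyperbolic(x):10.2f}")
```

The qualitative difference: the exponential gets arbitrarily large only as x goes to infinity, while the hyperbolic function reaches infinity at a finite time.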
6 and 7 are definitely non-predictions, or a prediction that nothing interesting will happen. 1, 2, 4 and 5 are softly almost true today:
(1) AI Programming—I heard a rumor (don’t have a source on this) that something like 30% of new GitHub commits involve Co-Pilot. I can’t imagine that’s really true, it seems so implausible, but my prediction can come true if AI code completion becomes very popular.
(2) Household Robots—Every year for the last decade or so some company has demoed some kind of home robot at an electronics convention, but if any of them have actually made it to market, the penetration is very small. Eventually someone will make one that’s useful enough to sell a few hundred or more units. I don’t think a Roomba should qualify as meeting my prediction, which is why I specified a “humanoid” robot.
(3) Self Driving—I stand by what I said, nothing to expand on. I believe that Tesla and Waymo, at least, already have self driving tech good enough, so this is mostly about institutional acceptance.
(4) DRL learning games from pixels—EfficientZero essentially already does this, but restricted to the 57 Atari games. My prediction is that there will be an EfficientZero for all video games.
(5) Turing Test—I think that the Turing test is largely a matter of how long the computer can fool the judge for, in addition to the judge knowing what to look for. Systems from the 70s could probably fool a judge for about 30 seconds. Modern chatbots might be able to fool a competent judge for 10 minutes, and an incompetent judge (naive casual user) for a couple hours at the extreme. I think by 2026 chatbots will be able to fool competent judges for at least 30 minutes, and will be entertaining to naive casual users indefinitely (i.e., people can make friends with their chatbots and it doesn’t get boring quickly if ever.)
For 6 and 7, I’m going to make concrete predictions.
(6) Some research institute or financial publication of repute will claim that AI technology (not computers generally, just AI) will have “added X Trillion Dollars” to the US or world economy, where X is at least 0.5% of US GDP or GWP, respectively. Whether this is actually true might be controversial, but someone will have made the claim. GWP will not be significantly above trendline.
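(To make the threshold in (6) concrete, here’s a back-of-envelope calculation; the GDP/GWP figures are my rough 2019 approximations, not numbers anyone in this thread committed to:)

```python
# Rough threshold for the "AI added X trillion dollars" claim in (6).
# The baseline figures are approximate 2019 values (my assumption):
US_GDP_TRILLIONS = 21.4   # approx. 2019 US GDP
GWP_TRILLIONS = 87.0      # approx. 2019 gross world product

def claim_threshold(baseline_trillions, share=0.005):
    """Minimum X (in trillions) for the claim to qualify at 0.5%."""
    return baseline_trillions * share

print(f"US threshold:    ~${claim_threshold(US_GDP_TRILLIONS):.3f}T")
print(f"World threshold: ~${claim_threshold(GWP_TRILLIONS):.3f}T")
```

So the claim only needs to be on the order of $0.1T (US) or $0.4T (world) to count, which is why I expect someone to make it even if GWP stays on trend.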
(7) At least two job titles will have less than 50% of the number of workers they had in 2019. The most likely jobs to have been affected are drivers, cashiers, fast food workers, laundry workers, call center workers, factory workers, workers in petroleum-related industries*, QA engineers, and computer programmers. These jobs might shift within the industry such that the number of people working in industry X is similar, but there has to be a job title that shrank by 50%. For example, the same X million people still work in customer service, but most of them are doing something like prompt engineering on AI chatbots, as opposed to taking phone calls directly.
* This one has nothing to do with AI, but I expect it to happen by 2026 nonetheless.

Let me know if you want to formalize a bet on some website.
I would say I agree more with Christiano.
At least 50% of programming work that would have been done by a human programmer in 2019 will be done by systems like Codex or Co-Pilot.
Humanoid robotic maids, butlers and companions will be for sale in some form, although they will be limited and underwhelming, and few people will have them in their homes.
Self driving will finally be practical and applied widely. In the USA, between 10 and 70% of automobile trips will be autonomous or in self driving mode. Humans will not be banned from driving anywhere in the world, that’s more of a 2030s+ thing.
AI will beat human grandmasters at nearly every video game or formal game. There might be 1-5 games which AI still struggles with, and they will be notable exceptions. Or there might be 0 such games. RL systems can learn most games from pixels in less than a GPU-day (using 2026-era GPUs consuming less than 1000 watts and costing less than $4,000 in inflation-adjusted 2019 USD). RL research will be focused on beating humans in sports and physical games like soccer, basketball, golf, etc.
Chatbots will regularly pass Turing tests, although it will remain controversial whether that means anything. Publicly available chatbots will be about as good as GPT-3 in grammar and competence, but unlike GPT-3 they will have consistent personalities and memory over time—i.e., the limitations of the 2048 token window will be overcome somehow. Good chatbots will be available to the public, and will be ubiquitous in customer service, but whether they are popular as companions or personal assistants will depend on public acceptance. This is the same problem faced by AR: the tech will definitely be there, but the public might not be interested and might be somewhat hostile.
I personally am not sure if GWP growth will be significantly above historical baselines. I think AI will have progressed significantly, but we also know that, even going back to the 90s, information technology has made an underwhelming impact on productivity. The world economy is such a weird mess right now for reasons that have nothing to do with AI, so it’s hard to make predictions.
There won’t be significant unemployment due to technology (yet), but some careers will be significantly altered, including drivers and programmers.
I consider these predictions to be pretty conservative. I would not be surprised to be surprised by AI progress, but I would be very disappointed if we didn’t meet 5/7 of my predictions.
It seems to me that this task has an unclear goal. Imagine I linked you a github repo and said “this is a 100% accurate and working simulation of the worm.” How would you verify that? If we had a WBE of Ray Kurzweil, we could at least say “this emulated brain does/doesn’t produce speech that resembles Kurzweil’s speech.” What can you say about the emulated worm? Does it wiggle in some recognizable way? Does it move towards the scent of food?
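(To illustrate what even one such behavioral check might look like: a toy chemotaxis test. Everything here, the trajectory format, the food position, the scoring rule, is invented for illustration; no real worm-emulation repo or API is assumed.)

```python
import math

def chemotaxis_score(positions, food=(0.0, 0.0)):
    """Fraction of timesteps on which the simulated worm moved closer
    to the food source. A score well above 0.5 would be (weak) evidence
    of chemotaxis-like behavior; it says nothing about accuracy beyond that."""
    closer = 0
    for (x0, y0), (x1, y1) in zip(positions, positions[1:]):
        d0 = math.hypot(x0 - food[0], y0 - food[1])
        d1 = math.hypot(x1 - food[0], y1 - food[1])
        if d1 < d0:
            closer += 1
    return closer / (len(positions) - 1)

# Toy trajectory standing in for emulator output: a worm drifting toward food.
trajectory = [(10.0 - 0.5 * t, 5.0 - 0.25 * t) for t in range(20)]
print(chemotaxis_score(trajectory))
```

Even passing a battery of checks like this only shows the emulation reproduces coarse behaviors, which is exactly the problem: there’s no agreed-upon behavioral benchmark that would settle “100% accurate.”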