Studying for a BTech+MTech at IIT Delhi. Check out my comment (and post) history for topics of interest. Please weight recent comments over older ones, as my views have been updating frequently.
Discovered EA in September 2021; was involved in cryptocurrency before that; graduating in 2023.
profile: samueldashadrach.github.io/EA/
Last updated: June 2022
I’m not sure I follow.
Just to clarify, do you mean “most AI systems that don’t initially have coherent preferences, will eventually self-modify / evolve / follow some other process and become agents with coherent preferences”?
If yes, I would be keen to see some evidence or arguments for that.
Because in my ontology, things like “without worrying about goodharting” and “most efficient way of handling that is with strong optimization …” come after you have coherent preferences, not before.