If someone who’s very good at math wants to do some agent foundations work to directly tackle the hard part of alignment, what should they do?
If they’re talented, look for a way to search over search processes without incurring the unbounded loss that would result by default.
If they’re educated, skim the existing MIRI work and see if any results can be stolen from their own field.
I currently think we’re mostly interested in properties that hold at all timesteps, or at least “quickly”, and also in the limit, rather than properties that hold only in the limit. I also think that, in this case, it may be easier to get a limit at all by first showing quickness, but I’m not at all sure of that.
The actual hard parts? Math probably doesn’t help much directly, unfortunately. Mathematical thinking is good. You’ll have to learn how to think in novel ways, so there’s not even a vector anyone can point you in, except for pointers with a whole lot of “dereference not included” like “figure out how to understand the fundamental forces involved in what actually determines what a mind ends up trying to do long term” (https://tsvibt.blogspot.com/2023/04/fundamental-question-what-determines.html).
Some of the problems: https://tsvibt.blogspot.com/2023/03/the-fraught-voyage-of-aligned-novelty.html
A meta-philosophy discussion of what might work: https://tsvibt.blogspot.com/2023/09/a-hermeneutic-net-for-agency.html
If you are capable of meaningfully pushing capabilities forward and are instead doing literally anything else, that’s already pretty helpful.