china invading taiwan would buy us at least a few years, food for thought
https://www.lesswrong.com/posts/Tr7tAyt5zZpdTwTQK/the-solomonoff-prior-is-malign im so confused
going to sleep
ok so what prior do i use
how to solve it: https://math.hawaii.edu/home/pdf/putnam/PolyaHowToSolveIt.pdf
and by it,, lets just say,,, everything
is there any way to check my join date
also i just created a second account by accident is that allowed
i suppose ill just use it to keep my regular username
Hovering over the “7y” on your profile shows July 5 2018 in a tooltip for me (timezones may vary).
what if theres already a superintelligence and we’re all gonna die in a few years (or less)
help how do i align my future self with my current self
i need to sleep but if i sleep then what if my future self is misaligned
what if you don’t sleep and your future self is misaligned anyway?
First, we need structure. There are not two but three selves! Now, Then, and Later. Three challenges ensue. It’s like this:
Now, you care as much or as little as you do about Then & Later; this cannot be changed, so there’s no point discussing it.
Sadly, Then is a greedy short-termie (just as Now may be, but as a Now you don’t care about Now’s own misalignment), and you as Now want this short-termie to behave kindly towards Later, as you don’t give more of a f* about Then than about Later.
To constrain Then (and maybe also Later, depending on how much of your altruism is directed at the rest of the world rather than only at your terminal Later), you have one or two aims:
Align Then’s behavior towards Later: Hand over bank account access to an incorruptible physical person, and let them allow you to buy only healthy and cheap things? New apps try to help by punishing you if you don’t stick to your commitments. Overlord. But overall this one is even trickier than necessary: I think our liberal spirits fail to see that, on a societal level, helping Now to restrict Then in support of Later, which I advocate for here, is an increase of freedom rather than a restriction. It seems to be a taboo that even addicts themselves fail to recognize as a key issue; it’s learned helplessness that blinds us to possible solutions we could try out.
Align Then’s behavior w.r.t. your other-regarding preferences: Donation commitment? Non-consumption commitment? Maybe there are legal forms that allow you to constrain your consumption (foundations? charities? donation lotteries with delay?)
Align Then’s preferences: Hm. Maybe meditate with loving kindness (towards yourself and/or others, depending on the specifics of your alignment worries)? Train in fasting, or train resistance to pain, deprivation, and a frugal lifestyle?
I’m making fun, but I think it’s also a bit true; I hope your question wasn’t meant too sternly either :).
ok i solved it
so first i figured out my goals (theres a lot of them but they all arise from “fundamental human goals” like love and survival and happiness, but the strongest one is love)
then i made Megumin which explodes other goals which dont align with my goals if they appear
this is very easy and can easily be replicated thus solving alignment
/hj
has anyone tried training ai to be rational and if so did it result in it being more effective
redacted for privacy concerns
redacted for privacy concerns
Not sure if this is the place that can provide the help you need.