Is that purely because they think AI-driven extinction is almost certain, or is it a combination of that and “even if we survive we probably won’t need retirement money anyway”?
Let’s hope that continues
Are you at all worried that Claude Mythos being accidentally trained against CoT will corrupt future Claude models? Furthermore, I don’t understand how we can get reliable CoT monitoring if CoT is included in a model’s training data; won’t the issue just continue to manifest in different ways?
But wait, wouldn’t doing things like saving for retirement still make sense? Or is p(we all die) really that high?
Thanks for clarifying. I thought it might be something like that but wasn’t sure.
“or that you shouldn’t decide how much to invest in impact based on the overall survival probability (I’ve been playing a lot of video games)”
I don’t really understand what the video games comment has to do with what was preceding it.
“If you get my argument, can you steelman it?”
I get that your argument is essentially as follows:
1.) Solving the problem of what values to put into an AI, even granting that the other technical issues are solved, is impossibly difficult in real life.
2.) To demonstrate that impossible difficulty, here’s a much kinder version of reality where the problem remains impossible.
I don’t think you accomplished 2, and it requires me to already accept that 1 is true, which I think it probably isn’t; I also think most would agree with me on this point, at least in principle.
“Which of these four things do you disagree with?”
I don’t disagree with any of them. I doubt there’s a convincing argument that could get me to disagree with any of those as presented.
What I am not convinced of is that, given all those assumptions being true, certain doom necessarily follows, or that there is no humanly tractable scheme which avoids doom in whatever time we have left.
I’m not clever enough to figure out what the solution is, mind you, nor am I especially confident that someone else will. Please don’t confuse me for someone who doesn’t often worry about these things.
“I think everyone sane agrees that we’re doomed and soon.”
Even as a doomer among doomers, you, with respect, come off as a rambling madman.
The problem is that the claim you’re making, namely that alignment is so doomed that Eliezer Yudkowsky, one of the most pessimistic voices among alignment people if not the most pessimistic, is still somehow overoptimistic about humanity’s prospects, is unsubstantiated.
It’s a claim, I think, that deserves some substantiation. Maybe you believe you’ve already provided as much. I disagree.
I’m guessing you’re operating on strong intuition here; and you know what, great, share your model of the world! But you apparently made this post with the intention to persuade, and I’m telling you you’ve done a poor job.
EDIT: To be clear, even if I were somehow granted vivid knowledge of the future through precognition, you’d still seem crazy to me at this point.
“I’m just trying to destroy the last tiny shreds of hope.”
In what version of reality do you think anyone has hope for an AI alignment Groundhog Day?
I’m sorry if I’m misunderstanding, but is your claim that Yudkowsky’s model actually does tell us with certainty, or some extremely close approximation of certainty, what’s going to happen?
As I was reading this, I remembered that we had a conversation about your timelines about a year ago, I think. If I recall correctly they were already short (~50% before 2030?). Have they dropped further since then?
I accept that trying to figure out the overall tractability of the problem this far in advance isn’t a useful thing to dedicate resources to. Nevertheless, researchers seem to have expectations about alignment difficulty despite not having a “clearer picture”. For the researchers who think that alignment is probably tractable, I would love to hear why they think so.
To be clear, I’m talking about researchers who are worried about AI x-risk but aren’t doomers. I would like to gain more insight into what they are hoping for, and why their expectations are reasonable.
This comment got me to change the wording of the question slightly. “so many” was changed to “most”.
You answered the question in good faith, which I’m thankful for, but I don’t feel your answer engaged satisfactorily with the content of the post. I was asking about the set of researchers who think alignment, at least in principle, is probably not hopeless, whom I suspect to be the majority. If I failed to communicate that, I’d definitely appreciate advice on how to make my question clearer.
Nevertheless, I do agree with everything you’re saying, though we may be thinking of different things when we use the word “many”.
[Question] Can someone explain to me why most researchers think alignment is probably humanly tractable?
And then there’s me, who was so certain until now that any time people talk about x-risk they mean it to be synonymous with extinction. It does make me curious, though: what kind of scenarios are you imagining in which misalignment doesn’t kill everyone? Do more people place a higher credence on s-risk than I originally suspected?
Thank you! I think I understand this position a good deal more now.
“the presence of which I take the OP to describe as reassuring”
I get the sense from this, and from the rest of your comment, that you think we should in fact not find this even mildly reassuring. I’m not going to argue with such a claim, because I don’t think such an effort on my part would be very useful to anyone. However, if I’m not completely off base or overstating your position (which I totally could be), could you go into some more detail as to why you think we shouldn’t find their presence reassuring at all?
I never meant to claim that my position was “clever people don’t seem worried, so I shouldn’t be”. If that’s what you got from me, then that’s my mistake. I’m incredibly worried, as a matter of fact, and much more importantly, everyone I mentioned is also worried to some extent or another, as you already pointed out. What I meant to say, but failed to, was that there’s enough disagreement in these circles that near-absolute confidence in doom seems to be jumping the gun. That argument also very much holds against people who are so certain that everything will go just fine.
I guess most of my disagreement comes from 4. Or rather, the implication that having an exact formal specification of human values ready to be encoded is necessarily the only way that things could possibly go well. I already tried to verbalize as much earlier, but maybe I didn’t do a good job of that either.
I apologize for my ignorance, but are these things what people are actually trying in their own ways? Or are they really trying the thing that seems much, much crazier to me?
This is becoming less and less about the actual OP, but I really do still want to ask: do you think it is a near-certainty, though? (I mean a >99% chance of AI killing us all soon.)