I realized that, but I think my counterarguments are true for most organizations that would have a realistic chance of building takeover-level AI. As a case in point, Anthropic launched the Glasswing project to fix vulnerabilities rather than saying “great, now we can hack into banks for our takeover attempt planned in Q3 2027, let’s not tell anyone about these zero-days”.
In your scenario, DARPA would probably not try to take over using AI. It’s not their culture. When they realized their AI had achieved a DSA, they’d likely voluntarily hand control over to the US government.
Also, DARPA, or any other AI project, does not operate in a vacuum. People in government would likely realize DARPA is on its way to a DSA and intervene to take control before it happens.
We already have a situation where a democracy holds a powerful weapon, namely the atomic bomb. I don’t really see how this will necessarily be different. Oppenheimer didn’t take over either.
As a non-American, however, I’m worried that this will decrease power in non-American territories even more (higher permanence and granularity of US sovereignty). Remember that >95% of the world population is not American. All these people would have no say over whatever happens afterwards.
Xrisk counterargument: intelligence needs society to become powerful. Society will only lend itself to society-aligned intelligence.
This is less relevant for lab or govt takeover scenarios where there are some humans cooperating. If the humans are very bad and the takeover permanent, that’s existential too. Most humans are probably not that bad though.
don’t think like us
Just want to flag that not everyone on LessWrong is libertarian or right-wing. Left xriskers are a minority, but we exist.
the appetite for conditional risk regulation has been substantially less than the appetite for direct risk regulation
Where do you see the latter appetite?
We campaigned a bit for a conditional treaty. We’d happily sign up for an unconditional pause though. Problem is: there is no appetite for either, right?
I agree that the manpower spent on evals should have been spent on other things with a better theory of change. Eval quality, imo, is not a crux for regulation; awareness and political support are. I think the money that went to evals should have gone to raising awareness and lobbying.
Honestly, why is there still no significant funding for awareness-raising projects? It’s so easy: just ask for the number of views/copies and conversion rates, measured via e.g. Prolific surveys, and fund the most effective projects. A fund like this can easily absorb millions. I think this might actually get regulation off the ground.
UN control. A Baruch plan for AGI.
Good question. First, many of these benchmarks are about things that are dangerous, but not particularly economically valuable (example: bioweapons). My model of labs is that they’re mostly trying to do economically valuable things (as most companies are forced to). Although they may be reckless, I don’t currently have the impression they’re actively trying to seize power using things such as bioweapons. Second, some benchmarks are about things that are economically valuable, for example the METR one. But these mostly get benchmaxxed already. Third, we are not creating new benchmarks, but only tracking the scores for existing ones and coupling them to existential threat models. If labs wanted to benchmax any benchmark we track, they could do so with or without our work.
In addition: this risk needs to be weighed against the positive effect of improving the information position of researchers, policymakers, and the public. How much this matters depends on how well we do, but also on your theory of change and to what extent informing these groups is part of it. In my case, I strongly believe in awareness as a key path towards solving the problem. I think this is true for researcher, policymaker, and public awareness of the right threat model. I’m particularly excited about this graph, showing that public xrisk awareness has already increased from about 8% to 24%. If our combined efforts could increase this to a tipping point, I’d be mostly optimistic that we can implement and enforce a global AI safety treaty (such as we proposed in TIME and SCMP) and that this will reduce xrisk significantly. For these reasons, my bet is that the result is positive. But it’s not obvious, so again, good question.
In addition: I think benchmarks that are mainly about something economically relevant, or, god forbid, scientifically relevant, are way more likely to get benchmaxxed and lead straight to a takeover too, while not really having a strong case for reducing xrisk. Such benchmarks are routinely created and funded by xrisky orgs.
We appreciate your nitpicks! I’ve added issues on GitHub.
Agree. Guess granularity will be a function of AI power.
Releasing TakeOverBench.com: a benchmark for AI takeover
I see the loss of control argument. Regarding the argument of being dominated: to what extent is being dominated by a superpower with ASI different from being dominated by a superpower with nuclear weapons, conventional military dominance, and economic dominance, which is the current situation for many middle powers? I can imagine that post-ASI, control granularity might be higher, and permanence might be higher. How important will these differences be?
Interesting, yeah, I tend to agree. Doesn’t really change the argument though, right? One could make the same argument for an AI that’s further out in space and mobilizes sufficient resources to create a DSA.
Thank you for writing the post, interesting to think about.
Suppose an AI has a perfect world model, but no “I”, that is, no indexical information. Then a bad actor comes along and asks the AI “please take over the world for me”. With its guardrails removed (which is routinely done for open-source models), the AI complies.
Its takeover actions will look exactly like those of a rogue AI. The only difference is that the rogue part doesn’t stem from the AI itself, but from the bad actor. For everyone except the bad actor, though, the result looks exactly the same. The AI, using its perfect world model and other dangerous capabilities, takes over the world and, if the bad actor chooses so, kills everyone.
This is fairly close to my central threat model. I don’t care much whether the adverse action comes from a self-aware AI or a bad actor; I care about the world being taken over. For this threat model, I would have to conclude that removing indexical information from a model does not make it much safer. In addition, someone, somewhere, will probably add back the indexical information that was carefully removed.
I think this is philosophically interesting, but as long as open-source models keep being released, we should assume maximally adversarial ones, and focus mostly on regulation (hardware control) to reduce takeover risk.
This same argument, imo, applies to other alignment work, including mechinterp, and to control work.
A positive offense-defense balance for takeover threat models could persuade me to think otherwise (I currently think it’s probably negative).
I agree with these two points, but I doubt either will have significant impact.
Human extinction is seen as extremely unlikely, almost absurd. While there are obviously many other public concerns, other significant concerns about human extinction are very rare.
We also asked for people’s extinction probability, but then many would just give a number (sometimes high), even if they didn’t see AI as an existential risk at all. Still, trends in both methodologies were usually similar.
I’m open to better methodologies, but I think this is a fair way of assessing public xrisk awareness, and a better way than asking explicit probabilities.
24% of the US public is now aware of AI xrisk
I don’t think your first point is obvious. We’ve had super smart humans (e.g. with IQ >200) and they haven’t been able to take over the world. (Although they didn’t have many of the advantages an AI might have, such as mass copying themselves over the internet.)
In general, the power(intelligence) curve is a big crux for me that we can’t fill in with data points yet (of course intelligence is also spiky). Imo we also have no idea where takeover-level intelligence lies, what shape of intelligence enables takeover, or what maximum AI would look like.
What do you mean by soft limit?
I don’t think it makes sense to be confidently optimistic about this (the offense-defense balance) given the current state of research. I looked into this topic some time ago with Sammy Martin. I think hardly anyone in the research community has a plan for how the blue team would actually stop the red team. Particularly worrying is that in several domains offense looks to have the advantage (e.g. bioweapons, cybersecurity), and that defense would need to play by the rules, hugely hindering its ability to act. See also e.g. this post.
Since most people who actually thought about this seem to arrive at the conclusion that offense would win, I think being confident that defense would win seems off. What are your arguments?
Currently, we observe that leading models get open-sourced roughly half a year later. It’s not a stretch to assume this will also happen to takeover-level AI. If we assume such AI will look like LLM agents, it would be relevant to know the probability that such an agent, somewhere on earth, would try to take over.
Let’s assume someone, somewhere, will be really annoyed with all the safeguards and remove them, so that their LLM will have a 99% probability of just doing as it’s told, even if that might be highly unethical. Let’s furthermore assume an LLM-based agent will need to take 20 unethical actions to actually take over (the rest of the required actions won’t look particularly unethical to the low-level LLMs executing them, in our scenario). In this case, there would be a 0.99^20 ≈ 82% chance that an LLM-based agent takes over, for any bad actor giving it this prompt.
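To make the compounding explicit, here is a minimal sketch of that back-of-the-envelope arithmetic. The 99% per-step compliance and the 20 unethical steps are the assumptions stated above; treating the steps as independent is an extra simplification on my part.

```python
# Minimal sketch of the back-of-the-envelope arithmetic above.
# Assumptions (for illustration): 20 unethical steps, each complied with
# independently at a 99% probability.
p_comply = 0.99          # per-step probability the safeguard-stripped LLM just does as it's told
n_unethical_steps = 20   # unethical actions the takeover plan requires

p_takeover = p_comply ** n_unethical_steps
print(f"P(all {n_unethical_steps} unethical steps executed) = {p_takeover:.0%}")  # ~82%
```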
I’d be less worried if it were extremely difficult, and required lots of resources, to get LLMs to take unethical actions when asked to. For example, if safety against jailbreaking were highly robust, and even adversarial fine-tuning of open-source LLMs wouldn’t break it.
Is that something you see on the horizon?
I mean, it would be perfectly intent-aligned: it carries out its orders to the letter. The only problem is that carrying out its orders involves a takeover. So no, I don’t mean its own goal, but a goal someone gave it.
I guess it’s a bit different in the sense that instrumental convergence states that all goals will lead to power-seeking subgoals. This statement is less strong; it just says that some goals will lead to power-seeking behaviour.
Argument against recursive self-improvement: you need algorithms, compute, and data for AI. Self-improvement only works on the algorithms.
Maybe self-improvement works, but only up to a ceiling determined by compute and data, which may be << superintelligence.