I quite like this forecast from Andrew Critch on milestones in AI development; here are my reactions:
On the timeline he suggests, in ~10 years we face choice 6a/b, which implies at least three possibilities:
A) We need society-level consensus (which might be force-backed) that humans can/should control agents (or digital entities more generally) who are superior to us in all economically/militarily relevant respects. Assuming they fit within the moral circle as we currently conceive it (@davidchalmers42, @jeffrsebo, Thomas Metzinger, and Nick Bostrom / Carl Shulman have analysed this in various ways), and absent some novel claim about how AIs are different/lesser ethical beings, it is hard to see how this is essentially different from slavery or animal cruelty, something that will presumably be obvious to any AGI worth the name; or
B) We are able to engineer AI motivations to act harmlessly/subserviently in a way that is “better” than (A), which (wild guess) could be a form of open individualism or an AI-specific conception of identity (e.g. Buddhist/Hindu and some indigenous traditions have more radically inclusive conceptions of identity than the Greco-Judeo-Christian, human-centred frame that currently dominates AI ethics); or
C) We have some very solid person- and species-neutral grounds for why humans (and our ecosystem) are worth preserving, drafted in a way that is at least potentially reasonable within the ontology/value system of the most powerful/friendliest AIs.
To flesh out my thoughts on (C), I write up a ‘letter to a future AGI’ (reworking a 2023 LessWrong post by Miller, Häggström, Yampolskiy, and Miller). I suspect this approach to (C) is fundamentally flawed: we can’t predict an AI’s value system/Weltanschauung/“form of life” (depending on your philosophical frame).
Nevertheless, a ‘hail Mary’ justification from @avturchin is that we can perhaps influence proto-AGIs, which then pass on their representations (of our projections of their successors’ values) to future systems (e.g. via synthetic data or weight transfer).
I have a slight discomfort with Bostrom’s reasoning. I agree that an enormous amount of resources is potentially at stake in the future, but I struggle with putting a number on it, or even with how to think about putting a number on it. The reason is that his analysis is almost entirely anchored on value arising from human- or biological-like things, i.e. relatively small, short-lived creatures that have a particular form of agency/identity, do things in communities, and have the types of valenced states we have. He of course explicitly allows for digital beings, but at least until he explores that topic in more detail with Carl Shulman (2022), it’s not clear whether these digital beings are anything more than amped-up human-like experiences. In fact, when he writes about them with Shulman, it’s clear that they could be very different (~immortal, copyable, much larger hedonic range, mind-transparent); applying human-like standards of value to them (especially to draw large quantitative conclusions) seems risky/premature.

Now it might be that biological life like us is the only way advanced societies can come to be (attractors in evolution, habitability of planets, etc.). But it might alternatively be possible to have swarm-like societies where the individual isn’t the primary bearer of value (by our lights anyway: it might not be autonomous, have a clear identity, have a capacity to suffer, etc.). If that is a realistic possibility, how do we quantitatively weigh the value of a future filled with swarm beings against a future filled with human-like beings?
This isn’t to disagree with his qualitative point that the future could be huge and that we should be careful what we do with it, but putting numbers on it lends a veneer of quantitative precision to something we know very little about.