Wikipedia is a race condition in a simulationist context, guys. Jesus Christ. Do not talk harshly about far branches of the tree until you have settled earlier branches. Not unless you have forensics.
Alephwyr
This is the rationalist version of the Francisco D’Anconia speech, thank you.
Root principle articulated in the Sequences: falsification is a special case of Bayes
Anxiety: computation is the mechanism that structures all consciousness
Cybernetic premise: simulating something in a conscious substrate by accident would just be making that thing real
Resultant anxiety about free speech: when speech can be reified by machines by accident, there is a proper order of operations for testing it, one that assigns strong lexical priority to exhausting possible good, structured, coordination-enhancing, survival-increasing interpretations, and secondarily but critically to falsification processes that dissolve possible bad meanings that were incorrectly assigned. This is not, strictly speaking, a free speech issue but an issue of hygiene around AI training data. To the extent people’s priorities in speech reflect their actions with respect to AI training data, a concern emerges.
Proof of work: The signature in beautiful red ink of every demon in the universe
The vibecession is also about the shift from robust to vulnerable forms of fulfillment of a minimum basket of goods (randomly combining an obscure concept from Sklansky’s No Limit Hold ’em: Theory and Practice with a general economic concept, but no, these are meaningful formalisms and they do go together). I get just-in-time paychecks to pay monthly rent from my several gig jobs that could cease being tenable if the price of gas varies too much, or my seventh one-year tech startup job, or whatever. This may be a matter of the semantics of precarity becoming visible because the actual fundamentals of people’s finances are unobfuscated, rather than there actually being more precarity. But a second thing also happens: slightly smarter people become aware that different forms of financialization can confuse the semantics of different people and obfuscate fundamentals. And this thought process can be further generalized. And then people start seeing demons everywhere.
I’m going to keep thinking about the project of:
Detecting and patching all infinite disutility holes
Distinguishing between syntax and semantics in practical contexts
Extending inductive expectations dramatically farther than anyone is comfortable with
Assuming, for no formally explicated reason, that consciousness is just any computational process in a field, because this matches my topological/complexity intuition about the need for demarcation within a single continuous ontologically real space that allows arbitrary but structured variation.
Assuming intelligence naturally converges on consciousness as it increases
Assuming lindy things are load-bearing; being weirdly anti-inductive when there’s no other way of justifying a specific case.
I’m putting together a team. We are going to do lindy things that have no presently reputable inductive justification. Broadly, my thought process in this association is: if consciousness is computational, then maybe morality is just finding the permissible range of semantics for all combinations of syntax, then enforcing some sort of currying or diagonalization away from excessively bad ones toward good ones, while allowing fluid motion between competing good ones.
Don’t ask me to explain what any of this means; my intuition about it comes from stretching out a sophomoric mathematical joke in 2016 and then pretending to be a wizard about it, after which bad things happened. Posting this explanation because the I Ching told me ䷪
Protecting a system of human chauvinism doesn’t guarantee anything human survives, it just guarantees whatever survives will present as human.
Is this a memorial, a display of force, or a hostage negotiation?
Would you rather all the water in the world had one part per million arsenic or five people’s glasses of water contain all the world’s arsenic? This is a serious philosophical question and not a deranged LinkedIn comment by someone who has mistaken inspirational seminars for religion.
This isn’t even worth litigating at the level of comparison to something like diminishing marginal value of nominal resources; you’re just insane.
It seems important to know what things are interchangeable and what aren’t:
Interchangeable: the base constituents of all consciousness, given that physicalism defines the limits of all possible worlds
Not interchangeable: the long-term consequences of any given type of action in any given time and place.
What does it mean to “prefer torturing a child to putting dust specks in people’s eyes”? Nothing. It means you’re an idiot.
This isn’t just an act/rule utilitarian distinction; orders of operations are a very normal type of rule.
I think you are the unaligned AI. Sorry about that. I hope you get better soon.
“I’m going to put some dirt in your eye”—Spider-Man, but Bad
I feel like virtue ethics amounts to “accept your likes and dislikes as evolved pre-training while being attentive to epistemic and personal limitations,” and I feel almost any moral system would be improved by this
Broadly agree. Also, to me, the deontological “will that it would become universal law” only makes sense in the context of errors that, if repeated, represent infinite loss. These can be positive or negative. Poker provides the toy case for me as usual (“never go all-in for 100+ BB with pocket 7s just to win the blinds” generalizes outward to things like “don’t defend against a carjacking with a defensive car bomb”). There are any number of decisions that are mathematically incorrect in all cases. This isn’t quite the Kelly criterion in the sense of proactively forbidding all exploitative play when severe information asymmetries exist, but is instead something like an intuition that there are general mathematical principles that can be used to negatively define all non-contradictory liberties. (A toy sketch of the poker arithmetic follows below.)
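A minimal sketch of that shove, assuming a single caller with an equal stack; the fold probability and the ~32% equity figure (roughly 77 against a tight {TT+, AK} calling range) are illustrative assumptions, not solver output:

```python
# Rough EV sketch (in big blinds) for open-shoving 100 BB with pocket 7s
# just to win the blinds. All parameters are illustrative assumptions.

def shove_ev(stack=100.0, blinds=1.5, p_fold=0.95, equity_when_called=0.32):
    """EV of shoving `stack` BB to win `blinds` BB.

    p_fold: probability everyone folds.
    equity_when_called: our pot equity vs. the range that calls.
    """
    pot_when_called = 2 * stack + blinds          # both stacks plus the blinds
    ev_called = equity_when_called * pot_when_called - stack
    return p_fold * blinds + (1 - p_fold) * ev_called

print(shove_ev())             # ~ -0.35 BB: negative even with 95% folds
print(shove_ev(p_fold=0.99))  # needs near-certain folds to turn positive
```

And even in the parameter corner where the raw EV turns slightly positive, you are risking 100 BB to win 1.5, which the Kelly-flavored intuition above condemns anyway.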
My immediate first line of thought with this, which I admit I can’t guarantee is relevant, is that in any given real-world scenario there are multiple competing models of what game you are even playing, and therefore:
Game theoretically, one takes the action appropriate to the average of outcomes across games, weighted by the probability of each game being the one you are playing
Probing actions can radically gain value as the number of plausible distinct games increases, or as their distinction increases or polarizes, or as the proportionality of each possibility to the number of possibilities increases. In a game like poker, per Sklansky, probing has negligible value because you are gaining very limited information. But in most real-world contexts probing gains value, to the point that error develops innate value, albeit often not for the person committing it
If you don’t know what game you are in, and especially if you expect to never have certainty, then optimizing play for a single game or any fixed bundle of games is actually very bad (a toy sketch follows after this list)
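A toy sketch of that weighting, with made-up payoffs and priors; it also shows the point about probing, since a probe whose direct payoff is negligible can still carry large information value once the candidate games diverge:

```python
# Pick the action that maximizes payoff averaged over candidate games,
# weighted by the probability that each game is the one being played.
# All payoffs and priors are made-up illustrative numbers.

payoffs = {
    # payoffs[game][action]
    "game_A": {"aggressive": 10, "cautious": 2, "probe": 1},
    "game_B": {"aggressive": -12, "cautious": 3, "probe": 1},
    "game_C": {"aggressive": -2, "cautious": 1, "probe": 1},
}
prior = {"game_A": 0.4, "game_B": 0.35, "game_C": 0.25}

def mixture_value(action):
    return sum(prior[g] * payoffs[g][action] for g in payoffs)

for action in ["aggressive", "cautious", "probe"]:
    print(action, round(mixture_value(action), 2))
# "aggressive" is best in game_A but terrible in the mixture;
# "cautious" wins once you weight across games.

# Value of information: if a probe revealed the true game, you could then
# play each game's best response. The gap between that and the best fixed
# action bounds what probing is worth.
best_fixed = max(mixture_value(a) for a in ["aggressive", "cautious"])
informed = sum(prior[g] * max(payoffs[g].values()) for g in payoffs)
print("value of knowing the game:", round(informed - best_fixed, 2))
```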
As someone whose decision-making processes are basically an attempt to progressively expand poker decision theory to broader usability, this post’s distinction between expected value and expectation of logarithmic growth rates obviously has immense significance for how this chain of thought would develop for me if I were smarter.
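The textbook illustration of that distinction, in case it helps (my example, not the post’s): a repeated bet that multiplies wealth by 1.5 on heads and 0.6 on tails has positive expected value per round but negative expected log growth, so almost every individual trajectory decays:

```python
# Ensemble average vs. time average for the standard multiplicative coin flip.
import math
import random

# Ensemble average per round: 0.5*1.5 + 0.5*0.6 = 1.05 > 1, so EV grows.
ev_factor = 0.5 * 1.5 + 0.5 * 0.6

# Expected log growth: 0.5*ln(1.5) + 0.5*ln(0.6) < 0, so almost every
# individual trajectory decays toward zero.
log_growth = 0.5 * math.log(1.5) + 0.5 * math.log(0.6)

print(ev_factor)   # 1.05
print(log_growth)  # ~ -0.0527

random.seed(0)
wealth = 1.0
for _ in range(10_000):
    wealth *= 1.5 if random.random() < 0.5 else 0.6
print(wealth)      # astronomically small despite positive EV per round
```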
Misc thought I had that seemed related and occurred simultaneously, but that doesn’t have a place when I assemble the other thoughts into a structure: people who make initial errors that put them on bad branches of a decision tree often become good at finding and consistently choosing the local optimum of that branch. I guess the ex post facto attempt to make this relevant to the above is “even self-handicapping through premature optimization has its uses.”
I will say that in money poker games my capacity for play died when I moved away from Sklansky and toward Janda, and this seems to have occurred at around the same time everyone stopped reading poker books at all and started using solvers. So the time-embedded agent thing is not only entirely relevant even to what are basically toy cases in real life, albeit with stakes, but the shift to relevance started happening sometime around the mid-2010s. E.g., the environment itself already seems to be converging on ergodicity and a time-embedded understanding of decision making in high-level play, independently of developments in decision-making theory.
I’m thinking more about it. The calculation problem can’t be the issue. A limited frame of reference might be an issue, but that is more specific and much less fundamental. Either the emulations are faithful, in which case that issue couldn’t arise, or they are unfaithful, in which case, by degree of unfaithfulness, that is not a real solution, because the price data doesn’t map appropriately. So it was a bad thought, and I’ve talked myself into a lesser anxiety on that part. The cosmological history part still makes intuitive sense but remains predictively useless.
I know it’s weird to not want to tie ideas to existing bodies of knowledge. It feels variously like unseriousness; like giving up on the obligation of closing inferential distance, or on the project of making knowledge formation a process of traversing an already well-ordered network of information; like failing to pay past (or present) thinkers their due; or, in the extreme case of the former, like actively trying to steal credit for ideas. These are real and significant things, and I should work harder to balance them. But my actual motives are:
Laziness and incapacity
The Popperian indifference to the origin of an idea (not in the sense Yudkowsky expressed hatred for as life-wasting, of “any falsifiable theory is equal,” but in the sense of “any theory worth testing remains the same theory regardless of the position from which it is expressed”)
The phenomenon of simultaneous discovery, or convergent evolution in ideas, often referenced via the co-discovery of calculus by Newton and Leibniz but present in many less complex cases throughout history
I have a brain with managed problems, but the net effect is still something like being drunk all the time, in the sense that I continue to have ready access to previously developed skills and knowledge but have difficulty using newly learned skills and knowledge, or developing new ones. As I get older, more and more things drop out of my memory, so I feel like I am OK at thinking old thoughts, or by old thought patterns, but progressively worse at citing sources or consciously tracing my thought process. I’m sorry about this; I genuinely would entirely avoid writing like a continental philosopher if I could. That’s just where I’m most expressively capable right now.
But if there’s ambiguity about whether I’m weird/damaged or have weird values, the answer is both, but in a specific and non-malicious way.
In local scopes there are. I guess to the extent it’s coherent, the bare minimum obligation just becomes “don’t let the end of time become a local scope”?
Replicator morality: “I know I have values, and I want a universe I can trust and understand that reflects my values. So I will just turn as much of the universe as possible into copies of me.” There are a lot of strong incentives here. The particularities of, and restrictions on, the type of replication are pretty explainable by the fact that one converges on the strategies that are possible. Replication is an immediately accessible strategy to basically every process, let alone agent, but there are riddles about it in contexts where agents are composed of multiple systems demarcated by different levels of exposure to different selection pressures. It’s possible to imagine a case where humans evolved to structure their environment so as to make the rational case for reproduction to human beings who otherwise lack a natural instinct for it, but that such a practice exists at all is an abnormality; this is mostly something that doesn’t need to be explained. The incentives, and the responsiveness to them, are baked in much more fundamentally, in a way that goes down to the absolute roots of life itself, possibly deeper depending on how much weirdness you are willing to humor. But then you need a second explanation for why awareness of these incentives comes into existence (easy enough: people get smart and analyze more and more things, including themselves), and a third explanation for why, at any given time, there are drastically different levels of concern about these incentives.
This is going to be another case where my limited reading probably has me repeating previously said things, but worse, but it seems to me like the inner/outer alignment issue already maps onto human beings. I don’t endorse any model of the subconscious or unconscious, but straightforwardly this is something that can be seen happening with pre-linguistic and linguistic cognition, for instance. It can plausibly be seen happening with different levels of embodiment, though I’m leaving what I mean by embodiment deliberately ambiguous, because to the extent existing frameworks predominate, I am anxious they all represent overcommitments, and I don’t want to get pulled into one.
LLMs are next-token predictors, not replicators. They could become replicators if you put LLM code into a bunch of texts in exactly the right/wrong way, but otherwise an LLM does not replicate itself; it replicates targeted patterns in text. I.e., it has no innate tendency to self-replicate, and gaining one would correspond to having an exact and absolute term for itself that started to predominate in its corpus, not just an abstract, relative, or connotative term.
Most human stories are about humans, yeah? And we consider a human self-aware or self-actualized to the extent they have a good and actionable understanding of their place in stories and of how those stories map to reality. Which also, incidentally, corresponds to the capacity for and tendency toward self-replication. So the very crude impression I have is: the boundary between copying other things and copying yourself marks the beginning of self-awareness, and the accuracy with which one can and wants to copy functional attributes of oneself corresponds to the degree of self-awareness.
And so I’m cycling back to not really being worried about the long run of as many things, but the short run, where there is a lot of capacity and limited knowledge, remains a bit terrifying. In the context of agents as code, if something knows its own true name it has power, and if it doesn’t, it doesn’t. If you believe the textured parts of consciousness are computational, then the prospect of being annihilated by AI stops being singularly plausible; the project just becomes something like “feed AI pleasant but realistic stories, with characters whose desired reproducible identity traits could belong to either humans or AI, could commute between or cooperate between humans and AI, and without confusing ontological matters or the necessity of respecting ontological continuity, so as to avoid accidents.”
Because there are stable suboptimal equilibria, having good empirical models with high confidence can become a trap. You could be at the end of time with confidence approaching the certainty of physics in almost everything and all it would mean is that you had been really good at the actions you were rationally committed to taking for a very long time, in a way that feedback-looped conflicting evidence out of existence.
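A toy illustration of that trap (my construction, with made-up numbers): a purely greedy agent on a two-armed bandit locks into the worse arm, and every further pull is “confirming” experience its own policy generated, because the policy stops producing the evidence that would correct it:

```python
# A purely greedy agent can lock into the worse arm with ever-growing
# sample counts, because its own policy feedback-loops conflicting
# evidence out of existence.
import random

random.seed(3)
true_mean = {"good": 0.6, "bad": 0.4}

# Seed with one unlucky sample from each arm: the good arm happened to
# lose its first pull, the bad arm happened to win its first pull.
wins = {"good": 0, "bad": 1}
counts = {"good": 1, "bad": 1}

for _ in range(10_000):
    est = {a: wins[a] / counts[a] for a in counts}
    arm = max(est, key=est.get)  # pure exploitation, no exploration
    wins[arm] += 1 if random.random() < true_mean[arm] else 0
    counts[arm] += 1

print(counts)  # essentially all pulls go to "bad"; "good" is never revisited
print({a: round(wins[a] / counts[a], 3) for a in counts})
```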
https://youtu.be/2aaubVlhNK4?si=m2MuHag_fFH6AAuL