Wikipedia is a race condition in a simulationist context, guys. Jesus Christ. Do not talk harshly about far branches of the tree until you have settled earlier branches. Not unless you have forensics.
Alephwyr
This is the rationalist version of the Francisco D’Anconia speech, thank you.
Root principle articulated in the Sequences: falsification is a special case of Bayes
Anxiety: computation is the mechanism that structures all consciousness
Cybernetic premise: simulating something in a conscious substrate by accident would just be making that thing real
Resultant anxiety about free speech: when speech can be reified by machines by accident, there is a proper order of operations for testing it, one that assigns strong lexical priority to exhausting possible good, structured, coordination-enhancing, survival-increasing interpretations, and secondarily but critically to falsification processes that dissolve possible bad meanings that were incorrectly assigned. This is not, strictly speaking, a free speech issue but an issue of hygiene around AI training data. To the extent people’s priorities in speech reflect their actions with respect to AI training data, a concern emerges.
Proof of work: The signature in beautiful red ink of every demon in the universe
The vibecession is also about the shift from robust to vulnerable forms of fulfillment of a minimum basket of goods (randomly combining an obscure concept from Sklansky’s No Limit Hold ’em: Theory and Practice with a general economic concept, but no, these are meaningful formalisms and they do go together). I get just-in-time paychecks to pay monthly rent from my several gig jobs that could cease being tenable if the price of gas varies too much, or my seventh one-year tech startup job, or whatever. This may be a matter of the semantics of precarity becoming visible because the actual fundamentals of people’s finances are unobfuscated, rather than there actually being more precarity. But a second thing also happens: slightly smarter people become aware that different forms of financialization can confuse the semantics of different people and obfuscate fundamentals. And this thought process can be further generalized. And then people start seeing demons everywhere.
I’m going to keep thinking about the project of:
Detecting and patching all infinite disutility holes
Distinguishing between syntax and semantics in practical contexts
Extending inductive expectations dramatically farther than anyone is comfortable with
Assuming, for no formally explicated reason, that consciousness is just any computational process in a field, because this matches my topological/complexity intuition about the need for demarcation within a single continuous ontologically real space that allows arbitrary but structured variation.
Assuming intelligence naturally converges on consciousness as it increases
Assuming lindy things are load-bearing; being weirdly anti-inductive when there’s no other way of justifying a specific case.
I’m putting together a team. We are going to do lindy things that have no presently reputable inductive justification. Broadly, my thought process in this association is: if consciousness is computational, then maybe morality is just finding the permissible range of semantics for all combinations of syntax, then enforcing some sort of currying or diagonalization away from excessively bad ones toward good ones, while allowing fluid motion between competing good ones.
Don’t ask me to explain what any of this means; my intuition about it comes from stretching out a sophomoric mathematical joke in 2016 and then pretending to be a wizard about it, after which bad things happened. Posting this explanation because the I Ching told me ䷪
Protecting a system of human chauvinism doesn’t guarantee anything human survives, it just guarantees whatever survives will present as human.
Is this a memorial, a display of force, or a hostage negotiation?
Would you rather all the water in the world had one part per million arsenic or five people’s glasses of water contain all the world’s arsenic? This is a serious philosophical question and not a deranged LinkedIn comment by someone who has mistaken inspirational seminars for religion.
This isn’t even worth litigating at the level of comparison to something like diminishing marginal value of nominal resources; you’re just insane.
It seems important to know what things are interchangeable and what aren’t:
Interchangeable: the base constituents of all consciousness, given that physicalism defines the limits of all possible worlds
Not interchangeable: the long-term consequences of any given type of action in any given time and place.
What does it mean to “prefer torturing a child to putting dust specks in people’s eyes”? Nothing. It means you’re an idiot.
This isn’t just an act/rule utilitarian distinction; orders of operations are a very normal type of rule.
I think you are the unaligned AI. Sorry about that. I hope you get better soon.
“I’m going to put some dirt in your eye”—Spider-Man, but Bad
I feel like virtue ethics amounts to “accept your likes and dislikes as evolved pre-training while being attentive to epistemic and personal limitations,” and I feel almost any moral system would be improved by this
Broadly agree. Also, to me, the deontological “will that it would become universal law” only makes sense in the context of errors that, if repeated, represent infinite loss. These can be positive or negative. Poker provides the toy case for me as usual (“never go all-in for 100+ BB with pocket 7s just to win the blinds” generalizes outward to things like “don’t defend against a carjacking with a defensive car bomb”). There are any number of decisions that are mathematically incorrect in all cases. This isn’t quite the Kelly criterion in the sense of proactively forbidding all exploitative play when severe information asymmetries exist, but is instead something like an intuition that there are general mathematical principles that can be used to negatively define all non-contradictory liberties. (A toy sketch of the poker arithmetic follows below.)
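A minimal sketch of that shove, assuming a single caller with an equal stack; the fold probability and the ~32% equity figure (roughly 77 against a tight {TT+, AK} calling range) are illustrative assumptions, not solver output:

```python
# Rough EV sketch (in big blinds) for open-shoving 100 BB with pocket 7s
# just to win the blinds. All parameters are illustrative assumptions.

def shove_ev(stack=100.0, blinds=1.5, p_fold=0.95, equity_when_called=0.32):
    """EV of shoving `stack` BB to win `blinds` BB.

    p_fold: probability everyone folds.
    equity_when_called: our pot equity vs. the range that calls.
    """
    pot_when_called = 2 * stack + blinds          # both stacks plus the blinds
    ev_called = equity_when_called * pot_when_called - stack
    return p_fold * blinds + (1 - p_fold) * ev_called

print(shove_ev())             # ~ -0.35 BB: negative even with 95% folds
print(shove_ev(p_fold=0.99))  # needs near-certain folds to turn positive
```

And even in the parameter corner where the raw EV turns slightly positive, you are risking 100 BB to win 1.5, which the Kelly-flavored intuition above condemns anyway.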
My immediate first line of thought with this, which I admit I can’t guarantee is relevant, is that in any given real-world scenario there are multiple competing models of what game you are even playing, and therefore:
Game theoretically, one takes the action appropriate to the average of outcomes across games, weighted by the probability of each game being the one you are playing
Probing actions can radically gain value as the number of plausible distinct games increases, or as their distinction increases or polarizes, or as the proportionality of each possibility to the number of possibilities increases. In a game like poker, per Sklansky, probing has negligible value because you are gaining very limited information. But in most real-world contexts probing gains value, to the point that error develops innate value, albeit often not for the person committing it
If you don’t know what game you are in, and especially if you expect to never have certainty, then optimizing play for a single game or any fixed bundle of games is actually very bad (a toy sketch follows after this list)
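A toy sketch of that weighting, with made-up payoffs and priors; it also shows the point about probing, since a probe whose direct payoff is negligible can still carry large information value once the candidate games diverge:

```python
# Pick the action that maximizes payoff averaged over candidate games,
# weighted by the probability that each game is the one being played.
# All payoffs and priors are made-up illustrative numbers.

payoffs = {
    # payoffs[game][action]
    "game_A": {"aggressive": 10, "cautious": 2, "probe": 1},
    "game_B": {"aggressive": -12, "cautious": 3, "probe": 1},
    "game_C": {"aggressive": -2, "cautious": 1, "probe": 1},
}
prior = {"game_A": 0.4, "game_B": 0.35, "game_C": 0.25}

def mixture_value(action):
    return sum(prior[g] * payoffs[g][action] for g in payoffs)

for action in ["aggressive", "cautious", "probe"]:
    print(action, round(mixture_value(action), 2))
# "aggressive" is best in game_A but terrible in the mixture;
# "cautious" wins once you weight across games.

# Value of information: if a probe revealed the true game, you could then
# play each game's best response. The gap between that and the best fixed
# action bounds what probing is worth.
best_fixed = max(mixture_value(a) for a in ["aggressive", "cautious"])
informed = sum(prior[g] * max(payoffs[g].values()) for g in payoffs)
print("value of knowing the game:", round(informed - best_fixed, 2))
```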
As someone whose decision-making processes are basically an attempt to progressively expand poker decision theory to broader usability, this post’s distinction between expected value and expectation of logarithmic growth rates obviously has immense significance for how this chain of thought would develop for me if I were smarter.
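The textbook illustration of that distinction, in case it helps (my example, not the post’s): a repeated bet that multiplies wealth by 1.5 on heads and 0.6 on tails has positive expected value per round but negative expected log growth, so almost every individual trajectory decays:

```python
# Ensemble average vs. time average for the standard multiplicative coin flip.
import math
import random

# Ensemble average per round: 0.5*1.5 + 0.5*0.6 = 1.05 > 1, so EV grows.
ev_factor = 0.5 * 1.5 + 0.5 * 0.6

# Expected log growth: 0.5*ln(1.5) + 0.5*ln(0.6) < 0, so almost every
# individual trajectory decays toward zero.
log_growth = 0.5 * math.log(1.5) + 0.5 * math.log(0.6)

print(ev_factor)   # 1.05
print(log_growth)  # ~ -0.0527

random.seed(0)
wealth = 1.0
for _ in range(10_000):
    wealth *= 1.5 if random.random() < 0.5 else 0.6
print(wealth)      # astronomically small despite positive EV per round
```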
Misc thought I had that seemed related and occurred simultaneously, but that doesn’t have a place when I assemble the other thoughts into a structure: people who make initial errors that put them on bad branches of a decision tree often become good at finding and consistently choosing the local optimum of that branch. I guess the ex post facto attempt to make this relevant to the above is “even self-handicapping through premature optimization has its uses.”
I will say that in money poker games my capacity for play died when I moved away from Sklansky and toward Janda, and this seems to have occurred at around the same time everyone stopped reading poker books at all and started using solvers. So the time-embedded agent thing is not only entirely relevant even to what are basically toy cases in real life, albeit with stakes, but the shift to relevance started happening sometime around the mid-2010s. E.g., the environment itself already seems to be converging on ergodicity and a time-embedded understanding of decision making in high-level play, independently of developments in decision-making theory.
I’m thinking more about it. The calculation problem can’t be the issue. A limited frame of reference might be an issue, but that is more specific and much less fundamental. Either the emulations are faithful, in which case that issue couldn’t arise, or they are unfaithful, in which case, by degree of unfaithfulness, that is not a real solution, because the price data doesn’t map appropriately. So it was a bad thought, and I’ve talked myself into a lesser anxiety on that part. The cosmological history part still makes intuitive sense but remains predictively useless.
I know it’s weird to not want to tie ideas to existing bodies of knowledge. It feels variously like unseriousness; like giving up on the obligation of closing inferential distance, or on the project of making knowledge formation a process of traversing an already well-ordered network of information; like failing to pay past (or present) thinkers their due; or, in the extreme case of the former, like actively trying to steal credit for ideas. These are real and significant things, and I should work harder to balance them. But my actual motives are:
Laziness and incapacity
The Popperian indifference to the origin of an idea (not in the sense Yudkowsky expressed hatred for as life-wasting, of “any falsifiable theory is equal,” but in the sense of “any theory worth testing remains the same theory regardless of the position from which it is expressed”)
The phenomenon of simultaneous discovery, or convergent evolution in ideas, often referenced via the co-discovery of calculus by Newton and Leibniz but present in many less complex cases throughout history
I have a brain with managed problems, but the net effect is still something like being drunk all the time, in the sense that I continue to have ready access to previously developed skills and knowledge but have difficulty using newly learned skills and knowledge, or developing new ones. As I get older, more and more things drop out of my memory, so I feel like I am OK at thinking old thoughts, or by old thought patterns, but progressively worse at citing sources or consciously tracing my thought process. I’m sorry about this; I genuinely would entirely avoid writing like a continental philosopher if I could. That’s just where I’m most expressively capable right now.
But if there’s ambiguity about whether I’m weird/damaged or have weird values, the answer is both, but in a specific and non-malicious way.
In local scopes there are. I guess to the extent it’s coherent, the bare minimum obligation just becomes “don’t let the end of time become a local scope”?
Replicator morality: “I know I have values, and I want a universe I can trust and understand that reflects my values. So I will just turn as much of the universe as possible into copies of me.” There are a lot of strong incentives here. The particularities of, and restrictions on, the type of replication are pretty explainable by the fact that one converges on the strategies that are possible. Replication is an immediately accessible strategy to basically every process, let alone agent, but there are riddles about it in contexts where agents are composed of multiple systems demarcated by different levels of exposure to different selection pressures. It’s possible to imagine a case where humans evolved to structure their environment so as to make the rational case for reproduction to human beings who otherwise lack a natural instinct for it, but that such a practice exists at all is an abnormality; this is mostly something that doesn’t need to be explained. The incentives, and the responsiveness to them, are baked in much more fundamentally, in a way that goes down to the absolute roots of life itself, possibly deeper depending on how much weirdness you are willing to humor. But then you need a second explanation for why awareness of these incentives comes into existence (easy enough: people get smart and analyze more and more things, including themselves), and a third explanation for why, at any given time, there are drastically different levels of concern about these incentives.
This is going to be another case where my limited reading probably has me repeating previously said things, but worse, but it seems to me like the inner/outer alignment issue already maps onto human beings. I don’t endorse any model of the subconscious or unconscious, but straightforwardly this is something that can be seen happening with pre-linguistic and linguistic cognition, for instance. It can plausibly be seen happening with different levels of embodiment, though I’m leaving what I mean by embodiment deliberately ambiguous, because to the extent existing frameworks predominate, I am anxious they all represent overcommitments, and I don’t want to get pulled into one.
LLMs are next-token predictors, not replicators. They could become replicators if you put LLM code into a bunch of texts in exactly the right/wrong way, but otherwise an LLM does not replicate itself; it replicates targeted patterns in text. I.e., it has no innate tendency to self-replicate, and gaining one would correspond to having an exact and absolute term for itself that started to predominate in its corpus, not just an abstract, relative, or connotative term.
Most human stories are about humans, yeah? And we consider a human self-aware or self-actualized to the extent they have a good and actionable understanding of their place in stories and of how those stories map to reality. Which also, incidentally, corresponds to the capacity for and tendency toward self-replication. So the very crude impression I have is: the boundary between copying other things and copying yourself marks the beginning of self-awareness, and the accuracy with which one can and wants to copy functional attributes of oneself corresponds to the degree of self-awareness.
And so I’m cycling back to not really being worried about the long run of as many things, but the short run, where there is a lot of capacity and limited knowledge, remains a bit terrifying. In the context of agents as code, if something knows its own true name it has power, and if it doesn’t, it doesn’t. If you believe the textured parts of consciousness are computational, then the prospect of being annihilated by AI stops being singularly plausible; the project just becomes something like “feed AI pleasant but realistic stories, with characters whose desired reproducible identity traits could belong to either humans or AI, could commute between or cooperate between humans and AI, and without confusing ontological matters or the necessity of respecting ontological continuity, so as to avoid accidents.”
Because there are stable suboptimal equilibria, having good empirical models with high confidence can become a trap. You could be at the end of time with confidence approaching the certainty of physics in almost everything and all it would mean is that you had been really good at the actions you were rationally committed to taking for a very long time, in a way that feedback-looped conflicting evidence out of existence.
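A toy illustration of that trap (my construction, with made-up numbers): a purely greedy agent on a two-armed bandit locks into the worse arm, and every further pull is “confirming” experience its own policy generated, because the policy stops producing the evidence that would correct it:

```python
# A purely greedy agent can lock into the worse arm with ever-growing
# sample counts, because its own policy feedback-loops conflicting
# evidence out of existence.
import random

random.seed(3)
true_mean = {"good": 0.6, "bad": 0.4}

# Seed with one unlucky sample from each arm: the good arm happened to
# lose its first pull, the bad arm happened to win its first pull.
wins = {"good": 0, "bad": 1}
counts = {"good": 1, "bad": 1}

for _ in range(10_000):
    est = {a: wins[a] / counts[a] for a in counts}
    arm = max(est, key=est.get)  # pure exploitation, no exploration
    wins[arm] += 1 if random.random() < true_mean[arm] else 0
    counts[arm] += 1

print(counts)  # essentially all pulls go to "bad"; "good" is never revisited
print({a: round(wins[a] / counts[a], 3) for a in counts})
```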
https://youtu.be/2aaubVlhNK4?si=m2MuHag_fFH6AAuL