I find this worrying. If social dynamics have introduced such a substantial freak-out-ness about these kinds of issues, it's hard to evaluate their true probability. If s-risks are indeed likely then I, as a potential victim of horrific suffering worse than any human has ever experienced, would want to be able to reasonably evaluate their probability.
Anirandis
Moreover, this Semiotic–Simulation Theory has increased my credence in the absurd science-fiction tropes that the AI Alignment community has tended to reject, and thereby increased my credence in s-risks.
The potential consequences of this are harrowing—it feels strange how non-seriously this is being taken if there’s a conceivable path to s-risk here. Is there a reason for the alignment community seeming almost indifferent?
What does the distribution of these non-death dystopias look like? There’s an enormous difference between 1984 and maximally efficient torture; for example, do you have a rough guess of what the probability distribution looks like if you condition on an irreversibly messed up but non-death future?
I’m a little confused by the agreement votes with this comment—it seems to me that the consensus around here is that s-risks in which currently-existing humans suffer maximally are very unlikely to occur. This seems an important practical question; could the people who agreement-upvoted elaborate on why they find this kind of thing plausible?
The examples discussed in e.g. the Kaj Sotala interview linked later down the chain tend to regard things like “suffering subroutines”, for example.
I have a disturbing feeling that arguing to a future AI to "preserve humanity for Pascal's-mugging-type reasons" trades off x-risk for s-risk. I'm not sure that any of these aforementioned cases encourage the AI to maintain lives worth living.
Because you’re imagining AGI keeping us in a box? Or that there’s a substantial probability on P(humans are deliberately tortured | AGI) that this post increases?
Related: alignment tax
Presumably it’d take less manpower to review each article that the AI’s written (i.e. read the citations & make sure the article accurately describes the subjects) than it would to write articles from scratch. I’d guess this is the case even if the claims seem plausible & fact-checking requires a somewhat detailed reading through of the sources.
Cheers for the reply! :)
integrate these ideas into your mind and it's complaining loudly that you're going too fast (although it doesn't say it quite that way, I think this is a useful framing). Stepping away, focusing on other things for a while, and slowly coming back to the ideas is probably the best way to be able to engage with them in a psychologically healthy way that doesn't overwhelm you.
I do try! When thinking about this stuff starts to overwhelm me I can try to put it all on ice; usually some booze is required to be able to do that, TBH.
But of course it's also plausible that destructive conflict between aggressive civilizations leads to horrifying outcomes for us.
Also, wouldn't you expect s-risks from this to be very unlikely by virtue of (1) civilizations like this being very unlikely to have substantial measure over the universe's resources, (2) transparency making bargaining far easier, and (3) few technologically advanced civilizations caring about humans suffering in particular, as opposed to e.g. an adversary running emulations of their own species?
Since it’s my shortform, I’d quite like to just vent about some stuff.
I’m still pretty scared about a transhumanist future going quite wrong. It simply seems to me that there’s quite the conjunction of paths to “s-risk” scenarios: generally speaking, any future agent that wants to cause disvalue to us—or an empathetic agent—would bring about an outcome that’s Pretty Bad by my lights. Like, it *really* doesn’t seem impossible that some AI decides to pre-commit to doing Bad if we don’t co-operate with it; or our AI ends up in some horrifying conflict-type scenario, which could lead to Bad outcomes as hinted at here; etc. etc.
Naturally, this kind of outcome is going to be salient because it’s scary—but even then, I struggle to believe that I’m more than moderately biased. The distribution of possibilities seems somewhat trimodal: either we maintain control and create a net-positive world (hopefully we’d be able to deal with the issue of people abusing uploads of each other); we all turn to dust; or something grim happens. And the fact that some very credible people (within this community at least) also conclude that this kind of thing has reasonable probability further makes me conclude that I just need to somehow deal with these scenarios being plausible, rather than trying to convince myself that they’re unlikely. But I remain deeply uncomfortable trying to do that.
Some commentators who seem to consider such scenarios plausible, such as Paul Christiano, also subscribe to the naive view regarding energy-efficiency arguments over pleasure and suffering: that the worst possible suffering is likely no worse than the greatest possible pleasure is good. And that this may also be the case for humans. Even if this is the case, and I’m sceptical, I still feel that I’m too risk-averse. In that world I wouldn’t accept a 90% chance of eternal bliss with a 10% chance of eternal suffering. I don’t think I hold suffering-focused views; I think there’s a level of happiness that can “outweigh” even extreme suffering. But when you translate it to probabilities, I become deeply uncomfortable with even a 0.01% chance of bad stuff happening to me. Particularly when the only way to avoid this gamble is to permanently stop existing. Perhaps something an OOM or two lower and I’d be more comfortable.
I'm not immediately suicidal, to be clear. I wouldn't classify myself as 'at-risk'. But I nonetheless find it incredibly hard to find solace. There's a part of me that hopes things get nuclear, just so that a worse outcome is averted. I find it incredibly hard to care about other aspects of my life; I'm totally apathetic. I started to improve and got mid-way through the first year of my computer science degree, but I'm starting to feel like it's gotten worse. I'd quite like to finish my degree and actually meaningfully contribute to the EA movement, but I don't know if I can at this stage. I'm guessing it's a result of me becoming more pessimistic about the worst outcomes resulting in my personal torture, since that's the only real change that's occurred recently. Even before I became more pessimistic I still thought about these outcomes constantly, so I don't think it's just a case of me thinking about them more. I take sertraline but it's beyond useless. Alcohol helps, so at least there's that. I've tried quitting thinking about this kind of thing—I've spent weeks trying to shut down any instance where I thought about it. I failed.
I don’t want to hear any over-optimistic perspectives on these issues. I’d greatly appreciate any genuine, sincerely held opinions on them (good or bad), or advice on dealing with the anxiety. But I don’t necessarily need or expect a reply; I just wanted to get this out there. Even if nobody reads it. Also, thanks a fuckton to everyone who was willing to speak to me privately about this stuff.
Sorry if this type of post isn’t allowed here, I just wanted to articulate some stuff for my own sake somewhere that I’m not going to be branded a lunatic. Hopefully LW/singularitarian views are wrong, but some of these scenarios aren’t hugely dependent on an imminent & immediate singularity. I’m glad I’ve written all of this down. I’m probably going to down a bottle or two of rum and try to forget about it all now.
Thanks for the response; I’m still somewhat confused though. The question was to do with the theoretical best/worst things possible, so I’m not entirely sure whether parallels to (relatively) minor pleasures/pains are meaningful here.
Specifically I’m confused about:
Then you end up in, well, to what extent is that a debunking explanation that explains why humans, in terms of their capacity to experience joy and suffering, are unbiased but the reality is still biased
I’m not really sure what’s meant by “the reality” here, nor what’s meant by biased. Is the assertion that humans’ intuitive preferences are driven by the range of possible things that could happen in the ancestral environment & that this isn’t likely to match the maximum possible pleasure vs. suffering ratio in the future? If so, how does this lead one to end up concluding it’s worse (rather than better)? I’m not really sure how these arguments connect in a way that could lead one to conclude that the worst possible suffering is a quadrillion times as bad as the best bliss is good.
I’m not sure if this is the right place to ask this, but does anyone know what point Paul’s trying to make in the following part of this podcast? (Relevant section starts around 1:44:00)
Suppose you have a P probability of the best thing you can do and a one-minus-P probability of the worst thing you can do; what does P have to be so that you're indifferent between that and the barren universe? I think most of my probability is distributed between you would need somewhere between 50% and 99% chance of good things, and then put some probability or some credence on views where that number is a quadrillion times larger or something, in which case it's definitely going to dominate. A quadrillion is probably too big a number, but very big numbers. Numbers easily large enough to swamp the actual probabilities involved.
[ . . . ]
I think that those arguments are a little bit complicated. How do you get at these? I think, to clarify the basic position, the reason that you end up concluding it's worse is just like, consult your intuition about how bad the worst thing that can happen to a person is vs the best thing, or like, damn, the worst thing seems pretty bad. And then the first-pass response is, sort of, have this debunking understanding, where we understand causally how it is that we ended up with this kind of preference with respect to really bad stuff versus really good stuff.
If you look at what happens over evolutionary history: what is the range of things that can happen to an organism, and how should an organism be trading off best possible versus worst possible outcomes? Then you end up in, well, to what extent is that a debunking explanation that explains why humans, in terms of their capacity to experience joy and suffering, are unbiased but the reality is still biased, versus to what extent is this fundamentally reflected in our preferences about good and bad things. I think it's just a really hard set of questions. I could easily imagine maybe shifting on them with much more deliberation.
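Spelling out the arithmetic behind the gamble Paul describes (my reconstruction; here $B$ is the magnitude of the best possible outcome and $k$ is how many times worse the worst outcome is than the best is good):

```latex
% Indifference between the gamble and the barren (zero-value) universe:
%   p * B - (1 - p) * k * B = 0
\[
  pB = (1-p)\,kB \quad\Longrightarrow\quad p = \frac{k}{1+k}.
\]
```

For $k \approx 10^{15}$ (a quadrillion), this gives $p \approx 1 - 10^{-15}$: you would need near-certainty of the good outcome before the gamble beats the barren universe, which is the sense in which very large $k$ "swamps the actual probabilities involved".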
It seems like an important topic but I’m a bit confused by what he’s saying here. Is the perspective he’s discussing (and puts non-negligible probability on) one that states that the worst possible suffering is a bajillion times worse than the best possible pleasure, and wouldn’t that suggest every human’s life is net-negative (even if your credence on this being the case is ~.1%)? Or is this just discussing the energy-efficiency of ‘hedonium’ and ‘dolorium’, in which case it’s of solely altruistic concern & can be dealt with by strictly limiting compute?
Also, I’m not really sure if this set of views is more “a broken bone/waterboarding is a million times as morally pressing as making a happy person”, or along the more empirical lines of “most suffering (e.g. waterboarding) is extremely light, humans can experience far far far far far^99 times worse; and pleasure doesn’t scale to the same degree.” Even a tiny chance of the second one being true is awful to contemplate.
we ask the AGI to "make us happy", and it puts everyone paralyzed in hospital beds on dopamine drips. It's not hard to think that after a couple hours of a good high, this would actually be a hellish existence, since human happiness is way more complex than the amount of dopamine in one's brain (but of course, Genie in the Lamp, Midas' Touch, etc.)
This sounds much better than extinction to me! Values might be complex, yeah, but if the AI is actually programmed to maximise human happiness then I expect the high wouldn’t wear off. Being turned into a wirehead arguably kills you, but it’s a much better experience than death for the wirehead!
(I’ve actually read in a popular lesswrong post about s-risks Paul clearly saying that the risk of s-risk was 1/100th of the risk of x-risk (which makes for even less than 1/100th overall). Isn’t that extremely naive, considering the whole Genie in the Lamp paradigm? How can we be so sure that the Genie will only create hell 1 time for each 100 times it creates extinction?)
I think the kind of Bostromian scenario you're imagining is a slightly different line of AI concern than the types that Paul & the soft takeoff crowd are concerned about. The whole genie in the lamp thing, to me at least, doesn't seem likely to create suffering. If this hypothetical AI values humans being alive & nothing more than that, it might separate your brain in half so that it counts as 2 humans, for example. I think most scenarios where you've got a boundless optimiser superintelligence would lead to the creation of new minds that would perfectly satisfy its utility function.
I’m way more scared about the electrode-produced smiley faces for eternity and the rest. That’s way, way worse than dying.
FWIW, it seems kinda weird to me that such an AI would keep you alive… if you had a “smile-maximiser” AI, wouldn’t it be indifferent to humans being braindead, as long as it’s able to keep them smiling?
I'd like to have Paul Christiano's view that the "s-risk-risk" is 1/100 and that AGI is 30 years off.
I think Paul’s view is along the lines of “1% chance of some non-insignificant amount of suffering being intentionally created”, not a 1% chance of this type of scenario.[1]
Could AGI arrive tomorrow in its present state?
I guess. But we'd need to come up with some AI model tomorrow, and this model suddenly becomes agentic and rapidly grows in power, and this model is designed with a utility function that values keeping humans alive but does not value humans flourishing… and even then, there'd likely be better ways for it to e.g. maximise the number of smiles in the universe, such as using artificially created minds.
Eliezer has written a bit about this, but I think he considers it a mostly solved problem.
What can I do as a 30 year old from Portugal with no STEM knowledge? Start learning math and work on alignment from home?
Probably get treatment for the anxiety and try to stop thinking about scenarios that are very unlikely, albeit salient in your mind. (I know, speaking from experience, that it’s hard to do so!)
[1]
I did, coincidentally, cold e-mail Paul a while ago to try to get his model on this type of stuff & got the following response:
“I think these scenarios are plausible but not particularly likely. I don’t think that cryonics makes a huge difference to your personal probabilities, but I could imagine it increasing them a tiny bit. If you cared about suffering-maximizing outcomes a thousand times as much as extinction, then I think it would be plausible for considerations along these lines to tip the balance against cryonics (and if you cared a million times more I would expect them to dominate). I think these risks are larger if you are less scope sensitive since the main protection is the small expected fraction of resources controlled by actors who are inclined to make such threats.”
TBH it's difficult to infer a particular estimate of one's individual probability (absent cryonics or voluntary uploading) here; it's not completely clear just how bad a scenario would have to be (for a typical biological human) in order to fall within the class of scenarios described as 'plausible but not particularly likely'.
I think the problem is very likely to be resolved by different mechanisms based on trust and physical control rather than cryptography.
Do you expect these mechanisms to also resolve the case where a biological human is forcibly uploaded in horrible conditions?
Lurker here; I’m still very distressed after thinking about some futurism/AI stuff & worrying about possibilities of being tortured. If anyone’s willing to have a discussion on this stuff, please PM!
I know I’ve posted similar stuff here before, but I could still do with some people to discuss infohazardous s-risk related stuff that I have anxieties with. PM me.
Evolution “wants” pain to be a robust feedback/control mechanism that reliably causes the desired amount of avoidance—in this case, the greatest possible amount.
I feel that there's going to be a level of pain that a mind of nearly any pain tolerance would exert 100% of its energy to avoid. I don't think I know enough to comment on how much further than this level the brain can go, but it's unclear why the brain would develop the capacity to process pain drastically more intense than that; pain is just a tool to avoid certain things, and it ceases to be useful past a certain point.
There are no cheap solutions that would have an upper cut-off to pain stimuli (below the point of causing unresponsiveness) without degrading the avoidance response to lower levels of pain.
I’m imagining a level of pain above that which causes unresponsiveness, I think. Perhaps I’m imagining something more extreme than your “extreme”?
It is to be expected that humans who are actively trying to cause pain (or to imagine how to do so) will succeed in causing amounts of pain beyond most anything found in nature.
Yeah, agreed.
I’m curious what does, in that case; and what proportion affects humans (and currently-existing people or future minds)? Things like spite threat commitments from a misaligned AI warring with humanity seem like a substantial source of s-risk to me.