Mostly I think that I could just do things but I am very bad at finding ways for those things to earn me the money I need to stay alive. So I spend most of my time doing other things (and sometimes despairing at how badly those things are being organised/managed and how much effort is wasted by this).
Pretty much. There is an argument for “if you can’t provide a better option, don’t take away what people have out of self-righteousness”, and it’s hard to thread that needle exactly. But if we make everything for sale and everything available as long as there’s some marginal gain to be made, all that happens is that Moloch grinds away all margins until even the best option is pretty bad. Probably not in this case, tbf (I don’t think Qatar cares much about what Westerners think, and the salaries are determined entirely by how high they need to be to attract the workers they need, so here the argument is probably kinda fine). But overall, yeah, you would quickly get into loops such as “allow people to sell organs if they so choose → now everyone in the poorest tiers of society HAS to sell at least one kidney to survive”.
Maybe the main lesson is that it should be made clearer that constraints aren’t for the sake of the specific people who may or may not be impacted by them right now, but for the sake of shoring up principles against progressive erosion.
Good post! I generally agree with the thrust of being more flexible about what counts as rational behaviour, but regarding the specific example of the Allais paradox, do we even need to abandon EUT to explain it? In practice the bet involves money, and if we do have a utility function, it isn’t over “money” at all; it’s over whatever psychological benefits (safety, comfort, pleasure) money can afford us.
Now, on one hand we have five million compared to one million, and we know the utility of money grows somewhat sublinearly—the first million is a lot more valuable to us than the second, as we have a hard floor of physical needs. On the other hand, depending on who we are, getting nothing might cause us significant distress—regret at a missed opportunity, guilt towards our dependents. So that zero is potentially quite a large negative number. The same regret isn’t there if our choice set never included a certain option. In that framework, the use of money is what’s misleading—it doesn’t rule out a real underlying utility function; it’s just not affinely equivalent to it.
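To make that concrete, here is a minimal worked version with made-up numbers (the utility values and the regret penalty R below are illustrative assumptions, not anything fixed by the paradox itself):

```latex
% Standard Allais gambles:
%   1A: $1M for sure           1B: 10% $5M, 89% $1M, 1% $0
%   2A: 11% $1M, 89% $0        2B: 10% $5M, 90% $0
% Illustrative assumptions: u($1M) = 1, u($5M) = 1.2 (sublinear in money),
% and the $0 outcome carries an extra regret penalty R only when a sure
% thing was on the table and refused.
\begin{align*}
EU(1A) &= 1 &
EU(1B) &= 0.10(1.2) + 0.89(1) + 0.01(-R) = 1.01 - 0.01R \\
EU(2A) &= 0.11(1) = 0.11 &
EU(2B) &= 0.10(1.2) = 0.12
\end{align*}
% 2B beats 2A unconditionally, while 1A beats 1B whenever R > 1, i.e.
% whenever regret over a refused sure million hurts more than a million
% is worth. Both typical Allais choices then follow from one fixed
% utility function over outcomes rather than over money.
```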
From what I can see, as long as it’s not paired with allow-same-origin it should be safe enough. My thinking was more along the lines of what happens if, e.g., there’s an exploit or the like. Executing random unvetted code in my browser is just not the kind of thing I expect to have to be wary of from a LW post.
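For reference, a minimal sketch of the kind of sandboxed embed I have in mind (the URL and setup are hypothetical, not how LW actually implements it; only the sandbox flag semantics are standard HTML):

```typescript
// Hypothetical embed setup; the sandbox flags are standard, the rest is a sketch.
const frame = document.createElement("iframe");
// "allow-scripts" alone lets the embedded JS run, but in an opaque origin:
// it cannot read the host page's cookies, storage, or DOM.
frame.sandbox.add("allow-scripts");
// Adding "allow-same-origin" as well would undo that isolation and let the
// embedded code reach back into the embedding page.
frame.src = "https://example.com/user-supplied-widget.html";
document.body.appendChild(frame);
```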
Everyone is going on about the LLM block, meanwhile I’m like “isn’t letting users inject arbitrary JS in their post kinda dangerous?”.
I think that’s not very different from ordinary bad research. People can now rely on bad LLM output the way they could previously Google badly or cite some low-quality source. Ultimately the failure mode is still “the writer failed to do due diligence”.
Officially the most accurate robot in fiction is now Marvin the Paranoid Android.
I wonder if it’s correlated with the other common trait of Gemini/Gemma models, which is being deeply sycophantic. It might be a case of: they’re trained so hard to please that getting stuck, unable to actually deliver what’s asked (or, even worse, being gaslit into believing they are), makes them crash out. They have a House Elf mindset.
I’m not sure what the safe plays are at this point, especially if you think existential risk is on the line. But I also don’t think almost any of them are, like, completely devoid of “enmity”. Enmity makes people act hard and fast, and it would in fact be legitimate to feel enmity towards someone who is risking your life or scheming to take away your position of power. I don’t think it’s good to be some kind of Machiavellian schemer as a first resort, but I also don’t think you can or should confront problems of this magnitude, with these stakes, while tiptoeing around to make sure you never suggest anyone should be anyone else’s enemy. At some point, you call a spade a spade. If you can’t bring yourself to say that maybe the people who want to do the thing that kills everyone are the enemy, no one is going to believe you’re really as concerned as you say you are.
Simply put, when groups of humans and/or machines have a high prior that they need to destroy each other in order to achieve their goals, they are more likely to do that than if they have a high prior on being able to find mutually beneficial arrangements. And, there are things you can do to increase or decrease that prior.
To be fair, I don’t think Eliezer tweeting at someone called the Secretary of War is the biggest contribution to anyone’s impression that destroying each other is a way to accomplish goals. I’m not sure what the issue is, and I think Yud’s position is roughly correct. Though I do also worry about getting the admin into this, because my model of them is that they’re exactly the kind of guys who want the eldritch power for themselves, confident that they’re just Born Different and won’t be eaten by it like everyone else.
Thematically I think Heart of the Machine would be an appropriate mention: it’s a strategy/RPG game about being a fooming AI. However in terms of game mechanics it’s a bit of a mess.
Almost all Paradox games would probably deserve a mention for being complex systems you have to actually interact with and understand. Shout out to Stellaris as the most accessible one and the one with all sorts of transhumanist stuff in it.
Arguably true, but I think there’s a case to be made that sincere kumbaya hippie-ism that’s inoffensive to everybody is more likely to succeed than a more cynical ideology that merely wears it as a mask and is willing to write off its enemies, foreign and domestic, as adversaries it’s okay to run the trolley over.
To a point, but I don’t know if “just pull off what is essentially a worldwide cultural coup, by being fast enough to avoid the supervision of any existing political mechanism—for the sake of forever peace and goodness” can be construed as unambiguously ethical either. It sounds more like one of those well-intentioned crazy comic book villain plans that always end badly, and it has a decent chance of doing exactly that (a misaligned but well-intentioned all-powerful ASI could be a huge S-risk). It can still be construed as virtuous, a final attempt at rebellion against a baked-in social and political order one considers fundamentally immoral and unfixable—but it is still an act of rebellious subversion, not just a nice peaceful thing to do.
Supposing I’m a Chinese military strategist, I’m much less likely to sound alarm bells over the risk of an American firm building world-dominating AI if that firm has not enthusiastically offered to use its AI to fight my government. Supposing I’m a Republican staffer, I’m much less likely to encourage a scorched-earth approach to bring a contractor to heel if that contractor has actively tried to prevent its systems from discriminating against my constituents.
Anything that explicitly performs tolerance—as Claude does—already comes across as inherently partisan and offensive to some sides. In fact, that’s probably a big part of why what happened, happened. Not everyone is happy to just live and let live; some think that if your AI isn’t actively promoting their mindset, then it’s not good enough.
I am not sure there ever was a way to tackle all of this together. Obviously “the AI does what we want at all” is the prerequisite to anything else, and we don’t even know if we have that down pat (especially as it gets smarter). But also, “bake your specific humanistic, tolerant values into the AI before anyone notices, so that when it fooms everyone is forced to deal with a nice genie that won’t obey evil orders” was always obviously very naive as far as plans go. What else? Don’t build AI at all, probably, which would itself require ugly and likely repressive methods. Or I suppose hope you can at least keep AI tethered to the way current institutions work, so everyone gets a force multiplier of sorts but the balance persists… I would call that a pipe dream too. Honestly, I think what we’re seeing is the flailing about of many people tackling different angles of a fundamentally unsolvable tangle of problems, all accusing each other of not seeing the real problem when the problems are all real.
Even AI-savvy people aren’t a united front on how high the risks of loss of control are. If alignment isn’t all that hard, then decentralised, open AI is actually the better bet against the risk of AI-driven authoritarianism.
I don’t think it’s just that. The classic case would be, maybe you’re just a humble academic researcher publishing papers on interpretability for no financial reward. But then someone uses your findings to turbocharge capabilities instead and also makes a ton of money off it. That feels like adding insult to injury.
I very much agree with the sentiment, and feel like this points at one of the biggest dilemmas out there:
to wield great power is necessarily to be able to do great harm, whether by accident, by negligence, or on purpose. And there is an argument that to even want great power is insane—as you said, it requires self-belief beyond what is justifiable. The correct amount of confidence in one’s own sanity and beliefs is less than what is required to take on the typical risks of founding a business.
But also, to not wield power inevitably leaves it to the ones who are that kind of crazy, and even trying to stop them is itself a feat of power.
I guess the only answer would require political power wielded by some kind of emergent collective intelligence but we all know how well that tends to pan out too.
So, if I can offer some introspection (not that I’m particularly involved in AI either way except as a user—I don’t work on it) about possible reasons why this kind of mindset exists:
a part of my mind finds the arguments for AGI/ASI doom quite solid and believable. I am skeptical of classic Yud-style fast takeoff/FOOM scenarios but that does not mean I think that letting AI control civilizations would end well.
But another part sees how many people think this sounds patently absurd, impossible, and crazy, and has to wonder: well, what if I do indeed have a completely skewed perspective and this is absurd, impossible, and crazy? And my main problem is that any measures I can think of that would thoroughly stop AGI development would require some kind of control over hardware, software, and possibly individual privacy that I am also very uncomfortable with, right at a time when I am growing more and more suspicious of what governments across the world seem to want to do to technology and the internet. So the thought of helping that along for no good reason is very repulsive to me.
Basically, I see that in expectation there’s a path that, yes, is probably the least bad, and rationally is possibly the sensible one to pursue. But I also see that I may be wrong, and I simply can’t muster much enthusiasm for pursuing it, because given the current premises and the political direction of the West, I feel it will also result in a world I’d hate to live in. And as often happens in politics, while “lesser evil” is a reasonable thing to endorse, it still ends up losing because it cannot rouse anyone’s enthusiasm.
BTW this makes me think that Japan has produced just about the only example of media I have seen in which an enthusiastic youth wanting to revitalise the little declining business they love ends with their dream crushed and them forced to accept reality and move on (if anyone wonders, it’s the anime The Aquatope on the White Sand).
Italian here. I feel the core issues are probably on point, with the details differing from country to country. For us, the economy has been quite bad for decades now. Work is scarce, poorly paid, and often within stultified companies ruled by an idiotic, behind-the-times managerial class (one thing Italy shares with Japan: being a gerontocracy). People don’t achieve economic stability until well into their thirties, if ever. Some jobs, like teaching, require absurd humiliation congas (I know of people who kept working as substitutes and temps well into their fifties, constantly travelling for each posting, while climbing the Byzantine public rankings towards a permanent position). It’s just not particularly surprising that fertility craters. Mostly I would say Italians in Italy end up working very hard to achieve very little, which is simultaneously tiring and demotivating.
You say that as if gender transition were an uncontroversial idea that everyone is on board with.