pearls before swine >:[
mike_hawke
What are your thoughts about this objection to evals? Have you already addressed it somewhere?
if we can squash every scary AI that is not quite smart enough to do a treacherous turn, and we don’t structurally eliminate treacherous turns, then the first deployed AI that causes major damage will do it via a treacherous turn. We have no warning shots.
Less importantly, why are these things in quotation marks?:
[...]doing things like “incorporate a company” or “exploit arbitrages in stock prices” or “design and synthesize DNA” without needing any human assistance or oversight.
A prairie is qualitatively different than a billiard table or an asteroid belt: If you tried to use basic kinematics and free body diagrams to describe a prairie ecosystem, you would find that most of the interesting action was left unexplained. To handwave away air resistance and viscosity is to handwave away all the birds. To handwave away friction is to handwave away basically every other mobile life form. And I think it only gets worse if you move from a prairie to a rainforest—floating spores, flying snakes, geckos, soft but breakable eggs, all manner of sticky appendages, etc.
Simple dynamics don’t even get you a decent first approximation of these systems unless you zoom way out and take very coarse averages. (“The biomass generally stays within roughly 10m of ground level, because of gravity.” “These tightly coupled populations of predators and prey roughly trace out this orbit in phase space every X time interval.”) (But I’m interested in counterexamples if you have them.)
...
Anyway, this feels related to the fact that we can’t develop good models for human interactions, either descriptive or prescriptive. When I try to do virtue ethics, I find that all my virtues turn to Swiss cheese after a day’s worth of exception handling. When I try to take actions based on first principles of game theory, I end up feeling like a maladjusted sociopath. When I try to incorporate the good parts of economic/evopsych cynicism into my view of human affairs, I end up with more questions than answers.
…the question does sometimes haunt me, as to whether in the alternative Everett branches of Earth, we could identify a distinct cluster of “successful” Earths, and we’re not in it.
— This Failing Earth, Eliezer Yudkowsky
Does anyone else wonder similar things about the EA/rationality scene? If we could scan across Tegmark III, would we see large clusters of nearby Earths that have rationality & EA communities that embarrass us and lay bare our own low standards?
I wonder if this post would have gotten a better reception if the stooge had been a Scientologist or a conspiracy theorist or something, instead of just a hapless normie.
I assume that the whole flat earth thing will lose its contrarian luster and fall out of style in the next few years. But suppose that’s wrong. How soon until there are significant numbers of flat-earther kids enrolling in kindergarten? Will they be like existing fringe religious minorities? Will they mostly be homeschooled? My real best guess is that flat-earthers don’t have kids so this won’t happen.
Some smart, scrupulous, rational news junkie should write a periodical report on the state of anti-epistemology. I sort of worry that memeplexes, including anti-epistemic ones, have tipping points whereat they become popular (or dominant) very suddenly.
I followed a link to an article about how Facebook was used to facilitate a genocide in Myanmar. I got a few paragraphs into it and then thought, “Wait, the New York Times is telling me a scandalous but murky story about Big Tech and world events...and I’m just condensing that as ‘known facts of public record.’ Isn’t this Gell-Mann amnesia?”
So then I felt myself searching for reasons why the NYT could be trusted more about this kind of thing, but found it difficult to come up with a single specific reason that I actually believed. So then I supposed that it was worth reading anyway, since the basic facts were important, and I wasn’t at that much risk from whatever biased framing the NYT might take. But I realized that I didn’t really believe that either—I imagined the future in which I turn out to have been utterly misled by the article, and that hypothetical future felt entirely plausible.
So I didn’t read it.
It was an effortful and unrewarding decision, but I endorse it, and I’m hopeful that it will be easier next time. For news stories of this sort, I expect to fall short of my own epistemic standards unless I check 3 or 4 diverse sources. But I didn’t want to do an hour of responsible research, I wanted to spend a leisurely 10 minutes on a single, highly consumable, authoritatively-voiced article and then enjoy the feeling of being informed.
Uh oh, do you really leave the news playing in your living room all the time? Don’t you know it’s corrosive to your epistemics and agency? Plane crashes are overrated and chronic stress is underrated!
This is pretty much my default attitude, but...SSC once wrote that smoking possibly mitigates schizophrenia, and that “[t]his should be a warning to anyone who’s too quick to tell patients that their coping strategies are maladaptive.”
News does have those downsides, just like smoking does cause cancer. But it’s good to remember that load-bearing bugs are the rule, not the exception.
What good thing happens if you read The Sequences?
You see repeated examples of rigorous thought about slippery topics, very deliberately setting up the seductive cached answers and then swerving away from them.
Exposure to a lot of carefully applied Transhumanism. Mostly in Fun Theory but also sprinkled throughout. The transhumanism is sincere and often emotionally charged, not just smug philosophical gotchas.
The concepts & jargon are really useful. Yeah, jargon has its downsides, no doubt, but it is still overwhelmingly net positive.
A thorough argument that you really can live a life that integrates philosophical curiosity, narrative satisfaction, frolicking artistry, vigilant truth-orientation, deep emotion, scientific rigor, and a childlike hope for The Good.
Deep and unforced optimism. Cynicism about cynicism[1]. Generalized anti-nihilism.
Friendly AI is permitted by the laws of physics. This is sufficient reason to try our best, even if it turns out to be too difficult for tiny mortals like us.
We’re a thousand shards of desire lashed together by evolved kludges. Human-compatible morals & æsthetics exist only in humans, and are not objectively special. So what are you gonna do about it?
Unicorns aren’t real, there is no god, and no pixies in the garden. But you know what is real? Giant squid. Electric eels. Radiotrophic fungus. Black holes. Volcanoes. Aircraft. Flamethrowers. SCUBA diving. Lightning, rainbows, aurorae. If you really think you could enjoy unicorns and levitation spells, there is no reason why you shouldn’t also be able to take joy in the merely real.
[1] Cynical About Cynicism isn’t in the Sequences, but the same general attitude still comes up.
When I’m walking around through my daily life, it helps me to think of myself as a character in a cyberpunk weirdtopia.
Phone anxiety ruining my nature walk? Yeah that’s cyberpunk, even if it wasn’t anticipated by Neuromancer.
Strolling over to the donut shop for a nice pastry...amid a bungled global health crisis of disputed origin? Yup, that sure counts.
Detouring down a beautifully verdant neighborhood, past a consecution of strident culture war yard signs, presumably influenced in some part by foreign psyops like the IRA? Definitely cyberpunk.
Scratching my head over the risks of cryptocurrency hodling vs the risks of pandemic-driven inflation? Cyberpunk af.
Hiro Protagonist, the protagonist of Snow Crash, wouldn’t complain about these things; he would go on a sassy, sciencey, poetic monologue about it and appreciate it all for what it was.
I like this post; words are important.
I certainly want something that means Tolkningsföreträde, that sounds quite useful.
Maybe also Föreställningsvärld—it sounds like it isn’t quite interchangeable with “worldview”, and I find that “world model” sounds too technical.
I love “microdictator” and I’m going to try to spread it.
I’m not so sure about the rest. It seems like “caricature” and “mission creep” might be fine.
[Linkpost] 7 Swedish Words to Import
Often I have witnessed people encountering new information, apparently accepting it, and then carefully explaining why they are going to do exactly the same thing they planned to do previously, but with a different justification. The point of thinking is to shape our plans; if you’re going to keep the same plans anyway, why bother going to all that work to justify it? When you encounter new information, the hard part is to update, to react, rather than just letting the information disappear down a black hole.
In some contexts, this is exactly right. It is right and proper to see major, real-time belief updates in the climax of a rational fic. And one hopes that executives in a high-stakes meeting will be properly incentivized to do the same. But in many ordinary cases, the most extreme concession one should hope to hear is, “okay, you’ve given me something to think about,” followed by a change of subject. (If this seems unambitious, consider how rarely people make even such small concessions.)
I think it’s important to mind the costs—both psychological and social—of abruptly changing one’s plans or attitudes. “Why bother going to all that work to justify [staying the course]?” Indeed, I wish it were more normal for people to say, “well, that’s a good point but it’s probably not worth the switching costs” or even just, “I don’t feel like thinking that hard about it.”
… yesterday, you said Q, and Q implies not P. So you were wrong yesterday or today. So you’re wrong.
I sort of want to try developing (the valid version of) this into a deliberate skill. I think that of all the mundane forms of hypocrisy, one of the most vexing might be inconsistency at the 24h+ timescale. It’s just hard to say, “hey, you’re trying to have it both ways!” if the violation in question is spread out over multiple days. So naturally, everyone does it all the time.
“Top Forecasting Team Says World Population in 2050 Will be Only Six Thousand!” there’s a good chance that they will just write “Top Forecasting Team Says World Population will massively decrease in the middle of the century”.
Ok that’s probably true. This idea was meant mostly as a joke, but still...I can’t help but wonder if there might be some cool Straussian tactic to push a tiny signal through the Great Distorter.
Yeah sounds right. Post edited.
Some ideas for interacting with reporters
Five Missing Moods
Once again, I do declare: the world could really use at least 10 more John Nersts.
Important but frustrating rationalist skill: getting halfway through a comment and then deleting it because you realized it was wrong