I’m bumping into walls but hey now I know what the maze looks like.
Neil
I’m French. Pétard is a very minor swear word, on par with “great Scott!”
It’s not meant as an insult at all. The most common French swear word is probably “putain” (used like “fuck” is), and pétard is used as an attenuated version (like saying “fudge”).
(As a Frenchman, I also admit to the existence of a writhing snake inside my gut telling me to downvote this heretical post which dares! compare French cuisine with German cuisine. Luckily, I have learned enough rationality to override my primal instincts.)
Very insightful post. Here are personal thoughts with low epistemic status and high rambling potential:
These all feel to me like corollaries to the belief “AGI is so important that I can’t gauge the value of anything else except in regards to how it affects AGI”. Hence: “everything else is meaningless because AGI will change everything soon” or “nobody around me is looking up at the meteor about to hit us and that makes me feel kind of insane. (*Cough* so I hang out with rationalists, whose entire shtick is learning how not to be insane)”.
As for other non-obvious effects: I personally perceive some fragility around the whole field. There are arguments on this site for why AGI alignment should not be discussed in politics, or why attempting to convince OpenAI or DeepMind employees to switch jobs can easily backfire (e.g. this post for caution advice). These make any outreach at all seem risky. There are also people I know wondering whether they should attempt to do anything at all related to alignment, because they perceive themselves as probable dead weights.

The relatively short timelines, the sheer scope, and the aura of impossibility around alignment seem to make people more cautious than they should be. Obviously the whole point of the field is to be cautious; but while it’s true that the tried-and-tested scientific method isn’t safe for AGI in general, I’m not sure stressing the rationalist-tools, solve-problems-before-you-experiment approach is healthy everywhere. So, caution is right there in the description of the field, but you have to contain it well so that it doesn’t infect places where trial-and-error would serve you better. I am probably quite wrong about this, but I don’t see many people talking about it, so if there’s any reasonable doubt we should figure it out.
Alignment work should probably be perceived as less fragile. Unlike the AI field in general, alignment projects specifically don’t pose much of a risk to the world. So we can probably afford to be looser here than elsewhere. Right now, in my experience, alignment feels like a pack of delicate butterflies flying together, with every flap of wings sending dozens of comrades spiraling out of the sky, which might or might not set off a domino/Rube Goldberg machine that blows up the world.
Bonus song in I have been a good Bing: “Claude’s Anguish”, a 3-minute death-metal song whose lyrics were written by Claude when prompted with “how does the AI feel?”: https://app.suno.ai/song/40fb1218-18fa-434a-a708-1ce1e2051bc2/ (not for the faint of heart)
Interesting, thanks for posting that! One of the reasons I like this forum is because there are people running around on here who’ve read papers like “Salivary Digestion Extends the Range of Sugar-Aversions in the German Cockroach” and you get to talk to them for free.
So if I understand the abstract and the paper I’ve skimmed so far: we’re seeing more saliva-based aversion to pure glucose because pure glucose is a superstimulus (the roaches still accept “complex glucose”), and human trap designs are fond of superstimuli as cheap ways to radically increase the probability a trap works, so the traps are selecting for pure-glucose aversion. Given how short insect reproduction cycles are and how many insects there are anyway, we’ll probably observe this kind of evolution everywhere, every time we switch traps.
Hello! To support your point, I think entomology is particularly fascinating as a window into how evolution works because of how many niches there are in the micro world. “Amazon rainforest” is more or less a single biome for macroscopic humans. But for insects, a whole new set of dimensions is navigable, and there are substantial differences between, say, tree X, pond Y, canopy W, or river V. When you’re small, things aren’t just bigger; there’s also a lot more variety, because what looks like a subtle difference to us (like whether a tree is wet or not) is a huge difference for small buggers (water is sticky at small scales, and can drown insects).
This is to say that there aren’t just two dimensions in the small world; some spiders operate on one dimension, other spiders operate on another dimension, and of course 12,000 different species of ants all have their own way of integrating whatever niche dimensions they operate in.
You can continue your way downward, by the way: the world of unicellular organisms is incredibly dense and varied, and there are way over a million clearly identifiable bacterial species alone.
So when you describe the nature of traps, there are probably thousands of effective space translation constructions, beyond just 2D and 3D.
I mention the Amazon specifically because that part of nature is a hellish death zone, where insects genocide each other every other Tuesday while odd inventions like door-shaped ants, zombie-ant fungus, and worryingly intelligent trial-and-error-capable spiders like Portia (technically not native to the Amazon) pop up all the time before getting out-tactic-ed by some other horridly violent species. There is a lot of small-button-pressing going on here, and new tactics that let you explore a new dimension are so effective that “one fell swoop” strategies are common.
It’s like in Worm where every time the protagonists are up against a person with a new power, they nearly die, because the effect of surprise is just that powerful. When you live in a world with superpowered individuals, capability distribution between humans is uneven; the same can be said for the level of variance between species of ants, for example. In both cases, individuals have access to space translations very different from those of their enemies, which is why surprise and the inability to adapt are common observations.
More French stories: So, at some point, the French decided what kind of political climate they wanted. What actions would reflect well on their cause? Dumping manure onto the city center using tractors? Sure! Lining up a hundred stationary taxi cabs in every main artery of the city? You bet! What about burning down the city hall’s door, a work of art older than the United States? Mais évidemment! (But of course!)
“Politics” evokes all that in the mind of your average Frenchman. No, not sensible strategies that get your goals done, but the first shiny thing the protesters thought of. It’d be more entertaining to me, except that I had to skip class at some point because I accidentally biked headfirst into a burgeoning cloud of tear gas (which the cops had detonated in an attempt to ward off the tractors). There are flagpoles in front of the government building those tractors dumped the manure on. They weren’t entirely cleaned, and you can still see the manure line, about 10 meters up.
FHI at Oxford
by Nick Bostrom (recently turned into song):

the big creaky wheel
a thousand years to turn

thousand meetings, thousand emails, thousand rules
to keep things from changing
and heaven forbid
the setting of a precedent

yet in this magisterial inefficiency
there are spaces and hiding places
for fragile weeds to bloom
and maybe bear some singular fruit

like the FHI, a misfit prodigy
daytime a tweedy don
at dark a superhero
flying off into the night
cape a-fluttering
to intercept villains and stop catastrophes

and why not base it here?
our spandex costumes
blend in with the scholarly gowns
our unusual proclivities
are shielded from ridicule
where mortar boards are still in vogue
I’m glad “thought that faster” is the slowest song of the album. Also where’s the “Eliezer Yudkowsky” in the “ft. Eliezer Yudkowsky”? I didn’t click on it just to see Eliezer’s writing turned into song, I came to see Eliezer sing. Missed opportunity.
I have completed the survey! Woohoo!
I’m not convinced. I felt the training video was incomplete, and the deadline too short.
All the obvious alternate routes to participating in the alignment problem seem to have been mentioned here. Are there any more I should write down? I’m aware this is a flawed post and would like to make it more complete as time goes on.
Too obvious imo, though I didn’t downvote. This also might not be an actual rationalist failure mode; in my experience at least, rationalists have about the same intuition as all other humans about when something should be taken literally or not.
As for why the comment section has gone berserk, no idea, but it’s hilarious and we can all use some fun.
Alright I’ll try this methodology:
I often tell myself that Robutil’s reign is temporary. That there’s a monkey inside of me and I must listen to Humo to keep it in check, but only because I’m following Robutil’s plan (which is not ideal in the long run). The local-scope and often misleading altruistic instincts that Humo/the monkey offers are suboptimal when Earth is under siege and you have to put a price on human life. Our evolution-derived altruism is often, as in the case of scope insensitivity, plain wrong. But I still have trouble taking genuine solace in Robutil’s world, even if he is right. He seems like a temporary ally I must begrudgingly follow until we can live in a world in which Humo’s altruism aligns perfectly with reality. (Which has never happened before, given that altruistic genes were technically optimizing for reproduction, not altruism, and it showed.) Robutil seems to me like a scaling tool in Humo’s toolset, not the reverse.
Anyhow thanks for posting this, the monkey appreciates anthropomorphization.
Yep. The worst thing I’ve seen romanticized in my milieu though is poor mental health: for some reason it’s quite “cool” to say you’re depressed all the time, and while I know some of my friends are actually genuine, I’m not comfortable with the social pull toward the depression aesthetic. It’s so weird.
The law of headlines is “any headline ending with a question mark can be answered with a no” (because “NATION AT WAR” will sell more copies than “WILL NATION GO TO WAR?” and newspapers follow incentives.) The video here is called “will superintelligent AI end the world?” and knowing Eliezer he would have probably preferred “superintelligent AI will kill us all”. I don’t know who decides.
See also Alicorn’s Expressive Vocabulary.
I’d be worried that “identifying as having a flaw” can begin as something merely aesthetic at first and end up deeply warping your lens of the world in the end. If “is known as the pessimist” or “is known as the optimist” becomes something you start wanting to live up to, I feel like you could do real damage to your empirical rationality. I guard myself against “aesthetic flaws” personally because they don’t seem to be worth the amount of course-correcting I’d have to engage in just to keep an accurate-enough view of the world. I added a footnote though, thanks for the feedback.
Thanks! I’ve personally been using this to force myself out of procrastination! I just assign a high probability Y to “finishing work X by the end of the day” and then, to preserve my precious Brier score, I end up… doing it, most of the time. “The best way to predict the future is to shape it” actually applies very well to to-do lists.
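To make the incentive concrete, here is a minimal sketch of the Brier score (mean squared error between stated probabilities and 0/1 outcomes, lower is better). The function name and the 0.9 forecast are my own illustration, not from the original comment: once you’ve publicly predicted 0.9 that you’ll finish, slacking off costs you far more score than following through.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes.

    Lower is better; a perfect forecaster scores 0.
    """
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)

# Predict 0.9 that "I finish work X by the end of the day":
finished = brier_score([0.9], [1])  # (0.9 - 1)^2 = 0.01
slacked = brier_score([0.9], [0])   # (0.9 - 0)^2 = 0.81
```

The asymmetry (0.01 vs 0.81) is the whole trick: the confident prediction turns procrastination into a measurable hit to your calibration record.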
Poetry and practicality
I was staring up at the moon a few days ago and thought about how deeply I loved my family, and wished to one day start my own (I’m just over 18 now). It was a nice moment.
Then, I whipped out my laptop and felt constrained to get back to work; i.e. read papers for my AI governance course, write up LW posts, and trade emails with EA France. (These I believe to be my best shots at increasing everyone’s odds of survival).
It felt almost like sacrilege to wrench myself away from the moon and my wonder. Like I was ruining a moment of poetry and stillwatered peace by slamming against reality and its mundane things again.
But… The reason I wrenched myself away is directly downstream from the spirit that animated me in the first place. Whether I feel the poetry now that I felt then is irrelevant: it’s still there, and its value and truth persist. Pulling away from the moon was evidence I cared about my musings enough to act on them.
The poetic is not a separate magisterium from the practical; rather, the practical is a particular facet of the poetic. Feeling “something to protect” in my bones naturally extends to acting it out. In other words, poetry doesn’t just stop. Feel no guilt in pulling away, because you’re not really pulling away.
A functionality I’d like to see on LessWrong: the ability to give quick feedback for a post in the same way you can react to comments (click for image). When you strong-upvote or strong-downvote a post, a little popup menu appears offering you some basic feedback options. The feedback is private and can only be seen by the author.
I’ve often found myself drowning in downvotes or upvotes without knowing why. Karma is a one-dimensional measure, and writing public comments is a trivial inconvenience: this is an attempt at a middle ground, and I expect it to make post reception clearer.
See below my crude diagrams.