Ooh very interesting. I fall for throat-clearing a lot myself. I do think it’s kinda hard to avoid context and overview, though, and intros can be nice stylistically? But maybe this is just me making excuses for my bad habits...
Some conjectures:
General cultural ossification: caused by the widespread availability of recording devices and information technology.
Cultural splintering: things change so fast nowadays that there simply isn’t time for new symbols to catch on before they’re replaced by even newer symbols (and culture is also so fragmented that there isn’t really a single power center that can take over).
Standardization: now that symbols tend to be standard across international communities, changing things might require a LOT of coordination.
Low-hanging fruit has already been plucked: potential improvements (if they exist at all) might be minor and not worth overhauling already-standardized systems for.
Symbols are still developing: how long has the like button been around? Or the karma system? Maybe we’ll keep innovating our symbology as new needs arise.
I wish he’d given some common failure modes and ways to fix them. Like I completely agree with the main point, but without concrete examples, I have a hard time applying this advice to my own writing except for “try harder dingus” which is often unhelpful.
Gotcha. Is there a strong reason to assume that we’ll succeed at creating AIs that can be pointed at a single target? I read this post and comment a while back and would love your thoughts.
I’m a little surprised by the amount of disagree reacts, given that no one has replied.
I keep running into conceptual confusion around the term “alignment,” particularly when reading older Less Wrong posts. Some people say “aligned AI” and mean “an AI that works for human flourishing,” some people say that an AI “is aligned” if it reliably advances the intended objectives of some person or group (and doesn’t have some secret set of goals / isn’t scheming), and yet other people use “alignment” to mean something along the lines of “the ability of any system to reliably work towards some pre-defined goal.” I usually have to work out which is being said on the spot, which is annoying given that the implications of each are very different.
Is there one commonly accepted definition? Is this confusion just a thing we’ve all accepted?
A bit of a necro-comment from me, but I’m reading this about four years later and am very surprised that this is the first time I’m hearing about the concept. I can’t think of anything in the intervening period that has even engaged with this comment, let alone confirmed or disconfirmed it.
For the record, I think this is helpful and will be stealing it for any future advice posts I might write!
Has anyone made a proper post about potential “warning shots” and how we should prepare for them? This post has lived rent-free in my head for the past couple of months and I’m curious to know if anyone else has been thinking about this topic too.
Re character: I think most Americans (including myself) have been so far removed from true corruption that we have forgotten how bad it can possibly get. Even my state of Illinois, which is notable for its historical machine politics and general corruption (4 of our last 11 governors have served time, plus many others like Mike Madigan), has still seen more or less forward progress, because the corruption wasn’t bad enough to completely erode politics in the state.
But it CAN get that bad. We’re seeing this now with the Trump admin. I am generally left-leaning, but at this point I think I’d take an honest Republican over a corrupt Democrat—a position I did not hold previously—because corruption eats policy and utterly erodes the foundation upon which we build fair markets and strong institutions.
Thought in progress: epistemic humility is not a substitute for actual humility (or professed humility). You only get to cry wolf once, but you can probably warn about potential wolves several times—so long as you don’t burn goodwill on an incorrect or overconfident prediction.
I think epistemic humility helps to increase trust and confidence in EA/Less Wrong-type spaces, but I think professed humility is far more helpful when it comes to public-facing AI comms, particularly as scenarios get more intense and specific (e.g. prefacing AI doom predictions with a decent amount of throat-clearing beforehand commensurate with the intensity and specificity of the forecast). For example, I think that AI 2027 might have been better received if the authors had spent less time trying to convince the readers of their credibility at the beginning and spent more time saying something along the lines of “we know this sounds crazy and are well aware of how sci-fi the scenario seems”. (I’m not a huge fan of lampshading in fiction, but IRL, I think you do need to display self-awareness of outlandishness in order to be taken seriously, particularly if what you’re predicting sounds insane to the average person.)
Of course, there are huge diminishing returns on this: the more throat-clearing you do, the less confident you seem. And throat-clearing should probably be saved for public-facing comms, because actual technical work seems to require people who are confident in their beliefs even when they are outlandish (as proven by the outlandish explosion of AI progress recently).
Still, I think that the AI safety community at large has a worse reputation than they deserve, and I think part of that is due to the appearance of overconfidence. This problem seems simple, tractable, and important.
Random future reader from 14 years in the future: seconded. And also, why didn’t they just use i.e.?
I don’t think that the belief that godlike intelligence is necessary for human extinction via AI is a popular AI doomer position among intellectually sophisticated people. It’s more that those people hold complex positions, and it’s easy for skeptics to frame this as “a popular position”.
Hang on, I don’t think I said that godlike intelligence was necessary for human extinction, and actually, didn’t make any claim about human extinction at all. This post was just about the possibility of an intelligence explosion, and I think “AI will reach godlike levels of intelligence” is an accurate description of the AI 2027 position.
You can’t conclude from the fact that inference scaling happened that most AI improvements are due to scaling.
Did you read the cited link that you quoted? Toby Ord’s argument was pretty convincing to me. What do you disagree with?
When it comes to inference, it’s also worth noting that they found a lot of tricks to make inference cheaper. It’s not just more/better hardware.
Right, ending in about late 2024, which is why I specified (~late 2024) in “most recent gains”. It doesn’t seem like that trend has continued.
Re misframing: fair enough. Maybe I should have said “a popular AI doomer position”.
On the other thing: I’m not quite sure what you mean? My thesis in the quoted text was basically what I said: since most AI improvements have come from inference scaling, aka scaling up compute requirements, we can expect that future progress will also come from scaling up compute requirements. Obviously this only holds true until another paradigm shift happens.
Do you think agents will be trained on themselves in a similar fashion to AlphaGo, and do you think that training will reduce compute requirements / provide a performance increase driven by training instead of inference?
I imagine a grounded career would be one in which you can see a direct link between your actions and their results, e.g. careers in medicine. Strangely enough, Earning to Give seems to fit this criterion (provided you give to a grounded charity like GiveDirectly or something). I’m not necessarily endorsing this philosophy though—I think there are a lot of important and impactful careers that have really loose feedback loops and uncertain outputs, like journalism or science communication.
Even if the efforts of any particular person are likely to go to waste, if a large number of people follow the strategy of “do things that have a small chance of being hugely beneficial on net”, this may turn out to be more impactful than if they all tried to maximize their chance of making an individual impact.
As a counterpoint, I’m reminded of this post from the EA Forum:
The best cause will disappoint you: An intro to the optimiser’s curse
in which the author argues against naively trying to maximize for expected value and argues for choosing grounded causes over speculative causes instead. Curious for your thoughts on this.
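To make the counterpoint concrete, here’s a minimal sketch of the optimiser’s curse (toy numbers of my own, not from the linked post): even when every cause is genuinely equally good, picking the one with the highest noisy estimate means the winner’s estimate will systematically overstate its true value.

```python
import random
import statistics

# Toy simulation of the optimiser's curse (hypothetical numbers):
# every cause has the same true value, but our estimates are noisy.
# If we always pick the cause with the highest *estimated* value,
# the winner's estimate is biased upward relative to its true value.

random.seed(0)
N_CAUSES = 20        # candidate causes to choose between
NOISE_SD = 1.0       # standard deviation of estimation error
TRIALS = 10_000

true_value = 10.0    # every cause is genuinely equally good
overestimates = []

for _ in range(TRIALS):
    estimates = [true_value + random.gauss(0, NOISE_SD) for _ in range(N_CAUSES)]
    best_estimate = max(estimates)
    # How much the chosen cause's estimate exceeds its true value:
    overestimates.append(best_estimate - true_value)

print(f"Average overestimate of the chosen cause: {statistics.mean(overestimates):.2f}")
# With 20 candidates and unit-variance noise this comes out to roughly +1.9:
# the cause you pick will, on average, disappoint you.
```

So the cause you select tends to under-deliver relative to your estimate, which (as I read it) is the core of the post’s argument against naive EV maximization.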
Has anyone written an essay about how to fight against/correct for Trapped Priors? I would like to do something like that, but I want to make sure that I’m not reinventing the wheel here. Thank you!
This is a great point. Making the “breakthrough” from that poster’s meditation retreat last is less about preserving a single realization or paradigm shift and more about distilling the 8+ hours of daily meditation into a 5-30 minute daily practice that still confers most of the benefits. Instead, as you point out, people end up chasing the feeling of finding a revelation over actual progress.
Before enlightenment, chop wood, carry water.
After enlightenment, chop wood, carry water.
As I understood it, Paul’s initial role in the story during Dune and Dune: Messiah was one of being coopted by all the great forces playing out around him. It’s a very sneaky framing—Herbert makes him seem like one of the Great Men of History, but as he futilely realizes towards the end of the first Dune novel, his life or death during the final fight in the throne room would not have changed anything. Had he died, he would’ve been the martyr who sacrificed himself to free Arrakis from the Harkonnens, and the Jihad would be carried out in his name. Since he lived, the same thing happened (save for the martyring).
Dune 1 and Dune: Messiah are a deconstruction of the hero and his “agency” in a world that is governed not by individual choices but by sociology. Religion, the Bene Gesserit, Herbert’s views on genetics, the Landsraad and the wars of the Great Houses. All of these factors drove the story into being what it was. The superhuman power of prescience allows one to see possible futures, but there are still only very few things that are possible. Despite Paul’s superhuman capabilities as the pinnacle of the human species, he was still subject to the forces of history. Other timelines involved him maintaining the status quo by dying before becoming Muad’Dib, and some other highly ignoble ways of preventing the Jihad, but none of them gave him the revenge he wanted—and he could not find a way to exact revenge on the Harkonnens without unleashing Holy War on the universe.
And so he did.
By Children of Dune, Herbert changes his mind a little. Paul himself was unable to forge the course of history, because he was still so very human. The peak of the human species was still subject to our limitations. Leto II, on the other hand, is the only member of any truly alien species in the Dune series, because he merged with the worms. Doing so allowed him to break free from the chains that bound the rest of us. At this point, Paul was also retconned into having seen the Golden Path but simply lacking the strength to pursue it (which I was also fine with tbh given the previous statements on Dune 1 and Messiah—Paul’s character trait of lacking the will to undertake truly horrific decisions is not a retcon, but the existence of the Golden Path is). Leto II was willing to unleash the forces of Jihad at his own command, and reshaped the world as God Emperor.
And yet, the God Emperor was not really free either. Just like Paul after the Stone Burner, he saw a vision, assumed it to be the only way, fully gave in to it, and from then on simply walked the path he had laid out for himself. Ironically, his choice to eradicate the very concept of prescience freed every other future Kwisatz Haderach from the burden he and Paul carried: a false total understanding. I can’t remember where this happens, but Herbert likened prescience to collapsing the wavefunction of the future a la the Copenhagen interpretation of Quantum Mechanics, but cautioned that doing so also meant eradicating other possible futures that were as yet unseen (truly Herbert was the GOAT, this shit ruled). Prescience was the very thing Paul warned against:
“And always, he fought the temptation to choose a clear, safe course, warning ‘That path leads ever down into stagnation.’”
But it seems that all the great prescience-users fell victim to the trap. One could argue that Leto II did this on purpose, forcing humans into 3,500 years of stagnation in order to teach them “a lesson their bones would remember.”
I give them enduring eons of enforced tranquility which plods on and on despite their every effort to escape into chaos. Believe me, the memory of Leto’s Peace shall abide with them forever. They will seek their quiet security thereafter only with extreme caution and steadfast preparation.
And yet, Leto’s peace never seemed quite right to me. This comment is long enough, though, so I’ll leave it here.
___________________________________________________________________________________________
Also, I forget what happens after God Emperor. I have barely any recollection what goes on in Heretics and Chapterhouse and have very little desire to read them again. I have never read Brian Herbert’s sequel books and never plan to.
I have a suspicion that p-zombie discourse is only going to get more relevant as LLMs get better. No one really argues that animals aren’t conscious, even though they can’t use words very well, but the release of GPT-3 caused a steady rise in people arguing that AIs are conscious. It’s not clear to me that an LLM couldn’t possibly be conscious, but it does seem that many people are taking LLM eloquence to imply that they are conscious, and I’m pretty sure we’ve been discussing this for years…