Sam Kriss is notably anti-rat so probably not
Harjas
I also wonder how much context plays into this. In normal life, I usually don’t assume that my conversation partner knows what I’m talking about, so I try to give as much context as I think necessary for them to understand. On the internet you can sometimes just write for people who already know what you’re talking about.
For example I’ve noted before that I suspect current levels of AI persuasion/manipulation capabilities have lagged behind AI capabilities overall (this is nonobvious and I don’t defend it here, but my impression from talking to empirical researchers on AI persuasion is that they broadly agree with me).
IMO this is also visible in AI writing. I would imagine persuasion and writing capabilities should go hand in hand, and while AI is getting better at producing stuff people are willing to read, it’s still obviously not at a human-replacement level (and indeed has mostly succeeded in automating formulaic/procedural writing like summaries and lit reviews). Will be interesting to see how this goes going forward.
I have a suspicion that p-zombie discourse is only going to get more relevant as LLMs get better. No one really argues that animals aren’t conscious, even though they can’t use words very well, but the release of GPT-3 caused a steady rise in people arguing that AIs are conscious. It’s not clear to me that an LLM couldn’t possibly be conscious, but it does seem that many people are taking LLM eloquence to imply that they are conscious, and I’m pretty sure we’ve been discussing this for years…
Ooh very interesting. I fall for throat-clearing a lot myself. I do think it’s kinda hard to avoid context and overview, though, and intros can be nice stylistically? But maybe this is just me making excuses for my bad habits...
Some conjectures:
General cultural ossification: caused by widespread recording and information technology devices.
Cultural splintering: things change so fast nowadays that there simply isn’t time for new symbols to catch on before they’re replaced by even newer symbols (and culture is also so fragmented that there isn’t really a single power center that can take over).
Standardization: now that symbols tend to be standard across international communities, changing things might require a LOT of coordination.
Low hanging fruit has already been plucked: potential improvements (if they exist at all) might be minor and not worth overhauling already-standardized systems.
Symbols are still developing: how long has the like button been around? Or the karma system? Maybe we’ll keep innovating our symbology as new needs arise.
I wish he’d given some common failure modes and ways to fix them. Like I completely agree with the main point, but without concrete examples, I have a hard time applying this advice to my own writing except for “try harder dingus” which is often unhelpful.
Gotcha. Is there a strong reason to assume that we’ll succeed at creating AIs that can be pointed at a single target? I read this post and comment a while back and would love your thoughts.
I’m a little surprised by the amount of disagree reacts, given that no one has replied.
I keep running into conceptual confusion around the term “alignment,” particularly when reading older Less Wrong posts. Some people say “aligned AI” and mean “an AI that works for human flourishing,” some people say that an AI “is aligned” if it reliably advances the intended objectives of some person or group (and doesn’t have some secret set of goals / isn’t scheming), and yet other people use “alignment” to mean something along the lines of “the ability of any system to reliably work towards some pre-defined goal.” I usually have to work out which is being said on the spot, which is annoying given that the implications of each are very different.
Is there one commonly accepted definition? Is this confusion just a thing we’ve all accepted?
A bit of a necro-comment from me, but I’m reading this about four years later and am very surprised that this is the first time I’m hearing about the concept. I can’t think of anything in the intervening time period that has either confirmed or disconfirmed this comment, or even engaged with it.
For the record, I think this is helpful and will be stealing it for any future advice posts I might write!
Has anyone made a proper post about potential “warning shots” and how we should prepare for them? This post has lived rent-free in my head for the past couple of months and I’m curious to know if anyone else has been thinking about this topic too.
Re character: I think most Americans (including myself) have been so far removed from true corruption that we have forgotten how bad it can possibly get. Even my state of Illinois, which is notable for its historical machine politics and general corruption (4 of our last 11 governors serving time + many others like Mike Madigan), has still more or less seen forward progress, because the corruption wasn’t bad enough to completely erode politics in the state.
But it CAN get that bad. We’re seeing this now with the Trump admin. I am generally left-leaning, but at this point I think I’d take an honest Republican over a corrupt Democrat—a position I did not hold previously—because corruption eats policy and utterly erodes the foundation upon which we build fair markets and strong institutions.
Thought in progress: epistemic humility is not a substitute for actual humility (or professed humility). You only get to cry wolf once, but you can probably warn about potential wolves several times—so long as you don’t burn goodwill on an incorrect or overconfident prediction.
I think epistemic humility helps to increase trust and confidence in EA/Less Wrong-type spaces, but I think professed humility is far more helpful when it comes to public-facing AI comms, particularly as scenarios get more intense and specific (e.g. prefacing AI doom predictions with a decent amount of throat-clearing, commensurate with the intensity and specificity of the forecast). For example, I think that AI 2027 might have been better received if the authors had spent less time trying to convince the readers of their credibility at the beginning and spent more time saying something along the lines of “we know this sounds crazy and are well aware of how sci-fi the scenario seems”. (I’m not a huge fan of lampshading in fiction, but IRL, I think you do need to display self-awareness of outlandishness in order to be taken seriously, particularly if what you’re predicting sounds insane to the average person.)
Of course, there are huge diminishing returns on this: the more throat-clearing you do, the less confident you seem. And throat-clearing should probably be saved for public-facing comms, because actual technical work seems to require people who are confident in their beliefs even when they are outlandish (as proven by the outlandish explosion of AI progress recently).
Still, I think that the AI safety community at large has a worse reputation than it deserves, and I think part of that is due to the appearance of overconfidence. This problem seems simple, tractable, and important.
Random future reader from 14 years in the future: seconded. And also, why didn’t they just use i.e.?
I don’t think that the belief that godlike intelligence is necessary for human extinction via AI is a popular AI doomer position among people who are intellectually sophisticated. It’s more like those people hold complex positions and it’s easy for people who are skeptics to frame this as “a popular position”.
Hang on, I don’t think I said that godlike intelligence was necessary for human extinction, and actually, I didn’t make any claim about human extinction at all. This post was just about the possibility of an intelligence explosion, and I think “AI will reach godlike levels of intelligence” is an accurate description of the AI 2027 position.
You can’t conclude from the fact that inference scaling happened that most AI improvements are due to scaling.
Did you read the cited link that you quoted? Toby Ord’s argument was pretty convincing to me. What do you disagree with?
When it comes to inference it’s also worth noting that they found a lot of tricks to make inference cheaper. It’s not just more/better hardware.
Right, ending in about late 2024, which is why I specified (~late 2024) in “most recent gains”. It doesn’t seem like that trend has continued.
Re misframing: fair enough. Maybe I should have said “a popular AI doomer position”.
On the other thing: I’m not quite sure what you mean? My thesis in the quoted text was basically what I said: since most AI improvements have come from inference scaling, aka scaling up compute requirements, we can expect that future progress will also come from scaling up compute requirements. Obviously this only holds true until another paradigm shift happens.
Do you think agents will be trained on themselves in a similar fashion to AlphaGo, and do you think that training will reduce compute requirements / provide a performance increase driven by training instead of inference?
I mean I agree too, I just don’t think he’d ever become a card-carrying rat (which tbf I am also not myself)