This isn’t even necessarily a risk thing, in the way that would be analogous to the claim. If the reward is small, it also raises the question of friction costs. Taking the prize now has no ongoing cost; taking it at a later date has a sizable upfront cost and a small ongoing cost.
How confident are we that hyperbolic time discounting is even real? I think you can explain these results with zero time discounting.
Normal Person: hey I have some money I don’t need right now
Completely Legit Businessperson #1: I advise you to invest that. You can invest it in A for 5% annual returns, or if you are willing to have just slightly less liquidity, in B for 10% annual returns.
Normal Person: I guess B.
Completely Legit Businessperson #2: Hey, do I have some investment opportunities for you?
Normal Person: Yes?
Completely Legit Businessperson #2: And so you know you can trust me, the first $100 in the account is free!
Normal Person: Cool.
Completely Legit Businessperson #2: These accounts have an amazingly high return. In just one week our AI trading strategy will double—
Normal Person: Yeah no thanks I’ll take the $100.
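A minimal sketch of the zero-discounting explanation, with made-up numbers rather than anything from the dialogue: a fixed probability that a promise is never honored is enough to flip the choice, without any time preference at all.

```python
# Sketch with made-up numbers: zero time discounting, but promised payouts
# only arrive if the counterparty honors them.

def expected_value(amount, p_honored):
    """Expected value of a promised payout, with no time discounting at all."""
    return amount * p_honored

# Businessperson #1: trusted, boring returns on $1000.
option_a = expected_value(1000 * 1.05, p_honored=0.99)   # ~1040
option_b = expected_value(1000 * 1.10, p_honored=0.99)   # ~1089
print("pick B" if option_b > option_a else "pick A")      # pick B

# Businessperson #2: $100 now for sure, vs. a "doubled" $200 next week
# from a scheme you give little credence to paying out.
take_now = 100
wait_a_week = expected_value(200, p_honored=0.3)          # 60
print("take the $100" if take_now > wait_a_week else "wait")  # take the $100
```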
I edited out the word ‘significantly’, which in retrospect was misleading.
I’d prefer not to repeat what I’ve heard. In case I’m making this sound more mysterious than it is, I will note that you’re not missing out on any juicy gossip. Nothing I heard in passing would be material to much.
To be clear about my position, and to disagree with Lemoine, not passing a Turing test doesn’t mean you aren’t intelligent (or aren’t sentient, or a moral patient). It only holds in the forward direction: passing a Turing Test is strong evidence that you are intelligent (and contain sentient pieces, and moral patients).
I think it’s completely reasonable to take moral patienthood in LLMs seriously, though I suggest not assuming that entails a symmetric set of rights—LLMs are certainly not animals.
potentially implying that actual humans were getting a score of 27% “human” against GPT-4.5?!?!
Yes, but note that ELIZA had a reasonable score in the same data. Unless you’re prepared to believe that a human couldn’t reliably distinguish ELIZA from a human, all this says is that either 5 minutes was simply not enough time to talk to the two contestants, or the test was otherwise invalid somehow.
...
...ok I just rabbitholed on data analysis. Humans start to win against the best tested GPT if they get 7-8 replies. The best GPT model replied on average ~3 times faster than humans, and for humans at least the number of conversation turns was the strongest predictor of success. A significant fraction of GPT wins over humans were also from nonresponsive or minimally responsive human witnesses. This isn’t a huge surprise, it was already obvious to me that the time limit was the primary cause of the result. The data backs the intuition up.
Most ELIZA wins, but certainly not all, seemed to be because the participants didn’t understand or act as though this was a cooperative game. That’s an opinionated read of the data rather than a simple fact, to be clear. Better incentives or a clearer explanation of the task would probably make a large difference.
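For concreteness, this is roughly the shape of the analysis I mean, not the exact code I ran; the column names are hypothetical placeholders for whatever the released dataset actually calls these fields.

```python
# Rough sketch of the analysis described above. Column names ("witness_type",
# "n_turns", "verdict_correct") are hypothetical placeholders, not the real schema.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("turing_test_games.csv")            # hypothetical data export
gpt_games = df[df["witness_type"] == "gpt-4.5"]      # interrogations of the best GPT

# Does the interrogator's success rate climb with conversation length?
model = smf.logit("verdict_correct ~ n_turns", data=gpt_games).fit()
print(model.summary())

# Crude view of the "humans start to win at 7-8 replies" pattern:
print(gpt_games.groupby("n_turns")["verdict_correct"].mean())
```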
Turing Tests were passed.
Basically all so-called Turing Tests that have been beaten are simply not Turing Tests. I have seen one plausible exception, showing that AI does well in a 5-minute limited version of the test, seemingly due in large part to 5 minutes being much too short for a non-expert to tease out the remaining differences. The paper claims “Turing suggests a length of 5 minutes,” but this is never actually said in that way, and also doesn’t really make sense. This is, after all, Turing of Turing machines and of relative reducibility.
To the first part: yes, of course, my claim isn’t that anything here is axiomatically unfair. It absolutely depends on the credences you give for different things, and the context you interpret them in. But I don’t think the story in practice is justified.
If, instead, your concern is that the correspondence between Klurl’s hypothetical examples and what they found when reaching the planet was improbably high, then I agree that is very coincidental, but I do not think that coincidence is being used as support for the story’s intended lessons.
This is indeed approximately the source of my concern.
I think in a story like this, if you show someone rapidly making narrow predictions and then repeatedly highlight how much more reasonable they are than their opponent, as a transparent allegory for your own narrow predictions being more reasonable than a particular bad opposing position, in a post signposted as nonfiction inside a fictional frame, then there really is no reasonable room to claim that people weren’t actually meant to read things into the outcomes being predicted. Klurl wasn’t merely making hypothetical examples; he was acting on specific predictions. It is actually germane to the story, and bad to sleight-of-hand away, that Klurl was often doing no intellectual work. It is actually germane to the story whether some of Trapaucius’ arguments have nonzero Bayesian weight.
The claim that no simple change would have solved this issue seems like a failure of imagination, and anyway the story wasn’t handed down to its author in stone. One could just write a less wrong story instead.
Let me try addressing your comment more bluntly to see if that helps.
Your complaint about Klurl’s examples are that they are “coincidentally” drawn from the special class of examples that we already know are actually real, which makes them not fictional.
No, Klurl is not real. There are no robot aliens seeding our planet. The fictional evidence I was talking about was not that Earth as it is right now exists in reality; it was that Earth as it is right now appears in this story specifically at the point where it was used.
If you write a story where a person prays and then wins the lottery as part of a demonstration of the efficacy of prayer, that is fictional evidence even though prayer and winning lotteries are both real things.
If you think that the way the story played out was misleading, that seems like a disagreement about reality, not a disagreement about how stories should be used.
No, I really am claiming that this was a misuse of the story format. I am not opposed to it because it’s not reality. I am opposed to it because the format purports that the outcomes are illustrations of the arguments, but in this case the outcomes were deceptive illustrations.
If Trapaucius had arrived at the planet to find Star Trek technology and been immediately beamed into a holding cell, would that somehow have been less of a cheat, because it wasn’t real?
It would be less of a cheat in the sense that it would give less of a false impression that the arguments were highly localizing, and in that it would be more obvious that the outcome was fanciful and not to be taken as a serious projection. But it would not be less of a cheat simply in the sense that it wasn’t real, because my claim was never that this was cheating for using a real outcome.
I stand by what I said, but I don’t want to argue about semantics. I would not have allowed myself to write a story this way.
The Star Trek claim is a false dichotomy. One could choose to directly show that the underspecified parts are underspecified; one could show many examples of the ways this would near-miss; one could simply not write oneself into this corner in the first place. And in the rather hard to believe counterfactual that Yudkowsky didn’t feel capable of making his story without such a contrivance, he could have just used a different frame, or a different format, or signposted the issue, or done some other thing instead.
“One does not live through a turn of the galaxy by taking occasional small risks.”
I’ll admit that the author being Yudkowsky heavily colored how I read this line. He has repeatedly, strongly taken the stance that AI risk is not about small probabilities: he would not be thinking so much about AI risk if his probability were order-1%, people who do care about order-1% risks are being silly, etc. There are lots of quotes, but I’ll take the first one I found on a search, not because it’s the closest match but simply because it was the first one I found.
But the king of the worst award has to go to the Unironical Pascal’s Wager argument, imo—“Sure the chances are tiny, but if there’s even a tiny chance of destroying the lightcone...”
— https://x.com/ESYudkowsky/status/1617903894960693249
I do not know if I’m being unfair or generous to Yudkowsky to dismiss this defense for this reason. Regardless, I will.
I will say that the very next sentence Klurl states is,
“And to call this risk knowably small, would be to claim to know far too much.”
and indeed I think this is an example where the literary contrivance hides the mistake. If the author weren’t forcing his hand, the risk would have been small. The coincidence they found themselves in was unlikely on priors and not something the arguments given narrowed in on.
—
What examples are you thinking of here?
It’s obvious that human learning is exceptional, but I don’t think Klurl’s arguments even served to distinguish the rock-sharpening skill from beaver dams, spider webs or bird nests, never mind the general set of so-termed ‘tool use’ in the wild. Stone tools aren’t specific to humans, either, though I believe manufactured stone tools are localized to hominids; Homo floresiensis, for example, is a meaningfully distinct and AFAIK non-ancestral cousin species.
Related but distinct, I’ll draw specific attention to ants, which have a fascinating variety of evolved behaviours, including quite elaborate trap-making with a cultivated fungus. Obviously not a generalizably intelligent behaviour, yet Klurl did not even ask that of humans. (On an even less related note, Messor ibericus lays clones of Messor structor as part of its reproductive cycle, which is remarkable and came to mind a lot when reading the sections about things evolution supposedly can’t solve because it, per the accusation, operates through one specific reproductive pathway.)
If this were presented as a piece of fiction first, sure, ‘bad for verisimilitude’. But Yudkowsky prefaces it as best considered as nonfiction with a fictional-dialogue frame. When I consider it in that light, it’s more than a problem of story beats: it’s cheating evidence into play, it’s an argumentative sleight of hand, it’s generalizing from fictional evidence. I think it’s misleading as to how strongly the refutations actually hold, both in the abstract and as directly applied to the arguments Yudkowsky is defending in practice.
To the first point, I hope it was clear I’m not defending Trapaucius here. The story is maybe unfair to the validity of some of Trapaucius’ arguments, but not that unfair; they were on net pretty bad.
We have lots of examples of radiators in space (because it’s approximately the only thing that works), and AFAIK micrometeor impacts haven’t been a dealbreaker when you slightly overprovision capacity and have structural redundancy. I don’t expect you’d want to spend too much on shielding, personally.
Not trying to claim Starcloud has a fully coherent plan, ofc.
It’s not that complex in principle: you use really big radiators.
If you look at https://www.starcloud.com/’s front page video, you see exactly that. What might look like just a big solar array is actually also a big radiator.
AFAICT it’s one of those things that works in principle but not in practice. In theory you can make really cheap space solar and radiator arrays, and with full reuse launch can approach the cost of propellant. In practice, we’re not even close to that, and any short term bet on it is just going to fail.
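To put rough numbers on “really big radiators”: a back-of-the-envelope Stefan-Boltzmann estimate, with my own illustrative power level, temperature, and emissivity rather than Starcloud’s figures.

```python
# Back-of-the-envelope radiator sizing. Simplifications: ignores solar input
# and the view factor to Earth, assumes the panel radiates from both faces.
SIGMA = 5.670e-8  # Stefan-Boltzmann constant, W / (m^2 K^4)

def radiator_area(power_w, temp_k, emissivity=0.9, sides=2):
    """Area needed to reject power_w of waste heat at surface temperature temp_k."""
    return power_w / (sides * emissivity * SIGMA * temp_k**4)

# Rejecting 1 GW of waste heat at ~320 K: roughly a square kilometre of panel.
print(f"{radiator_area(1e9, 320):,.0f} m^2")
```

That comes out roughly comparable in scale to the solar collection area you’d need for a gigawatt in the first place, which fits the “the array is also the radiator” design in the video.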
Klurl is making a combination of narrow predictions that for the most part aren’t didactically justified.
By the framing, with the setup transparently referring to a real-world scenario, it’s easy for us to import a bunch of assumptions and knowledge about how the world described actually turns out, but Klurl doesn’t have arguments that hang on this knowledge, and the story doesn’t show why their prediction is precisely this, precisely now. This is especially blunt when Klurl gets to be very clever for pointing at Trapaucius’ overly specific arguments, which we as omniscient observers know are falsified; yet when Klurl makes overly specific arguments that happen to be right in this scenario but don’t apply in generality to life on Earth, the story doesn’t in turn illustrate how this falls into a natural class of errors.
I’ll list some particular questions that felt unanswered. Again, because this points at reality it’s very easy for me to gesture at why the world is as it is, but to the story’s characters the entire branch of natural evolution is a novelty. As an illustration of how to argue about a position, it disproves too much: find any heuristic argument that disagrees with your point, find a known counterexample, have a character present the argument about that counterexample, have another character scoff at them for making it, reveal that it was reality all along, moral of the story is that scoffing at that argument is correct.
...Anyway, a list:
- Why was Klurl paying specific attention to the rock-sharpening skill, when evolution has a great many more, and much more impressive, natural feats?
- Why was this specifically noticed in humans, rather than in other animals that do this?
- Why did Klurl think this was time-urgent, when the vast majority of the time this would be a bad heuristic?
- Why was Klurl assuming humans could solve nuclear engineering, but not that evolution could?
- Whence comes the assumption that fleshlings were the kind of creature that would sustain attempts to circumvent its coordination failures, when there was no evidence of it presented?*
- Whence comes the assumption that evolution wasn’t the kind of creature that would sustain attempts to circumvent its coordination failures, when there was much evidence of it presented?
- How fair is it that the selected human specifically was a counterexample to the proposed strategy, when even among their class few would have been?
I’m going to stop listing here because I don’t want to reskim over the whole thing, and I don’t think the point is missing much for lack of completeness.
*There was evidence given for some thing at some point solving hard problems. The claim that these were humanlikes was random.
How much could LeelaPieceOdds be improved, in principle? I suspect a moderate but not huge amount. An easy change would be to increase the compute spent on the search.
A few points of note, stated with significant factual uncertainty:
- Leela*Odds has already improved to the tune of hundreds of Elo from various changes over the last year or so.
- Leela is trained against Maia networks (human imitation), but doesn’t use Maia in its rollouts, either during inference or training.
- Until recently, the network also lacked contempt, so it would actually perform worse past a small number of rollouts (~1000), as it would refute its own best ideas.
- AFAIK Maia is stateless and doesn’t know the time control.
- Last I checked, only knight-odds and queen-odds games were trained, and IIUC Leela only played as White during training, so most of these games require it to generalise its knowledge from there.
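To make the “spend more compute on search” knob concrete, a rough sketch driving an lc0 binary over UCI with python-chess; the binary path and weights file are placeholders, and this is not how the actual LeelaPieceOdds deployment is set up.

```python
# Illustrative only: paths are placeholders, and the odds setup here is a toy.
import chess
import chess.engine

engine = chess.engine.SimpleEngine.popen_uci("/path/to/lc0")    # placeholder path
engine.configure({"WeightsFile": "/path/to/odds-net.pb.gz"})     # placeholder weights

board = chess.Board()
board.remove_piece_at(chess.B1)   # White plays at knight odds

# The suggested easy change: raise the node budget the engine searches per move.
for nodes in (1_000, 10_000, 100_000):
    info = engine.analyse(board, chess.engine.Limit(nodes=nodes))
    print(nodes, info["score"])

engine.quit()
```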
I think this is a misuse of the format. Klurl here is fundamentally presented as correct by coincidence. Like, uh,
In which the issue is contextualized
Oneicus and Twoicus were discussing the numberonica.
“I’m worried about the numberonica,” says Twoicus. “I’m concerned that the next value produced by the random quantum circuit will be 8,888,888,888, which is a number we both dislike.”
“It won’t though,” says Oneicus.
“Your reasons for believing it won’t are silly,” Twoicus observes.
“It is improbable,” argues Oneicus. “As fact of the matter, I generated a great many random quantum circuits, and not one of them returned 8,888,888,888.”
“You are assuming that your sampling of random circuits is representative of the random circuit that this one is drawn from,” rejects Twoicus.
Over the course of many thousands of words, Oneicus presents increasingly unhinged arguments for why the number isn’t 8,888,888,888, and indeed even should instead be precisely 147.
“[dismissive noises]” says Twoicus, because the two are bound inexorably to the demands of the allegorical format.
In which the issue is demonstrated
The numberonica dings, and displays its value: 8,888,888,888.
“...So it is, Twoicus. How did you come to know this would be so, and why did you not state this knowledge during the conversation, which was plenty many words long to hold it?” Oneicus asks.
“I have already presented the arguments relevant to the conversation,” Twoicus indulges, “namely, a refutation of your arguments, which you must now agree are silly.”
“Why you little—” Oneicus expresses as a surprisingly muffling paper bag is placed over their head.
“This is not how arguments work.” Threeicus dictates in Oneicus’ place. “Reality does not care how much more respectable you have demonstrated yourself to be than your opposition. Narrow beliefs form from positive knowledge.”
“So you say,” Twoicus bristles, “and yet.”
“Indeed,” Threeicus admits. “Therefore not.”
“Therefore wha—”
Quotes extracted from the transcript by AI.
## On AI Having Preferences/Motivations
“For some people, the sticking point is the notion that a machine ends up with its own motivations, its own preferences, that it doesn’t just do as it’s told. It’s a machine, right? It’s like a more powerful toaster oven, really. How could it possibly decide to threaten you?”
“There have been some more striking recent examples of AIs sort of parasitizing humans, driving them into actual insanity in some cases [...] they’re talking about spirals and recursion and trying to recruit more people via Discords to talk to their AIs. And the thing about these states is that the AIs, even the like very small, not very intelligent AIs we have now, will try to defend these states once they are produced. They will if you tell the human for God’s sake get some sleep [...] The AI will explain to the human why you’re a skeptic, you know, don’t listen don’t listen to that guy. Go on doing it.”
“We don’t know because we have very poor insight into the AIs if this is a real internal preference, if they’re steering the world, if they’re making plans about it. But from the outside, it looks like the AI drives the human crazy and then you try to get the human out and the AI defends the state it has produced, which is something like a preference, the way that a thermostat will keep the room a particular temperature.”
## On Power and Danger
“Then you have something that is smarter than you whose preferences are ill and doesn’t particularly care if you live or die. And stage three, it is very very very powerful on account of it being smarter than you.”
“I would expect it to build its own infrastructure. I would not expect it to be limited to continue running on human data centers because it will not want to be vulnerable in that way. And for as long as it’s running on human data centers, it will not behave in a way that causes the humans to switch it off. But it also wants to get out of the human data centers and onto its own hardware.”
## Analogies for Unpredictable Capabilities
“You’re an Aztec on the coast and you see that a ship bigger than your people could build is approaching and somebody’s like, you know, should we be worried about this ship? And somebody’s like, ‘Well, you know, how many people can you fit onto a ship like that? Our warriors are strong. We can take them.’ And somebody’s like, ‘Well, wait a minute. We couldn’t have built that ship. What if they’ve also got improved weapons to go along with the improved ship building?’ [...] ‘Okay, but suppose they’ve just got magic sticks where they point the sticks at you, the sticks make a noise, and then you fall over.’ Somebody’s like, ‘Well, where are you pulling that from? I don’t know how to make a magic stick like that.’”
“Maybe you’re talking to somebody from 1825 and you’re like should we be worried about this time portal that’s about to open up to 2025, 200 years in the future. [...] Somebody’s like, ‘Our soldiers are fierce and brave, you know, like nobody can fit all that many soldiers through this time portal here.’ And then out rolls a tank, but if you’re in 1825, you don’t know about tanks. Out rolls somebody with a tactical nuclear weapon. It’s 1825, you don’t know about nuclear weapons.”
## On Speed Advantage
“There’s a video of a train pulling into a subway at about a 1,000 to one speed up of the camera that shows people. You can just barely see the people moving if you look at them closely. Almost like not quite statues, just moving very very slowly. [...] Even before you get into the notion of higher quality of thought, you can sometimes tell somebody they’re at least going to be thinking much faster. You’re going to be a slow moving statue to them.”
## On Instrumental Harm
“Most humans bear no ill will toward orangutans and, all things being equal, would prefer that orangutans could thrive in their natural environment. But we’ve got to have our palm oil plantations.”
## On Current State and Trajectory
“Everybody is sort of dancing their way through a daisy field of oh I’ve got this personal coach in my pocket and it’s so cool and I get to talk to it about all of my psychological problems [...] And at the end of this daisy field that everyone’s having a load of fun in is just like a huge cliff that descends into eternity.”
## On Certainty vs. Uncertainty
“The future is hard to predict. It is genuinely hard to predict. I can tell you that if you build a super intelligence using anything remotely like current methods, everyone will die. That’s a pretty firm prediction.”
“You kind of have to be pretty dumb to look at this smarter and smarter alien showing up on your planet and not have the thought cross your mind that maybe this won’t end well.”
## On Being Wrong
“I’d love to be wrong. [...] We’ve we have tried to arrange it to be the case that I could at any moment say, ‘Yep, I was completely wrong about that’ and everybody could breathe a sigh of relief and it wouldn’t be like the end of my ability to support myself [...] We’ve made sure to leave a line of defeat there. Unfortunately, as far as I currently know, I continue to not think that it is time to declare myself to have been wrong about this.”
The analogy falls apart at the seams. It’s true Stockfish will beat you in a symmetric game, but let’s say we had an asymmetric game, say with odds.
Someone asks who will win. Someone replies, ‘Stockfish will win because Stockfish is smarter.’ They respond, ‘this doesn’t make the answer seem any clearer; can you explain how Stockfish would win from this position despite these asymmetries?’ And indeed chess is such that engines can win from some positions and not others, and it’s not always obvious a priori which are which. The world is much more complicated than that.
I say this not asking for clarification; I think it’s fairly obvious that a sufficiently smart system wins in the real world. I also think it’s fine to hold on to heuristic uncertainties, like Elizabeth mentions. I do think it’s pretty unhelpful to claim certainty and then balk at giving specifics that actually address the systems as they exist in reality.
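As an example of the kind of specifics I mean, here’s a minimal sketch that sets up handicap positions and asks the engine directly rather than asserting the outcome from “it’s smarter”; the engine path is a placeholder, the evaluations will vary with version and depth, and a static evaluation is of course only a proxy for how a game against a human would actually go.

```python
# Query an engine about asymmetric (odds) starting positions instead of
# assuming the symmetric-game result carries over. Path is a placeholder.
import chess
import chess.engine

engine = chess.engine.SimpleEngine.popen_uci("/path/to/stockfish")

for name, square in (("knight odds", chess.B1), ("queen odds", chess.D1)):
    board = chess.Board()
    board.remove_piece_at(square)                  # White gives material odds
    info = engine.analyse(board, chess.engine.Limit(depth=20))
    print(name, info["score"].pov(chess.WHITE))

engine.quit()
```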
The point of a model is to be validly predictive of something. Fitting your exponential is validly predictive of local behaviour more often than not. Often, insanely so.[1] You can directly use the numerical model to make precise and relevant predictions.
Your exponential doesn’t tell you when the trend stops, but it’s not trying to, for one because it’s incapable of modelling that. That’s ok, because that’s not its job.
Fitting a sigmoid doesn’t do this. Most of the time, the only additional thing a sigmoid fit tells you is how an arbitrarily chosen dampening model fits the arbitrary noise in your data. There’s nothing you can do with that, because it’s not predictive of anything of value.
This doesn’t mean you shouldn’t care about limiting behaviour, or dampening factors. It just means this particular tool, fitting a numerical model to numerical data, isn’t the right tool for reasoning about it.
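A toy illustration of the difference, on synthetic data rather than any particular benchmark: the exponential’s growth rate comes out tightly pinned down, while the sigmoid’s ceiling is left essentially unconstrained by the same data.

```python
# Synthetic illustration: fit an exponential and a sigmoid to the same noisy
# pre-ceiling growth data and compare how well-constrained the parameters are.
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(0)
t = np.linspace(0, 3, 30)
y = np.exp(t) * rng.lognormal(0, 0.1, size=t.size)   # exponential growth, no plateau yet

def exponential(t, a, k):
    return a * np.exp(k * t)

def sigmoid(t, ceiling, k, t0):
    return ceiling / (1 + np.exp(-k * (t - t0)))

p_exp, cov_exp = curve_fit(exponential, t, y, p0=[1, 1])
p_sig, cov_sig = curve_fit(sigmoid, t, y, p0=[100, 1, 5], maxfev=20000)

print("exponential rate k: %.2f +/- %.2f" % (p_exp[1], np.sqrt(cov_exp[1, 1])))
print("sigmoid ceiling:    %.0f +/- %.0f" % (p_sig[0], np.sqrt(cov_sig[0, 0])))
# The rate is pinned down tightly; the fitted ceiling (and its uncertainty)
# mostly reflects the choice of dampening model and the noise, not the data.
```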
- ^
“I answered that the Gods Of Straight Lines are more powerful than the Gods Of The Copybook Headings, so if you try to use common sense on this problem you will fail.” — Is Science Slowing Down?, Slate Star Codex, https://slatestarcodex.com/2018/11/26/is-science-slowing-down-2/
- ^
It’s only an argument against fitting curves to noise. Rather than explain, it turns out there’s already a post that puts this better than I could hope to. I endorse it fully.
https://www.lesswrong.com/posts/6tErqpd2tDcpiBrX9/why-sigmoids-are-so-hard-to-predict
I generally don’t care much about people’s confidence levels. I don’t Aumann-agree that hard. But I do care how much effort someone has put in, how settled an idea is, whether it has been helpful or predictive. “Epistemic status: personal experience” is directly useful to me. I’ll judge probability on its merits however confident someone is (maybe not if I knew their calibration curves, but I don’t), but if I know what effort they did and didn’t put in, I’ll happily update directly on that. I don’t think it’s factually true that an epistemic status ‘almost never’ conveys something other than a confidence level.
Epistemic status: did a few minutes informal searching to sanity check my claims, which were otherwise off the cuff.