“Eventually”, sure, but I don’t think that’s operative here. If we had the ASI recipe and could study it safely for ten years, we’d find a way to implement it in a single datacenter. But discovering it in a single datacenter is much harder. There is actually something missing from current LLMs, a part of intelligence they just don’t have, and the only thing that seems to mitigate that issue is model size. So without ever-increasing model size and analysis of their training dynamics, I think any attempt to get the missing piece is throwing darts with the lights off. (To be fair, I have pretty unusual timelines compared to most of LW, so maybe what’s convincing to me shouldn’t be to you.)
At least through the web app, Gemini 3.1 Pro is almost as sycophantic as 4o was (like 80%, maybe? It feels 80% as bad to me).
I definitely expect that problems at this level of difficulty are within reach for present frontier models. That being said, as I understand it, most labs are still soliciting expert data and doing human-in-the-loop process reward modelling, and those that aren’t (mostly because they think RLVR is better and they have the spare compute) are still using the data they solicited in the past, or are distilling from models that used that data, and so on. For basically the past two years, any math problem known to stump LLMs even occasionally has been worth ~$75 to any contractor in any part of the world working as a data generator for companies like Scale AI. You should expect that any math problem which has been posted publicly, seen by more than ~50 people, and stated to be hard for LLMs in that time period has been trained on, detached from the canary string.
Knowing that all rational numbers can be represented is a big hint and would have cut at least my solution time in half. This is still probably a good test, and although I’m sure it’s been trained on, it’s not too hard to come up with “similar” puzzles where knowing about this one doesn’t immediately solve it.
Reads more as manic than rehearsed to me, but I’m not sure I see how the distinction matters. Usually I assume that if somebody has thought through what they want to say before they say it, they’re more likely to give their real thoughts as a result, as opposed to some reactively oppositional take. I guess there’s the Andy Kaufman defense?
(I guess I should mention that there’s at least one way the distinction is relevant here. At the first pause I indicated, it seems like they were about to say that they want their political opponents wiped off of the face of the earth, but caught themself in time to moderate to something slightly less evil. I read this as instinctively reaching for the worst thing they can think to say about people they hate without actually being committed to the content. If I thought this were more rehearsed, I think I would read it as at least some small percentage of desire for political genocide against the left. But this involves a bit too much speculation for my taste; I’m much more concerned with the claim I originally quoted, which strikes me as unsalvageably naive.)
For example, the “Poor and Proud” and “March 4 Hundredaires” signs express sentiments that literally every Pro-Billionaire protestor would gladly endorse.
See https://x.com/twocents/status/2020596821228388704
In particular, note the following exchange:
> Interviewer: “Is there any ways you would restructure current incentives to make, to allow for, the protestors and the antiprotestors to like see eye-to-eye?”
> Pro-Billionaire Protestor: “I don’t want to see eye to eye with them! I want to destroy them. [cut, apparently to later in the same answer] All manner of socialists and communists that are motivated by jealousy, I want to wipe them off (pause) of the political spectrum. I want to make it (pause) not allowed for you to support this.”
There’s a cut here; perhaps there’s some intervening context that changes the conclusion I should draw from this exchange, but I really very strongly doubt it.
> Not at all. There is no such set of characteristics. Wrong conclusions are inevitable and commonplace. Gödel’s Theorems apply to all formalisms.
They do not apply to all formalisms, morality is not a formal system, and even if it were, this is not what either of Gödel’s theorems would say about it. I don’t know why this particular bit of math misunderstanding is so popular online; I suspect it’s because it enables moves like the one you’re making here (i.e. of the form “it’s impossible to justify any statement, so I can’t be expected to justify my statements”).
> In the current world, the harm of unsanctioned killing being commonly accepted (and cheered) is generally a LOT higher than the harm of statistically-evil people continuing to live. So, yes, a heuristic argument: this is a loss of civilization and order, even if it might have been justifiable on some dimensions.
Ah, I wouldn’t call this a heuristic argument—by “heuristic argument” I mean something like “I can’t come up with any utilitarian calculation that says the bad outweighs the good here, but I know that human brains are prone to underestimating this sort of bad, so I assume there is a calculation saying it was overall bad even if I don’t know what it is.” (Incidentally this is how I understand this situation.) If you have an argument to this effect, I’d love to see it! But to satisfy me it will need to be the sort of argument that permits killing Stalin or Hitler or Idi Amin with a pretty wide margin for error, and if it’s not I’ll make the same critique I did before.
“is it ok to kill people (or call for the killing or support the killing) who have not been convicted by any court and the killing does not stop any immediate physical threat to you?”
When you phrase the question like this, do you think that you’ve identified a set of characteristics that, throughout history, will never lead to the wrong conclusion? Or are you making a heuristic argument?
I think that throughout history, there have probably been many “ok” killings against people who did not present an immediate threat to the killer and had not been convicted by any court. I think that even today there are probably such killings. Do you have an argument against that position or do you think that, by phrasing things in the way you have, you’ve implicitly made a sufficient argument already?
(This is not to say that the UHC CEO killing in particular was justified, of course.)
I mostly agree with this, principled and robust estimations of effect size are hard and also important. Maybe someday I’ll write a primer on Judea Pearl covering Bayesian networks and causal graphs, which in my mind is the framework that unifies these approaches, but that would take me a while and would require some more research, so I didn’t get into it here.
> By calling something the “p-value” you are elevating the null hypothesis to a special status
Of course, when you call something a p-value you should in your mind add “(for a particular choice of null hypothesis)”; calling something “the” p-value doesn’t really make sense. But I don’t think it’s correct to say that this elevates the null hypothesis to a special status; in many (perhaps even most) cases, there’s a hypothesis which already has special status, and addressing that hypothesis in particular is productive in a way that cataloguing and grouping several alternate hypotheses is not.
> (and usually leaving various things about it underspecified)
This is precisely the opposite of the truth. As I tried to explain in the original post, roughly the entire benefit of reporting p-values instead of full bayesian updates is that a full update necessarily leaves many hypothesis classes underspecified, whereas by focusing on one hypothesis (in the cases where there’s a hypothesis it makes sense to focus on), especially a deliberately “simple” one (something like “this intervention does not have an effect on this observable”), we can specify it fully. This is exactly the problem that p-values solve.
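To make the contrast concrete, here’s a minimal sketch (with made-up numbers, not anything from the original post): the null “this intervention has no effect” is simple enough to specify completely, so its p-value is computable with no further modelling choices.

```python
from math import comb

def binomial_p_value(k: int, n: int, p_null: float = 0.5) -> float:
    """One-sided p-value: probability of k or more successes in n trials
    under the fully specified null hypothesis P(success) = p_null."""
    return sum(
        comb(n, i) * p_null**i * (1 - p_null)**(n - i)
        for i in range(k, n + 1)
    )

# Hypothetical trial: 62 of 100 patients improved; under "no effect"
# we'd expect about 50.
print(binomial_p_value(62, 100))  # ~0.01: surprising under the null

# The full bayesian report would also need P(data | alternative), i.e.
# a prior over every effect size we might entertain -- exactly the
# underspecified part that reporting a p-value lets us avoid.
```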
> Like, it seems like your post is trying to say something like “ah, no, don’t do bayesian statistics, p-values are better sometimes actually”. But no, bayesian statistics in this sense is just better and more straightforward, as far as I can tell, and you get the things you would get from a p-value by default if you did any bayesian statistics.
I don’t think this is what my post says or that this is a plausible reading of it. A p-value is a conditional probability, and since a principled bayesian update involves considering all of the relevant conditional probabilities, of course it contains all of the information that a p-value gives; indeed, the only way to interpret a p-value is in this light. But we can hardly do principled bayesian updates for ourselves, we can’t effectively communicate them, and moreover, if there are competing explanations for the data, we often can’t tell which explanation is best or whether a correct explanation is even among the hypotheses we’re really considering. In these cases, the part of our internal update that it’s correct to report is often just a p-value.
The concrete examples I have in mind are the discoveries of the CMBR and of the muon: Penzias and Wilson were not aware of the prediction of relic radiation, and nobody had even hypothesized the muon. It turned out that some Big Bang theorists at the time had started thinking about the possibility of microwave radiation left over from the Big Bang, so Penzias and Wilson were eventually made aware of this work and realized what they had discovered; but when they conducted their experiment, what was relevant was not the full bayesian update but one conditional: their observations simply were not consistent with a steady-state universe. They did not need to know about the alternative hypotheses to know that this was important. In the other example, Yukawa had predicted the existence of mesons before the muon was discovered (in particular, he had predicted what we now call the pi meson), and since the mass of the muon matched the predicted mass of the pi meson, many people at the time guessed that Anderson and Neddermeyer had observed the pi meson. But they had not! Of course it wasn’t necessarily a mistake to update toward the best available hypothesis, but it was nonetheless incorrect. The relevant discovery here was not that Yukawa’s prediction fit the newly-observed particle better than any other available explanation; it was simply that the newly-observed particle could not be any previously-observed particle, and so something new had been discovered.
Bayesian reasoning does work fine here, but if you were trying to communicate how you changed your mind about the original hypothesis, you wouldn’t report all your updates, because you wouldn’t (and shouldn’t) go through the process of enumerating every alternative you considered, the likelihoods under those alternatives, your priors, and a justified estimate of the likelihood under alternatives like “or something I haven’t thought of”. If you’re interested in a distinguished hypothesis, which you almost always are (hypotheses like “this intervention has no effect” or “the normal explanation of how this process works is correct” are basically always available), then the most important thing to report is the probability of the evidence under that distinguished hypothesis, since updates on that hypothesis should agree even when the auxiliary updates do not, and that’s the hypothesis your peers will tend to care about most.
> You can communicate the same thing in a bayesian fashion, you just need to specify the class of hypothesis you are declaring the “null hypothesis”.
Yes, this is what a p-value is? Perhaps I am confused; are you saying something here that I’m not saying? (Although note that we cannot perform an update this way: we need conditional probabilities for the evidence under both the hypothesis under consideration and its complement.)
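To spell out the parenthetical, here’s the odds form of Bayes’ theorem (just a restatement of the standard formula, nothing specific to our disagreement):

```latex
% Updating hypothesis H on evidence E, in odds form:
\frac{P(H \mid E)}{P(\lnot H \mid E)}
  = \frac{P(E \mid H)}{P(E \mid \lnot H)} \cdot \frac{P(H)}{P(\lnot H)}
```

A p-value supplies (a tail-probability analogue of) the P(E | H) term with H the null; the P(E | ¬H) term is the one that forces you to specify the whole complement class, which is the part I’m claiming we often can’t do in a principled way.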
> Yes, of course any bayesian analysis will need to require creating classes of hypotheses and assigning odds-ratios to them. So does doing any kind of analysis with p-values, it’s just that with p-values you are elevating one such class of hypotheses to a special status of “null hypothesis” and claim objectivity when none such exists.
Only in the sense that one hypothesis can form a class! It’s extremely reasonable to say something of the form “the normal understanding of this phenomenon implies this outcome distribution, the true outcome was very unlikely under that distribution, thus we should think harder about the part of the normal understanding that addresses this situation”, and we do not need to really come up with an alternate hypothesis to do this. I agree that you shouldn’t compute p-values for hypotheses that you don’t have reason to believe in advance will be prominent in the minds of people you want to communicate your results to, if that doesn’t address what you’re saying about objectivity (actually even if it does) then I’m pretty confused by the last clause here.
p-values are good actually
I think our mental models might be different enough that it’s hard for me to understand what you’re saying. By nonlinearity I mean that, in addition to nonlinear interactions between drugs, there are interacting systems, equilibration mechanisms, etc., to the point that I think intuitions about ML systems basically shouldn’t transfer at all. But then I know your intuitions about ML are better than mine, so it’s hard to be sure of that.
Re: interactions specifically, this definitely isn’t true in polypharmacy situations. We know most of the bad drug pairs in the normal population, and because doctors are wary of prescribing many different medications, we rarely encounter new bad interactions in the normal population. But there are drug combinations that only become dangerous in triples (search term: the Triple Whammy, a combination of three drug classes, any two of which are generally safe but which cause kidney failure in combination; the interaction was discovered in 2000, though the drugs had been available since around 1980), and there are interactions which are only dangerous in the context of certain mutations (for example, ultrametabolizers simply can’t use prodrugs like codeine).
Interactions like this are rare right now largely because doctors are wary of prescribing too many drugs at once, but polypharmacy is becoming more common and more bad interactions are emerging as a result, basically just for combinatorial reasons. It’s definitely possible for combinations of drugs to be prescribed safely and for them to just not interact, but if we push this further, I suspect there are very few combinations of, say, 10 drugs that are simultaneously safe for most people (even if we ignore cholinergic response).
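A quick back-of-the-envelope on those combinatorial reasons (pure counting with hypothetical regimen sizes, no pharmacology in it): the number of interaction terms that would all need to be harmless grows much faster than the number of drugs prescribed.

```python
from math import comb

for n_drugs in (2, 5, 10):
    pairs = comb(n_drugs, 2)    # pairwise interactions
    triples = comb(n_drugs, 3)  # Triple-Whammy-style interactions
    # every remaining subset of two or more drugs:
    higher = 2**n_drugs - 1 - n_drugs - pairs - triples
    print(f"{n_drugs} drugs: {pairs} pairs, {triples} triples, "
          f"{higher} higher-order subsets")

# 10 drugs: 45 pairs, 120 triples, 848 higher-order subsets. A regimen
# is only "linear" if every one of those terms happens to be negligible.
```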
A huge percentage of the job of a pharmacist is to keep track of potential negative interactions between different drugs, of which there are an incomprehensible number. I don’t think linearity is a reasonable assumption here; the interaction terms between multiple interventions should be thought of as, on average, big. Augmentation and synergistic effects exist, but they are in general risky and quite hard to find. Even the effects of a single drug are not linear; there are significant nonlinearities in dosage effects for most drugs.
> Partisanship in the US could be something other than anti-Trump sentiment. There’s no logical necessity for it to be that, after all. It just isn’t actually separate from anti-Trump sentiment. (Outside the lizardman constant.)
Anti-Obama sentiment was not partisan? Anti-Biden sentiment is not? Anti-Zohran sentiment is not? Anti-ICE sentiment is not? Anti-DEI sentiment is not? Anti-Somali sentiment is not? Antisemitism is not? Do you actually believe this?
> I have no idea which argument you’re referring to.
The paragraph beginning “of course it’s increased partisanship” was my main target here.
> First of all, you are demanding that you can attack all you want, but nobody gets to defend. No. This is like the difference between initiation of force and self-defense. If you’re going to argue that Trump is uniquely bad to the point where norms can be violated to tell everyone how bad he is, then everyone else gets to say that you are overreacting. You started it.
Of course, I did not start it. (Indeed, you’ll notice that I’ve been extraordinarily careful not to make any claims like this in our discussion, even though I think they can be justified, because I respect your desire to keep those discussions off of LW. EDIT: To clarify, I’ve stated my position; what I mean is that I haven’t argued for it or tried to provide any reason to believe that I’m correct.) I challenged you on what you claim is your point; you decided proactively to jump down from the meta to the object level at that point. But someone somewhere did this, so you’re allowed in “self-defense” to make this pivot when talking to me. I wonder if you’ve thought about the bad incentives this behavior produces? Or is that the sort of thing only your ideological opponents are supposed to concern themselves with?
> Second, my point is “you shouldn’t post about it here regardless of whether you’re overreacting.” It doesn’t matter how genuinely bad Trump is; you (and the OPs) shouldn’t be posting about him either way.
I have been trying to get you to argue for this, but you’ve refused three times now! Do you actually believe it to be true that, entirely regardless of how bad they actually are, nobody should ever talk about political figures on LW? Like, if Satan himself were president of the United States and were killing a million people per day, eliciting celebration from his supporters, would you still think discussion was not justified on the grounds that it’s political? If so, I’d like you to defend that belief rather than just stating it, as I am now asking you to do for the fourth time. If not, I’d like you to explain what criteria you’re using to decide whether discussion of political figures is acceptable (of course you don’t have to draw hard-and-fast lines, but at least tell me what the relevant methods of evaluation are), and admit that deciding whether norm-breaking is justified will require at least a bit of discussion of object-level truths.
All right. So, if I’m reading this correctly, by partisanship you don’t mean partisanship (something everybody agrees has increased in the US), but instead some bizarre phenomenon which shares some characteristics with partisanship but is only capable of being realized as broad anti-Trump sentiment? And we (by “we” I mean LW in public; of course individuals can draw their own conclusions, sorry if I wasn’t clearer about this before) should refuse to speak clearly about certain issues, because if norms are more fluid that has some bad incentives (and what about the incentives of being unable to discuss certain topics? Well, it would be norm-violating to acknowledge those!) But also, it’s “of course” your preferred explanation—conveniently, it doesn’t matter whether you’re right or wrong, but you’re obviously right. And why is it obvious? Well, you can think of one plausible mechanism by which you could be correct, modulo the fact that your definition of partisanship is for some reason impossible to apply to Biden or Zohran, whom plenty of people think are the worst thing ever, even on here. And, well, sure there are other explanations, but considering those other explanations would be norm-violating, so, no more thinking needed. Is this really what you mean when you say “of course it’s partisanship”? Because I can’t think of another way to form what you’ve said into an argument for your position.
It seems like your pattern of argumentation is: take a position, think up one single way that position could be true, then assert that any alternate explanations are damaging to the community to discuss. Surely you understand the difference between an argument being difficult to challenge for social reasons and that argument being convincing, right? You can’t Emperor’s New Clothes your way into political consensus. I don’t think you’re doing this on purpose, but I strongly recommend trying to stop doing it on purpose.
Of course, you could stop trying to make political arguments entirely and only make meta-arguments; that would be more respectable. If you really think there are no circumstances whatsoever under which LW should talk about politics, you can say that, you’ll just have to explain why you think the bad incentives that produces are more bearable than those it eliminates. But that would also mean giving up on the other arguments you’re making here—if people shouldn’t be allowed to argue that Trump is genuinely exceptionally bad regardless of its truth value, then you’re also not allowed to argue that those people are overreacting regardless of its truth value. You don’t get to have it both ways.
Votes also come in integers, but are not so small, so I don’t think you’re really providing any support for your hypothesis here. We’ve seen broadly anti-Trump and anti-Republican posts before; they’ve usually been met with pretty strong disagreement (even just as a matter of form, often from people who agree with the overall sentiment) and a desire to avoid drawing too many conclusions, but this has changed recently. We can tell that it’s changed by noticing patterns in discourse and voting. That five months ago detailed anti-Trump posts were being made and downvoted, with mere expressions of disagreement in the comments pretty roundly applauded, but that now people (basically correctly) take for granted that the median LWer is so strongly anti-Trump that the position doesn’t need justification, is exactly what I’m pointing out.
There are multiple possible explanations, sure, but merely that this is a partisan issue isn’t one of them. That it’s a partisan issue and that partisan hostility is worse and more common now is, but as I mentioned, it requires argument. Have you noticed this pattern on other issues here? Is there a breakthrough effect, such that it happens first on one issue without seeming to change other discussions? If so, why? What other patterns of behavior on LW does this fit, and what does it predict?
The relevant counter-hypothesis is that Trump really is that bad, such that suspending judgment on whether he’s bad impedes discussion even if it elides some nuance. Elsewhere you’ve claimed that this isn’t a hypothesis we should allow ourselves to even consider no matter how true it is; here it seems like you’re arguing that it’s basically wrong, in the sense that increased partisanship in general rather than exceptional circumstances is responsible. I think it’s pretty likely that lots of people are overestimating how bad Trump is somewhat, but not to an extent that is easy or productive to correct—I think I’d basically have to lie to someone to convince them that Trump is actually completely fine and totally in line with what they should expect, to the extent that breaking communication norms to mention his mistakes is escalatory. Do you think I’m wrong, or just that I shouldn’t talk about it on LW? It’s really unclear to me what you’re trying to argue for, and hence, what (if anything) I can expect to get from discussing this with you. (I notice this sounds snarky; what I mean to say is that since disagreements can get pretty wide here and require a ton of attention to sort out, I can imagine either talking further or agreeing to disagree being reasonable actions for both of us, depending on what your answer here is.)
It is possible to get evidence for this claim without blind tests. For example: start interacting more with prose from an LLM you don’t interact with often (I recently discovered that I like Kimi K2.5’s prose much better than Claude’s, for example, so I’m interacting with it more). Track your ability to distinguish that LLM’s outputs (and your subjective taste/distaste for those patterns) over time. If you start to dislike tics that you didn’t notice before, that’s reasonable evidence that you’ve come to associate those tics with writing that lacks the sort of interiority described here, or at least with writing that lacks some desirable quality that’s hard to specify.