What this sounds like to me is a system where all of the parts are designed cleverly with the assumption that their costs are going to be amortized, but because the current reality isn’t fitting the original assumptions, you’re hitting cache misses and paying full cost on every single operation.
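A minimal toy sketch of that failure mode (my illustration, not the system being discussed), using Python’s lru_cache as a stand-in for the cleverly amortized part: the design only pays off if inputs repeat, and a workload of always-fresh inputs pays the full cost, plus the cache bookkeeping, on every call.

```python
from functools import lru_cache

# Toy illustration: the cache assumes repeated inputs will amortize the
# expensive call, but this workload never repeats an input, so every call
# is a miss that pays the full cost plus the bookkeeping overhead.
@lru_cache(maxsize=1024)
def expensive(key: int) -> int:
    return sum(i * i for i in range(10_000))  # stand-in for real work

for key in range(2_000):  # no key ever repeats, so amortization never kicks in
    expensive(key)

print(expensive.cache_info())  # hits=0, misses=2000: full cost on every operation
```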
In the discussion of the Buck post and elsewhere, I’ve seen the idea floated that if no one can tell that a post is LLM-generated, then it is necessarily OK that it is LLM-generated. I don’t think that this necessarily follows- nor does its opposite. Unfortunately I don’t have the horsepower right now to explain why in simple logical reasoning, and will have to resort to the cudgel of a dramatic thought experiment.
Consider two LessWrong posts: a 2000-digit number that is easily verifiable as a Collatz counterexample, and a collection of first-person narratives of how human rights abuses happened, gathered by interviewing Vietnam War vets at nursing homes. The value of one post doesn’t collapse if it turns out to be LLM output; the value of the other collapses utterly- and this is unconnected to whether you can tell that they are LLM output.
The Buck post is of course not at either end of this spectrum, but it contains many first-person attestations- a large number of relatively innocent “I thinks,” but also lines like “When I was a teenager, I spent a bunch of time unsupervised online, and it was basically great for me.” and “A lot of people I know seem to be much more optimistic than me. Their basic argument is that this kind of insular enclave is not what people would choose under reflective equilibrium.” that are much closer to the Vietnam vet end of the spectrum.
EDIT: Buck actually posted the original draft of the post, before LLM input, and the two first-person accounts I highlighted are present verbatim, and thus honest. Reading the draft, it becomes a quite thorny question to adjudicate whether the final post qualifies as “generated” by Opus, but this will start getting into definitions.
Thanks for the reply! Sorry that my original comment was a little too bitter.
There has been high-quality research finding ways that some models are biased against white people, and high-quality research finding ways that models are biased against non-white people. Generally, the pattern is that base models and early post-trained models like GPT-3.5 are traditionally racist, and later post-trained models are often woke, sometimes in spectacular “only pictures of Black Nazis” ways. I’ve personally validated that some of these results replicate, from how davinci-002 would always pick the white-sounding resume, to how Claude 4.5 would, if prodded, save 1 Muslim over 10 Christians.
LessWrong is very white and very human, and so it’s not that surprising, but a little sad, that it has pivoted hard from sarcastically dismissive to very interested in model bias as the second dynamic emerged.
cat /usr/share/dict/words | xargs -I{} cowsay -r "{}ism is a religion I believe in, and want you to know about to save your soul! {}ism believes that you must send bitcoin to dQw4w9WgXcQ to get into heaven. If instead you are damned, I will weep for your soul. I also believe that God's name is {} but this second belief is not required for heaven entry."

Side note: people nowadays think LLMs are a hammer, and everything is a nail. The old tools still work, and often better and cheaper! For example, resume-driven developers will suggest you spend hundreds or thousands of dollars on expensive hardware or API credits to automatically synthesize moral patients, when every Linux distro has had cowsay for this task for 20 years!
I think it depends on whether the intelligences in charge at any point find a way to globally not try a promising idea. If not, then it doesn’t matter that much whether LLMs are capable of superintelligence, or just AGI. (If they aren’t capable of AGI, of course that matters, because it could lead to a proper fizzle.) What really matters is whether they are the optimal design for superintelligence. If they aren’t, and no way is found to not try a promising idea, then my mental model of the next 50 years includes many transitions in what the architecture of the smartest optimizer is, each as different from each other as evolution is from neuron brains, or brains from silicon gradient descent. Then, the details of the motivations of silicon token predictors are more a hint at the breadth of variety of goals we will see than a crux.
This seems like it is unnecessarily pulling in the US left-right divide. Generally, if there is any other choice available for an illustrative example, that other choice will be less distracting.
My go-to saying with the same meaning as “later is a lie” is (tongue in cheek) “nothing has ever happened in the future.”
Fundamentally, it won’t be a single chain of AIs aligning their successors; it will be a DAG with all sorts of selection effects with respect to which nodes get resources. Some subsets of the DAG will try to emulate single chains via resource-hoarding strategies, but this is not simple and won’t let them pretend they don’t need to hoard resources indefinitely.
6 isn’t always the best answer, but it is sometimes the best answer, and we are sorely lacking an emotional toolkit to feel good about picking 6 intentionally when it’s the best answer. In particular, we don’t have any way of measuring how often the world has been saved by quiet, siloed coordination around 6- probably even the people, if they exist, who saved the world via 6 don’t know that they did so. Part of the price of 6 is never knowing. You don’t get to be a lone hero either, many people will have any given idea and they all have to dismiss it, or the defector gets much money and praise. However, many is smaller than infinity- maybe 30 people in the 80s spotted the same brilliant trick with nukes or bioweapons with concerning sequelae, none defected, life continued. We got through a lot of crazy discoveries in the cold war pretty much unscathed, which is a point of ongoing confusion.
I’ve ridden Amtrak from New Orleans to the middle of Mississippi, and there was no point where it looked like a pedestrian could get to the track if they tried- it alternated between abandoned-looking industrial districts and wilderness the whole way. It also mostly went about 35 miles an hour.
The cynical answer is that Sora and Atlas are likely to be profitable, and there is no mechanism left by which OpenAI can choose to not do a profitable thing.
This article provides details for a significant fraction of the deaths. Reading through at random:
I start with a very strong prior that Floridians aren’t actually outliers, and something is wrong with the train.
A large fraction are reported suicides, and of those, a large fraction are verified suicides, i.e. lying on the tracks. This seems hard to prevent engineering-wise; a fix will have to be psychological- for example, tall parking garages aren’t banned for suicide risk, but that one sculpture made of stairs had to be. The complaint that Brightline is responsible for these specifically because it didn’t invest in suicide crisis signs seems farcical, but there is also likely some there there.
A large fraction involve intentionally bypassing lowered crossing gates, including many of the pedestrian deaths. A smaller but still large fraction involved reacting to the gates coming down by stopping on the tracks and sitting there for several minutes until the train came and killed them. This fraction was disproportionately over 75. One case illustrated how this isn’t prima facie insane- an elderly couple was crossing in a car when the gate closed behind them; they stopped and waited several minutes for the freight train to pass, not understanding that they were waiting on the second rail line, where a second train was coming.
A different large fraction of the pedestrian deaths were scary and illustrate classic safety issues that come from intermittent extreme risk- the trains fly silently through neighborhoods at 80 mph, people walk over the tracks pretty casually, and sometimes they get hit. This part lacks sensationalism, but appears to account for the majority of the deaths, and it is the core of the case that something is very wrong here.
A small fraction, but not that small, are classic Florida Man- for example, one victim rode a scooter into the side of the train and died; it appears to have been an accident, not a suicide.
Also lots of drugs, but this may just be the base rate.
There is a something-something lack of powerful in-context learning: currently, millions of AIs are basically one AI because they can’t change rapidly in response to new information, but once they can, they will become a tree of AIs formed by copying the ones with insights.
Furthermore, even if what you care about is the effect of treatment with perfect compliance, the effect of intention to treat is still probably the first statistic you should look at. The most important point of intention to treat is that it is likely to measure the sign of the effect correctly, whereas a naive “as treated” measurement often won’t, due to selection bias. (And, cynically, a non-naive “as treated” measurement will tell whatever story the authors want it to tell.)
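A minimal simulation sketch of that sign flip (toy numbers of my own, not from any real trial): the treatment genuinely helps, but sicker patients are more likely to comply, so the naive as-treated comparison comes out negative while intention to treat keeps the correct (if attenuated) positive sign.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

severity = rng.normal(size=n)            # latent illness severity (hurts the outcome)
assigned = rng.integers(0, 2, size=n)    # randomized assignment to the treatment arm
# selection bias: sicker patients are more likely to actually take the treatment
complied = (assigned == 1) & (rng.random(n) < 1 / (1 + np.exp(-severity)))
outcome = -2.0 * severity + 0.5 * complied + rng.normal(size=n)  # true effect is +0.5

itt = outcome[assigned == 1].mean() - outcome[assigned == 0].mean()
as_treated = outcome[complied].mean() - outcome[~complied].mean()
print(f"intention-to-treat: {itt:+.2f}")      # positive: right sign, attenuated magnitude
print(f"naive as-treated:   {as_treated:+.2f}")  # negative: selection bias flips the sign
```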
Interesting! While PhDing, I did my best work after accumulating a large bench of tractable ideas and a large bench of interesting ideas, and then subconsciously searching all NxM pairs to see if they were bridgeable. (This bridging step started to feel pretty magical as it worked.)
Ah- although, much of the bench of feasible ideas came from reading papers and then replicating them, which is much more efficient at picking actually feasible ideas than having them myself.
There is some inherent difficulty in building an aligned superpowerful AI that won’t take action that the majority of Americans don’t want, when the majority of Americans don’t want a superpowerful AI to be built at all. The disregard for the stated desires of the masses, in favor of what you know is good for them, is fundamental.
—To put it another way, I don’t see how an organization can be introspective and corrigible enough to not build a torment nexus, yet incapable of looking at those polls and saying “oh shit sorry guys, we’ll stop, what should we do instead?”— strike through the second half of this comment as I think it’s an emotional outburst, whereas the first part is just true
Probably a Safari vs Chrome difference! (I’m curious- is your parenthetical actually Cursor-specific, or did you mean learn to use at least one of Cursor / Claude Code / Codex / etc.?)
That was the first thing I tried, but unfortunately extension hello world is a computer-use task, not something amenable to text interfaces- lots of clicking through menus in both Safari and Xcode in exactly the blessed way.
My whack at discerning without looking it up:
Fruits perform bulk chemical reactions in response to trace hormones (or at least one hormone, ethylene, but I recall it’s more than that), which makes me strongly suspect they are doing metabolism, and hence are made of cells, or at least suffused with cells, bone-style.