Yeah, if you are doing e.g. a lab-heavy premed chemistry degree, my advice to an aspiring alignment researcher may not apply. This is absolutely me moving the goalposts, but it may also be true: on the other hand, if you are picking courses with purpose, in philosophy, physics, math, probability, comp sci, there are decent odds imho that they are good uses of time in proportion to the extent that they are actually demanding your time.
For undergrad students in particular, the current university system coddles. The upshot is that if someone is paying for your school and would not otherwise be paying tens of thousands of dollars a year to fund an AI safety researcher, successfully graduating is sufficiently easy that it's something you should probably do while you tackle the real problems, in the same vein as continuing to brush your teeth and file taxes. Plus you get access to university compute and maybe even advice from professors.
This is fascinating: I've been a fan of Joseph's youtube channel for years, but I've never seen him comment on lesswrong. A while ago, in that setting, we got into a back and forth about eigenvalues of anti-linear operators, which was fascinating at the object level, but which also ended up requiring both Joseph and me to notice that we were wrong, which we did with little difficulty. What I'm trying to say is that Joseph is actually smart and open-minded on technical questions, but is also definitely not respecting community norms here. If we can successfully avoid scaring him off while discouraging quite this level of vitriol, there is definitely potential for him to contribute.
The evidence I am about to ask for may exist! However, I am still comfortable asking for it, because without it the whole argument falls apart, and I think this class of argument always needs to show this explicitly: can you show that literally anyone reads the "modern Centennial Edition of Etiquette released in 2022"?
Also, I agree that Israel made great use of drones to take out anti-air defenses. However, this use of drones in no way requires manufacturing millions of quadcopters.
A couple notes: Israel and Russia are extremely comparable in military spending, as are Ukraine and Iran. In addition, Ukraine and Iran both went hard into drones to counter the disparity. It's very noticeable that drones basically work against Russia and basically don't against Israel, but neither conflict provides dramatically more evidence than the other about a war where both sides are well-resourced and nuclear. In particular, the lack of drone-based attrition of the Israeli air force is glaring.
The lesson I drew from Israel vs Iran is that stealth just hard-counters drones in a peer conflict. The essential insight is that guided bombs aren't just similar to small suicide drones, they are drones, with all the advantages, as long as you can get a platform in place to drop them.
Yep, sarcastic. Sorry, someday I’ll learn not to do that on the Internet, but I’m not holding my breath.
It sounds like they don't filter the canary string out of general web text, even though doing so would be cheap and would only remove a really tiny proportion of the training data. Maybe that tiny proportion yields a disproportionate boost in benchmark performance. Hmm, wonder why that could be.
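(For scale, the filter in question is roughly the following sketch; CANARY_MARKER is a placeholder for the published canary GUID string, and the document iterator is an assumed stand-in for whatever already feeds the tokenizer.)

```python
# Minimal sketch of filtering canary-marked documents out of a pretraining
# corpus. CANARY_MARKER is a placeholder, not the real published GUID.
CANARY_MARKER = "CANARY_GUID_PLACEHOLDER"

def drop_canary_docs(documents):
    """Yield only documents that do not contain the canary marker."""
    for doc in documents:
        if CANARY_MARKER not in doc:
            yield doc

# Usage: wrap whatever iterator already feeds the tokenizer, e.g.
# clean_docs = drop_canary_docs(raw_web_documents)
```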
The youtube algorithm is powerfully optimizing for something, and I don't trust it at all with my child. However, in a fit of hubris, for a minute I thought that I could outsmart it and get what I wanted (time to clean the kitchen) without it getting what it wanted (I make no strong claims about what the youtube algorithm wants, but it tries very hard to get it, and I don't want it to get it from my three-year-old).
I searched for episodes of PBS’s Reading Rainbow, but let the algorithm freely choose the order of returned results, and then vetted that the first result was a genuine episode. I also put it in “Kids” mode, in the hopes that it would be kinder to a child than an adult.
This was way too much freedom. It immediately pulled up the episode of Reading Rainbow about the 9/11 terrorist attacks (this topic is not at all indicated by the title or thumbnail).
I think this is largely right point by point, except that I'd flag that if you rarely find yourself using eigendecomposition (mostly at the whiteboard, less so in code), you are possibly bottlenecked by a poor grasp of eigenvectors and eigenvalues.
Also, a fancy linear algebra education will tell you exactly how the matrix log and matrix exponential work, but all you need is that 99% of the time, any number manipulation you can do with regular logs and exponentials will work completely unmodified with square matrices and matrix logs and exponentials. If you don't know about matrix logs at all, this will be a glaring hole: I use these constantly in actual code. (Actually, 99% is definitely sampling bias. For example, given matrices A and B, log(AB) only equals log(A) + log(B) if A and B commute (share eigenvectors), and getting the two sides numerically equal may require being tricky about which branch of the log to pick. My pleading may fall on deaf ears, but: you'd only think to try the identity when the matrices commute and a later operation kills the branch differences, so in practice, when you try it, it works.)
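For concreteness, a minimal sketch using scipy's expm and logm (the library choice is purely illustrative): the scalar-style identity holds for a commuting pair and fails for a generic pair.

```python
# A sketch (not taken from any particular codebase) of the claim above:
# scipy's logm/expm behave like scalar log/exp when the matrices commute.
import numpy as np
from scipy.linalg import expm, logm

rng = np.random.default_rng(0)

M = rng.standard_normal((4, 4))
A = expm(0.3 * M)   # A and B are both functions of the same M,
B = expm(-0.5 * M)  # so they commute and share eigenvectors

# Scalar-style identity holds for the commuting pair (up to branch choices):
print(np.allclose(logm(A @ B), logm(A) + logm(B)))   # True

# For a generic matrix that does not commute with A, it fails:
C = expm(0.3 * rng.standard_normal((4, 4)))
print(np.allclose(logm(A @ C), logm(A) + logm(C)))   # generally False
```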
The reprogenetics case is almost a classic case of epistemic learned helplessness (https://slatestarcodex.com/2019/06/03/repost-epistemic-learned-helplessness/ for new readers), but with the twist that novel arguments about reproductive genetics have a long history of persuading experts in ethics to endorse regrettable actions, suggesting that it's not total madness to apply epistemic learned helplessness here even if you consider yourself an expert.
I exaggerate: imagine Charlie Brown, presented with a formal proof in Lean that Lucy won't pull away the football. He has checked that the proof compiles on his computer; he has run it by Terence Tao in person, who vouched for it. He has run the connection between the Lean proof and the real world past a team of lawyers, physicists, and operating system engineers, who all see no holes. It is still rational for him not to try to kick the football.
Interesting. A nitpick: once LLM text enters the pretraining dataset through diverse causal graphs at least as complex as (prompt → LLM generation → post processing), the models are heavily incentivized during pretraining to model LLM internals. Pretraining on this kind of text introduces lots of tasks like "which model made this text?", "what was the prompt?", "was the model degraded by too much context?", "is this the output of a LoRA or a full finetune?", etc. (this assumes that an oracle answering these questions would let you predict the next token more accurately, which seems exceedingly likely). I expect this effect to be much more robust when induced by webtext containing LLM content, with its extreme diversity of causal graphs, than by e.g. the gpt-oss training dataset, with its single or small number of underlying production mechanisms.
My whack at discerning without looking it up:
Fruits perform bulk chemical reactions in response to trace hormones (or at least one hormone, ethylene, though I recall it's more than that), which makes me strongly suspect they are doing metabolism, and hence contain cells, or are at least suffused with cells, bone-style.
What this sounds like to me is a system where all of the parts are designed cleverly under the assumption that their costs are going to be amortized, but because the current reality doesn't fit the original assumptions, you're hitting cache misses and paying full cost on every single operation.
In the discussion of the Buck post and elsewhere, I've seen the idea floated that if no one can tell that a post is LLM generated, then it is necessarily ok that it is LLM generated. I don't think that this necessarily follows, nor does its opposite. Unfortunately I don't have the horsepower right now to explain why with simple logical reasoning, and will have to resort to the cudgel of a dramatic thought experiment.
Consider two lesswrong posts: a 2000 digit number that is easily verifiable as a Collatz counterexample, and a collection of first person narratives of how human rights abuses happened, gathered by interviewing Vietnam War vets at nursing homes. The value of one post doesn't collapse if it turns out to be LLM output; the value of the other collapses utterly, and this is unconnected to whether you can tell that they are LLM output.
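(For concreteness, "easily verifiable" here means something like the sketch below, under the assumption that the claimed counterexample enters a cycle rather than diverging.)

```python
# Sketch of verifying a claimed Collatz counterexample, assuming it is a
# cycle counterexample (a divergent trajectory could not be checked finitely).
def is_cycle_counterexample(n: int, max_steps: int = 10**7) -> bool:
    seen = set()
    x = n
    for _ in range(max_steps):
        if x == 1:
            return False   # trajectory reaches 1: not a counterexample
        if x in seen:
            return True    # trajectory repeats without hitting 1: counterexample
        seen.add(x)
        x = 3 * x + 1 if x % 2 else x // 2
    raise RuntimeError("undecided within max_steps")
```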
The Buck post is of course not at either end of this spectrum, but it contains many first person attestations: a large number of relatively innocent "I think"s, but also lines like "When I was a teenager, I spent a bunch of time unsupervised online, and it was basically great for me." and "A lot of people I know seem to be much more optimistic than me. Their basic argument is that this kind of insular enclave is not what people would choose under reflective equilibrium." that are much closer to the Vietnam vet end of the spectrum.
EDIT: Buck actually posted the original draft of the post, written before any LLM input, and the two first person accounts I highlighted are present verbatim, and thus honest. Reading the draft, it becomes quite a thorny question to adjudicate whether the final post qualifies as "generated" by Opus, but that will start getting into definitions.
Thanks for the reply! Sorry that my original comment was a little too bitter.
There has been high quality research finding ways that some models are biased against white people, and high quality research finding ways that models are biased against people who aren't white. Generally, the pattern is that base models and early post-trained models like GPT-3.5 are traditionally racist, while later post-trained models are often woke, sometimes in spectacular "only pictures of black nazis" ways. I've personally validated that some of these results replicate, from how davinci-002 would always pick the white-sounding resume, to how Claude 4.5 would, if prodded, save 1 Muslim over 10 Christians.
Lesswrong is very white and very human, and so it's not that surprising, but a little sad, that it has pivoted hard from sarcastically dismissive to very interested in model bias as the second dynamic emerged.
cat /usr/share/dict/words | xargs -I{} cowsay -r "{}ism is a religion I believe in, and want you to know about to save your soul! {}ism believes that you must send bitcoin to dQw4w9WgXcQ to get into heaven. If instead you are damned, I will weep for your soul. I also believe that God's name is {} but this second belief is not required for heaven entry."

Side note: people nowadays think LLMs are a hammer, and everything is a nail. The old tools still work, and often better and cheaper! For example, resume-driven developers will suggest you spend hundreds or thousands of dollars on expensive hardware or API credits to automatically synthesize moral patients, when every Linux distro has had cowsay for this task for 20 years!
I wonder if we need someone to distill and ossify postmodernism into a form that rationalists can process, if we are going to tackle the problems postmodernism is meant to solve. A blueprint would be the way that FDT plus the prisoner's dilemma ossifies Sartre's "Existentialism Is a Humanism", at some terrible cost to nuance and beauty, but the core is there.
My suspicion of what happened, at a really high level, is that fundamentally one of the driving challenges of postmodernism is to actually understand rape, in the sense that rationalism is supposed to respect: being able to predict outcomes, making the map fit the territory, etc. EY is sufficiently naive of postmodernism that the depictions of rape and rape threats in Three Worlds Collide and HPMOR basically filtered out anyone with a basic grasp of postmodernism from the community. There's an analogous phenomenon where, when postmodernist writers depict quantum physics, they do a bad enough job that it puts off people with a basic grasp of physics from participating in postmodernism. It's epistemically nasty too: this comment is frankly low quality, but if I understood postmodernism well enough to be confident in this comment, I suspect I would have been sufficiently put off by the Draco-threatens-to-rape-Luna subplot in HPMOR to have never actually engaged with rationalism.