I think some examples of this in French Wikipedia would be useful.
Yair Halberstadt
Many branches of engineering were based on initial theoretical breakthroughs followed by engineering trial and error.
We understood how rockets worked on a theoretical level long before we tried to build them. It’s reasonable to assume we would understand how intelligence works before we managed to build it.
I think theoretical alignment (as opposed to applied alignment, working with current models) is in a slump right now because it’s hard to see how it slots into the current LLM paradigm.
There was an assumption 5 years ago that AGI requires fundamental insights into intelligence. There was a hope that understanding intelligence better on a theoretical level could also point us towards how to steer such an AGI in the direction we wanted.
That assumption seems to be false. It turns out you can build human-level intelligence without understanding the first thing about intelligence, just by throwing enough compute and data at the problem. Sure, architecture is important, but architectural improvements are driven far more by trial and error, informed by a narrow understanding of how LLMs work, than by a grand unified theory of intelligence.
In such a world it’s difficult to see how abstract alignment work slots in. Instead we develop alignment the same way as we develop intelligence: trial and error driven by a narrow understanding of LLMs and current architectures rather than a grand unified theory of alignment.
In that world, good alignment research asks narrow questions like “can we tell whether an LLM is going off the rails just by monitoring its CoT”, rather than broad questions like “how do we know whether an LLM’s utility function exactly matches humanity’s coherent extrapolated volition”.
Learning to spend money
Thanks, yes I enjoyed that article
I’m ok with context, which summarises the link. Less ok with a bait and switch.
Hi, this seemed interesting, but I’m not looking for LessWrong to become a site with tantalising snippets which require you to finish off on a different site, so I downvoted for that reason.
Thank you for this extremely interesting article!
I’m interested if you have an overview of other modern cancer therapies that are being worked on, and whether you think they’re promising?
They can. But yes, it doesn’t work great. Adjust the metaphor as necessary
Imagine that you forgot everything about your current company completely between days, not just the current task. Every day is like your first day on the job (coming in as an experienced dev from different companies). But you can store 1 million words of notes. The notes aren’t carefully curated: they’re just the last million words you happened to write, and you take notes on everything. Do you still think you’d make decent progress?
I’ve been wanting to write a very similar piece for a while, and you’ve done a far better job than I would have.
I would think of it as: as the raw material becomes a greater percentage of the total cost, decreasing total cost becomes harder, both in relative terms (because it’s impossible to halve total cost if raw material is already the majority of the cost) and in absolute terms (because if any decrease in system complexity causes the raw material to be used less efficiently, that will erode savings from the decreased complexity).
In that sense the idiot index acts like pressure making it harder to squeeze costs when they’re low but with very little impact when they’re high.
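A rough sketch of that pressure, assuming the idiot index is total cost divided by raw material cost and that the raw material cost itself can’t be reduced. The function name is made up for illustration:

```python
def max_saving_fraction(idiot_index: float) -> float:
    """Upper bound on the fraction of total cost that can be removed
    if raw-material cost is fixed and every other cost shrinks to zero.

    Assumes idiot_index = total_cost / raw_material_cost, so it is >= 1.
    """
    if idiot_index < 1:
        raise ValueError("idiot index cannot be below 1 under this definition")
    return 1 - 1 / idiot_index


for index in (100, 10, 2, 1.5):
    saving = max_saving_fraction(index)
    print(f"idiot index {index}: at most {saving:.0%} of total cost can be cut")
```

Below an index of 2 the raw material is already the majority of the cost, so the bound drops under 50% and halving the total is out of reach.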
It doesn’t seem to me that the idiot index can be used to predict prices the way you’re using it:
For example, currently you’re using an estimate of an idiot index of 100 for fusion, and fuel costs at $4K/kg for Deuterium.
Let’s say Deuterium was twice as cheap or twice as expensive. Would your estimate for fusion costs halve or double? If so, why? The cost of Deuterium provides almost no evidence of the cost of the complicated parts of building a fusion reactor.
I agree the idiot index is useful for establishing a minimum threshold for cost, and is a useful indication for which processes have a lot of potential savings with further optimisation and those that don’t. But I don’t see any justification for using it to predict future costs.
Something that feels directionally correct to me:
If given a query at the end of a long context, and the same query along with just a summary of all relevant information to the query in that context, how different are the responses?
In general we don’t want extraneous details in the context to impact the response to a new query
Very little, but realising that implies you have a gears level understanding of what’s actually happening in biology rather than just being able to solve the problems you get asked in exams.
Reminds me of https://calteches.library.caltech.edu/46/2/LatinAmerica.htm
In that case I think it depends what the alternative is. If it’s a sort of parody of psychopathy and self-interest, where we’d push someone onto a railway track if they were in our way and that was quicker than moving them aside, then obviously that would be bad.
If it’s something more myopic, where people still lend some salt to a neighbour who asks without running explicit cost-benefit calculations, and for the most part don’t like stealing or hurting people, but won’t give up their jobs to run a charity, risk their lives to go to war, or become doctors for a less than competitive salary, then I think overall the world probably ends up better off.
What about crusades, Jihads, revolutions, genocides, political oppression, wars of nationalist conquest, etc.?
No point in particular, just trying to describe what physically happens when you burn calories.
Also it depends what exercise you do: I went for a 3 hour road cycle today, which I expect burnt ~1500 calories. If I were attempting to lose weight, did that twice a week (easily doable for me), and kept food intake constant, I would lose about a kilo a month that way.
In general with exercise if you want to burn calories it needs to be something you can do continuously for a significant amount of time at a medium-high intensity. Cycling is a great option, as is through-hiking. Doing 30 minutes at the gym just isn’t going to cut it, even if intensity is higher, because it’s just too short to matter—even if you basically kill yourself you’ll only burn 500 calories.
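For anyone who wants to plug in their own numbers, here is a back-of-the-envelope sketch. The 7700 kcal per kg of body fat is a common rule of thumb that I’m assuming, not something measured, and the function name is just for illustration:

```python
KCAL_PER_KG_BODY_FAT = 7700   # assumption: common rule of thumb for body fat
WEEKS_PER_MONTH = 52 / 12     # average number of weeks in a month


def monthly_fat_loss_kg(kcal_per_session: float, sessions_per_week: float) -> float:
    """Estimated fat loss per month, assuming food intake stays constant
    and every extra calorie burnt comes out of body fat."""
    weekly_kcal = kcal_per_session * sessions_per_week
    return weekly_kcal * WEEKS_PER_MONTH / KCAL_PER_KG_BODY_FAT


long_rides = monthly_fat_loss_kg(1500, 2)   # two 3 hour rides a week
gym_visits = monthly_fat_loss_kg(500, 2)    # two all-out 30 minute sessions a week
print(f"rides: {long_rides:.2f} kg/month, gym: {gym_visits:.2f} kg/month")
```

Under these assumptions the two rides come out somewhat above a kilo a month, the same ballpark as my estimate, and the two gym sessions give exactly a third of that.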
We all know that burning calories causes you to lose weight, but how does that work mechanically?
Fats are long hydrocarbons, consisting pretty much of hydrogen and carbon in an approximately 2 to 1 ratio. There’s some other stuff, but that doesn’t matter.
When you burn them for energy, you react the hydrogen and carbon with oxygen from the atmosphere, producing H2O and CO2. The CO2 is expelled back into the atmosphere. Some of the H2O is also breathed out as vapour, but some hangs around till you pee it out.
Focusing on the H2O that doesn’t vapourise, from our body’s perspective we’ve replaced one carbon per two hydrogens with one oxygen per two hydrogens. Since the molecular mass of H2O is 18 and that of CH2 is 14, this counteracts some of the effect of the breathed-out H2O and CO2 until we go to the toilet and pee it out.
So why does our weight usually decrease significantly directly after exercise, even without urinating? Because we sweat a lot during exercise and that usually more than compensates for the water we’ve burnt—for context every 1000 calories we burn produces just about 100g of water, and anyone who exercises hard enough to burn 1000 calories will sweat out around a litre or more.
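The mass bookkeeping can be sketched in a few lines. This is an idealised model, not real physiology: it treats fat as a pure chain of CH2 units, assumes the textbook figure of 9 kcal per gram of fat, and assumes all the calories come from fat:

```python
# Idealised combustion of one CH2 unit:  CH2 + 1.5 O2 -> CO2 + H2O
M_C, M_H, M_O = 12.011, 1.008, 15.999   # standard atomic masses, g/mol
M_CH2 = M_C + 2 * M_H
M_H2O = 2 * M_H + M_O
M_CO2 = M_C + 2 * M_O
KCAL_PER_G_FAT = 9.0                     # assumption: textbook energy density of fat


def combustion_masses(kcal: float) -> dict:
    """Grams of fat and oxygen consumed, and water and CO2 produced,
    when `kcal` of energy comes entirely from the idealised CH2 fat."""
    fat_g = kcal / KCAL_PER_G_FAT
    mol_ch2 = fat_g / M_CH2
    return {
        "fat_g": fat_g,
        "oxygen_g": 1.5 * mol_ch2 * 2 * M_O,   # 1.5 O2 molecules per CH2
        "water_g": mol_ch2 * M_H2O,            # one H2O per CH2
        "co2_g": mol_ch2 * M_CO2,              # one CO2 per CH2
    }


for name, grams in combustion_masses(1000).items():
    print(f"{name}: {grams:.0f}")
```

With these assumptions 1000 calories gives roughly 140g of water, the same order of magnitude as the figure above, and the water produced weighs more than the fat that was burnt, which is the 18 versus 14 effect.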
It would also be great if countries stopped going to war, governments followed sound economic policy, and we started a Manhattan project to solve aging, but none of those seem very likely.
How concretely do you plan to persuade any of the frontier labs to shut down?