Rank: #10 out of 4859 in peer accuracy on Metaculus for the period 2016–2020.
ChristianKl
Consider the AI doomer position. They believe that AI models are fundamentally constrained by intelligence, and recursive self-improvement will enable AI models to infinitely improve themselves until they attain godlike levels of intelligence (and thus capability).
This is a misframing. AI does not need to improve itself infinitely to godlike levels to be capable of outmaneuvering human beings and causing human extinction. That's one scenario, but people like Eliezer don't think it's the only scenario.
So far, it seems like the majority of recent (~late 2024) AI gains came from inference-scaling—the amount of compute used every time a model answers a question—as opposed to training, regardless of whether that training is pre-training or post-training.
This thesis is basically: "The fact that we now have agents that can code well has little to do with the training data that comes out of AI deployment." I'm not sure why it seems that way to you.
AlphaGo did improve by playing against itself. If you have a coding agent that gets feedback about the results of its coding work, that gives you training data on which you can train, so there's a recursive element to it. The more agents get used in environments where you get feedback about the quality of the output, the more they can be recursively trained on those domains.
Wouldn't most other alternative routes to AGI also need GPUs? I would expect that having less GPU datacenter capacity available would also make it harder to pursue neuromorphic AI by putting a large amount of compute into scaling it up.
This article sets out to argue the thesis "data-driven self-improvement can't be done" and then only argues why one particular approach to data-driven self-improvement is bad.
From November 2024 to the end of November 2025 I had a membership in a dance school and was going regularly. In December 2025 I wasn't going anymore and was moving less. Through the whole of December my resilience values as measured by Oura were dropping, and at Christmas I got very ill.
If I had reacted sooner to the data, it might have been quite good for my health. There wouldn't have been any need to run a t-test; just looking at the graph and taking it seriously would have been enough.
There was a time when my lowest heart rate at night was normally in the 50-to-60 range. I don't think I had done any large hikes in the decade before, and then I went to a friend's birthday where we went hiking. The next night my lowest heart rate was 45 (numbers from memory); that's a clear signal that something interesting was happening. No need to run a t-test.
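To make this concrete, here's a minimal sketch of what "just looking at the graph" can look like in practice. It assumes an Oura export as a CSV with columns date and lowest_hr; both column names and the 28-day baseline window are my own assumptions, not anything Oura prescribes.

```python
# Minimal sketch: eyeball nightly lowest heart rate against a rolling
# baseline instead of running a t-test. CSV columns are hypothetical.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("oura_export.csv", parse_dates=["date"]).sort_values("date")
baseline = df["lowest_hr"].rolling(window=28, min_periods=7).mean()

plt.plot(df["date"], df["lowest_hr"], label="lowest nightly HR")
plt.plot(df["date"], baseline, label="28-day rolling baseline")
plt.xlabel("date")
plt.ylabel("bpm")
plt.legend()
plt.show()
```

A sustained gap between the nightly values and the baseline is the kind of thing you can take seriously straight off the graph.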
When it comes to lifting weights, I think the leading advice is to track the data for your weightlifting. The point isn't to run t-tests about the effects of creatine but to see whether you are improving at the lifting exercises you are doing, and to change exercises if you aren't. That's data-driven self-improvement.
For practical purposes, a curve like this is already quite close to a normal distribution. Many things in our lives follow an (approximately) normal distribution: the height of men
There are a lot more dwarfs than the normal distribution would predict. There are some genetic mutations that have a relatively small effect on height and some that have a really big effect. If the people in your sample population all have a bunch of genetic mutations that individually have a small effect, you get something that looks like the normal distribution. If, however, you also have people with the dwarfism mutations that have big effects, the distribution no longer looks like the normal distribution.
I think if you are doing data-driven self-improvement, you do care about the outliers in the data that are driven by the equivalent of the genetic mutation for dwarfism. If you just see them as noise, I don't think that's helpful.
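A minimal simulation of the mixture point, with effect sizes invented purely for illustration: many small additive effects give an approximately normal distribution, and a rare large-effect mutation produces far more extreme outliers than the normal curve predicts.

```python
# Many small additive genetic effects -> approximately normal heights (CLT).
# A rare large-effect mutation -> far more extreme outliers than a normal
# distribution predicts. All numbers here are made up for illustration.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n = 100_000

# Height as a sum of 200 small-effect variants plus environmental noise.
small_effects = rng.choice([-0.05, 0.05], size=(n, 200)).sum(axis=1)
height = 175 + small_effects + rng.normal(0, 2, size=n)

# Give ~0.05% of the population a single large-effect mutation.
dwarfism = rng.random(n) < 0.0005
height[dwarfism] -= 40

z = (height - height.mean()) / height.std()
print("empirical share below -4 sigma:", (z < -4).mean())
print("normal-model prediction:       ", norm.cdf(-4))
```

The exact ratio depends on the invented effect sizes, but the heavy tail created by the big-effect mutation is robust: the outliers are real structure, not noise.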
If the guy asks to “help find citations” and there are no actual good ones
When it comes to getting LLMs to help with citations, I usually put the LLM in Deep Research mode and ask: "Can you do background research about the claims in this paragraph?" This manages to reveal when claims in the paragraph are out of touch with the actual research in the field, and it provides good citations that you wouldn't get if you asked directly for citations.
If you have what feels like an original idea and there's a scientific field touching it while you are a layperson, running these kinds of Deep Research queries is probably a good idea, even if you write all the actual words yourself. There are plenty of blog posts where a person has their own idea that actual research has already shown to be flawed, or where the actual research has already found great terminology for it.
The word 'reality' has a clear meaning in ontological realism. If you lack that background, it feels vague.
This is similar to saying that when someone speaks about something being statistically significant, they are being vague because 'significant' is a vague word. You actually need to understand something about statistics for the term not to feel vague.
It's also the kind of action that's within the Overton window and that, if passed, moves the window.
Ukraine speaking about retaking Crimea wasn't discouraging Russia from invading; if anything, it did the opposite. A world where they had implemented Minsk II and accepted Crimea as lost might be a world without the current war.
If a nation wants to maintain robust internal supply chains, it can simply subsidize or pay for that capacity directly in various ways that are much more (economically) efficient than charging tariffs.
Why are subsidies more efficient than tariffs?
However, I suspect that the pro-immigration side is not fundamentally motivated by immigration’s purported economic benefits, which are better understood as fig leaves on a deeper-rooted globalist ideology.
Isn't globalist ideology a lot about doing what makes econ-brained sense? There's a reason why the chief globalist event in Davos is called the World Economic Forum. What do you think globalist ideology is, if it differs from being econ-brained?
I think the general issue is that people in this community and the AI alignment community have thought quite seriously about epistemology but not about ontology.
There's nothing vague about the sentence. It's precise enough that it's an ISO/IEC standard. It is, however, abstract. If you have a discussion about Bayesian epistemology, you are also going to encounter many abstract terms.
BFO grew out of the practical needs that bioinformaticians had around 2000. The biologists didn't think seriously about ontology, so someone needed to think seriously about it to enable big-data applications where unclear ontology would produce problems. Since then, BFO has been adopted much more broadly and made into the international standard ISO/IEC 21838-2:2021.
This happens in a field that calls itself applied ontology. Books like Building Ontologies with Basic Formal Ontology by Robert Arp, Barry Smith, and Andrew D. Spear explain the topic in more detail. Engaging with a serious conceptual framework is work, but if you buy the core claim of 'I think that people overrate bayesian reasoning and underrate "figure out the right ontology"', you shouldn't just try to develop your ontology based on your own naive assumptions about ontology; you should familiarize yourself with applied ontology. For AI alignment, that's probably valuable on the conceptual layer of the ontology of AI alignment, but it might also be valuable for thinking about the ontological status of values and how AI is likely going to engage with that.
After architecting BFO and first working in bioinformatics, Barry Smith went to the US military to do ontology for their big-data applications. You can't be completely certain what the military does internally, but I think there's a good chance that most of the ontology Palantir uses for the military's big data is BFO-based. When Claude acts within Palantir to engage in acts of war in Iran, a complete story about how that activity is "aligned" includes BFO.
There are complex arguments that can be made, but the simple one is that GiveWell itself says those numbers aren't supposed to be taken literally.
I think it's quite obvious that malaria bed nets don't eradicate malaria. You can get malaria by being bitten by a mosquito outdoors, and no amount of bed nets is going to prevent that. That doesn't mean that giving a bed net to someone who doesn't have one isn't going to have a huge return.
If you actually wanted to eradicate malaria, you would need to invest in gene drives. On the other hand, it's hard to measure the impact of the marginal dollar that goes towards gene drives.
The fact that the highest-return investment might not be easily measurable poses a challenge to EA. The public story about the GiveWell / Open Phil split was that GiveWell is there for the people who want clear evidence of impact, while Open Phil is left to fund higher-impact, more speculative investments.
While this is an important issue, I think EA does a reasonable job of thinking about it. The more important issue that Ben highlights is that on the one hand GiveWell's public position is "We don't view our cost-effectiveness estimates as literally true," but people in EA frequently talk about the cost-effectiveness estimates as if they were literally true. Ben makes some arguments based on his time at GiveWell about why their process is not built to produce literally true cost-effectiveness estimates, but you can just take GiveWell at its word here. Once you accept that these numbers aren't literally true, the "clear evidence" argument gets weaker as well. If those numbers aren't literally true, why not just fund the causes that are more speculative but where you are hoping for bigger returns?
I could also imagine a sloppier person intentionally raising their standards, but that seems a lot harder, or else it’s just something people around me have been less likely to talk about.
If you want to convince someone to lower their standards on an intellectual level, you just need to convince them that there's no rational reason for their standards.
On the other hand, if you want someone to raise their standards, you actually have to give them reasons why a higher standard is important. If someone has allergic symptoms, explaining how better cleaning could alleviate their symptoms, or the symptoms of a housemate they care about, is an argument that might convince them that it makes sense to raise their standards.
Jordan Peterson also seems to manage to convince large numbers of men that they should clean their rooms, and if you buy into his reasoning, that probably comes with raising your standards.
When it comes to good ontology, more people should understand what Basic Formal Ontology is. When it comes to AI alignment, it might be productive if someone wrote out a BFO-compatible ontology of it.
If you take the JFK assassination, most people don't know that the last official government investigation came to the conclusion that there was probably a conspiracy to kill JFK (and that it didn't know who exactly was involved). Yet you have the media telling you that this is a conspiracy theory that nobody should take seriously. There's a reason why the information environment is structured in a way that keeps most people from knowing about the result of the last official government investigation.
You had the government arguing during COVID that releasing the JFK files would be so damaging to national security that it had to violate the law to postpone the release. This suggests either that they were simply lying or that there's something significant hidden. Reasoning about what the hidden thing is isn't trivial. And the media, of course, just accepted that there's something in the files so dangerous to national security that releasing it warrants breaking the law.
So, if a given conspiracy theory is true, then it does take place in an antagonistic epistemic environment—the conspirators are usually trying to misinform.
It's more complex than that. At the end of WWII, the US government got information via the Venona project that the Soviets had many spies in US government departments. The US State Department didn't really like its employees being persecuted as possible Russian spies, and that likely included department leadership that wasn't completely made up of spies. This dynamic grew into the McCarthy hearings.
There were clearly Russian spies back then, but McCarthy probably pointed at plenty of people who were innocent, and it created a lot of social conflict. This is when the term 'conspiracy theory' first started its rise in usage: it was used to say that calling people Communists who conspire to bring down the United States is a conspiracy theory.
The result was to stop the McCarthy persecutions because of the social turmoil and distrust they caused, and to let the Russian spies get on with their business. This wasn't because the Communists were so politically powerful but because an information environment full of conspiracy theories about people being loyal to Moscow destroys trust.
Shielding conspiracies run by others is classic moral-maze behavior, even for people who aren't directly co-conspirators.
By that same token, this entire forum should understand my position rather than me its.
Why would anyone care about your position? You seem to care about the positions of people in this forum, given that you are here. If you don't care, go somewhere else; write your own blog.
The point of a forum is to facilitate a shared discourse. If you want to join that discourse, the forum is there. If you want to start your own discourse, you are free to set up your own forum or blog.
Your philosophy is not that complex.
It takes less work to familiarize yourself with the philosophical positions of this forum than it takes to develop the physics knowledge necessary to engage in academic physics.
The fact that it takes less work is not an argument that the work doesn't need to be done.
To approach a question of meaning or politics scientifically in the way you describe is to assume that you know the answers from the start. What if your methodology is inherently flawed, and particularly in such a way as to be blind to the very ways in which it is flawed?
If a method is inherently flawed, understanding the method and the reasoning for its use is important for making a good argument that it's flawed. Take physics: there are plenty of people who don't understand special relativity and want to argue that it's flawed. Engaging with those people is not useful for physicists. To the extent that there are flaws in physical theories, it takes a lot of understanding of existing physics to make an argument that's actually useful for moving the field forward.
In philosophy actually understanding the position of the people you want to convince matters as well.
When it comes to the salaries of knowledge workers like software engineers, a lot comes down to the decisions of managers who not only care about what's good for the company but have their own desires as well. A manager prefers employees who are in the office and as near as possible, so that they feel they have power over the employees. This goes for middle management as well.
Even in companies like Google that do have offices in India, the internal company politics don't play out in a way that results in drastically increasing their headcount in India.
The $2.43 Billion Question: Podcast Advertising in 2024: 10% of podcasts are ads (after which conversion rates drop); 30% of people left because of ads; a crazy amount of listeners. I don’t know why hosts (and the people paying them) aren’t a bit more subtle, and instead name drop the product casually throughout the episode. Of course, this may throw off the flow and be too forced.
I think attribution for advertising success happens via referral codes. If you just speak subtly about the product, the listener might not actually use your code when buying it.
It also takes more cognitive load to think about the ad while doing a show than to record an ad once and let the person cutting the video splice it in.
I don't think the belief that godlike intelligence is necessary for human extinction via AI is a popular AI-doomer position among people who are intellectually sophisticated. It's more that those people hold complex positions, and it's easy for skeptics to frame this as "a popular position".
There's an argument that some of the risk comes from "godlike intelligence", but believing in high risk doesn't require that. An agent as smart as the smartest human that can act faster, can copy itself after learning skills, and can potentially coordinate better among billions of its copies might be enough to overpower humanity.
You can't conclude from the fact that inference scaling happened that most AI improvements are due to inference scaling.
I'm saying that this is already happening. It's not as straightforward as with AlphaGo, since it's easier to judge whether a move helps with winning in the constrained environment of Go, but when it comes to coding you have quality measurements such as whether or not the agent managed to write code that makes the unit tests pass.
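As a minimal sketch of that feedback signal, with a toy task and an invented data format rather than anything a lab actually runs: execute a candidate solution against unit tests in a subprocess and record pass/fail as a reward for later training.

```python
# Toy sketch: run an agent-produced solution against unit tests and record
# pass/fail as a reward. Task, tests, and data format are invented for
# illustration; a real pipeline would also sandbox and resource-limit this.
import subprocess
import sys
import tempfile

def reward_for(candidate_code: str, test_code: str) -> float:
    """1.0 if the candidate makes the tests pass, else 0.0."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(candidate_code + "\n\n" + test_code + "\n")
        path = f.name
    result = subprocess.run([sys.executable, path],
                            capture_output=True, timeout=10)
    return 1.0 if result.returncode == 0 else 0.0

# A toy "agent completion" plus asserts standing in for unit tests.
candidate = "def add(a, b):\n    return a + b"
tests = "assert add(2, 2) == 4\nassert add(-1, 1) == 0"

training_example = {"prompt": "write add(a, b)",
                    "completion": candidate,
                    "reward": reward_for(candidate, tests)}
print(training_example)  # reward == 1.0 -> usable as a positive signal
```

Scale that over many tasks and you get exactly the kind of outcome-labeled data a recursive training loop needs.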
There's a lot of training on 'synthetic data' and data from user interactions, and if you have better agents, that leads to higher data quality for both.
When it comes to inference, it's also worth noting that they found a lot of tricks to make inference cheaper. It's not just more/better hardware: