Thanks! I agree with most of what you’re saying to one extent or another, but relative to the fairly narrow thing I’m trying to do, I still maintain it’s out of scope.
It seems possible that we’re imagining very different typical readers. When I look at rejected LW posts that were co-written with LLMs, or posts on r/LLMPhysics, I see problems like values of totally different units being added together (‘to the current level of meaningness we add the number of seconds since the Big Bang’). While it’s difficult to settle on a fully satisfying notion of validity, I think most people who have done any work in the sciences are likely to agree that something like that is invalid. My main goal here is to provide a first-pass way of helping people identify whether they’re doing something that just doesn’t qualify as science under any reasonable notion of that. The idea of discouraging a future Feynman is horrifying, but my experience has been that with my suggested prompt, LLMs still do their best to give projects the benefit of the doubt.
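The unit mismatch described above is the kind of error a mechanical dimensional-consistency check catches. As a minimal sketch (the `Quantity` class is a hypothetical helper for illustration, not anything from the post or prompt):

```python
# A quantity carries its units, and addition is only defined when the
# units agree -- adding "seconds since the Big Bang" to some unrelated
# quantity fails immediately.

class Quantity:
    def __init__(self, value, units):
        self.value = value
        self.units = units  # e.g. "seconds"

    def __add__(self, other):
        if self.units != other.units:
            raise ValueError(f"cannot add {other.units} to {self.units}")
        return Quantity(self.value + other.value, self.units)

age_of_universe = Quantity(4.35e17, "seconds")
meaningness = Quantity(7.0, "meaningness-units")

try:
    total = meaningness + age_of_universe
except ValueError as e:
    print("rejected:", e)
```

Libraries like `pint` do this properly for real physical dimensions; the point here is only that the check is mechanical, so a first-pass filter can reasonably demand it.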
Similarly, while my step 2 uses a simplified and limited sense of the scientific method, I think it’s really important that people who feel they’ve made a breakthrough should be thinking hard about whether their ideas are able to make falsifiable predictions that existing theories don’t. While there may be some cases around the edges where that’s not exactly true — eg as Charlie Steiner suggests, developing a simpler theory that makes the same predictions — the author ought to have at least given the issue serious consideration, whereas in many of the instances I’ve seen that’s not the case.
I do strongly encourage people to write better posts on this topic and/or better prompts, and I’ll gladly replace this post with a pointer to those when they exist. But currently there’s nothing (that I could find), and researchers are flooded with claimed breakthroughs, and so this is my time-bounded effort to improve on the situation as it stood.
LLMs still do their best to give projects the benefit of the doubt.
There’s a saying that the key to a successful startup is to find an idea that looks stupid but isn’t. A startup succeeds by pursuing a path that other people decline to pursue but that turns out to be valuable.
In many cases it’s probably the same for scientific breakthroughs: the ideas behind them go unpursued because the experts in the field believe, on the surface, that they aren’t promising.
A lot of the posts you find on r/LLMPhysics and among rejected LW posts sound smart on the surface to some laypeople (the person interacting with the LLM) but don’t work. LLMs may give the benefit of the doubt to ideas that sound smart to laypeople on a surface reading, while giving no benefit of the doubt to ideas that sound stupid to everyone on a surface evaluation.
I think it’s really important that people who feel they’ve made a breakthrough should be thinking hard about whether their ideas are able to make falsifiable predictions that existing theories don’t.
You can get a PhD in theoretical physics without developing ideas that let you make falsifiable predictions.
Making falsifiable predictions is one way to create value for other scientists, but it’s not the only one. Larry gives the example of “There are 20 people in this classroom” as a theory that can be novel (nobody in the literature has said anything about the number of people in this classroom) and makes falsifiable predictions (everyone who counts will count 20 people) but is completely worthless.
Your standard has two problems: people to whom the physics community grants PhDs don’t meet it, and plenty of work that does meet it is worthless.
I think the general principle should be: before you contact a researcher with your claimed breakthrough, let an LLM simulate that researcher’s response, and iterate on your work based on the objections the LLM predicts the researcher would raise.
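The simulate-then-iterate loop above can be sketched as follows. Everything here is a hypothetical illustration: `simulate_objections` stands in for whatever LLM call you'd use (no real API is assumed), and the toy stand-ins exist only so the loop runs end to end.

```python
# Iterate on a draft until the simulated researcher raises no objections,
# then (and only then) contact the real researcher.

def refine_before_contact(draft, simulate_objections, revise, max_rounds=5):
    """Revise `draft` against predicted objections, up to `max_rounds` times."""
    for _ in range(max_rounds):
        objections = simulate_objections(draft)
        if not objections:
            return draft  # no predicted objections left
        draft = revise(draft, objections)
    return draft  # stop after max_rounds; the draft is at least improved

# Toy stand-ins so the sketch runs without an LLM:
def toy_critic(draft):
    return [] if "falsifiable prediction" in draft else ["no testable claim"]

def toy_revise(draft, objections):
    return draft + " It makes this falsifiable prediction: ..."

result = refine_before_contact("My theory of everything.", toy_critic, toy_revise)
print(result)
```

The `max_rounds` cap matters: an LLM critic that always finds something to object to would otherwise loop forever.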