In your mind, what are the biggest bottlenecks/issues in “making fast, philosophically competent alignment researchers”?
[low effort list] Bottlenecks/issues/problems
- philosophy has worse short feedback loops than e.g. ML engineering → in all sorts of processes like MATS or PIBBSS admissions it is harder to select for philosophical competence, and also harder to self-improve
- incentives: obviously, stuff like being an actual expert in pretraining can get you a lot of money and respect in some circles; even many prosaic AI safety / dual-use skills like mech interpretability can get you maybe less money than pretraining, but still a lot of money if you work at AGI companies, and also a decent amount of status in the ML community and the AI safety community; improving philosophical competence may get you some recognition, but only among a relatively small and weird group of people
- the issue Wei Dai is commenting on in the original post: founder effects persist to this day, and there is also some philosophy-negative prior in STEM. Idk, lack of curiosity? LLMs have read it all, so it’s easy to check whether there is some existing thinking on a topic
Do you have your own off-the-cuff guesses about how you’d tackle the short-feedback-loops problem?
Also, is it more like we don’t know how to do short feedback loops, or more like we don’t even know how to do long/expensive loops?
There’s a deeper problem: how do we know there is a feedback loop?
I’ve never actually seen a worked-out proof of, well, any complex claim on this site using standard logical notation… (beyond pure math and trivial tautologies).
At most there’s a feedback loop on each other’s hand-wavy arguments that are claimed to be proofs of this or that. But nobody ever actually delivers the goods, so to speak, such that they can be verified.
(Putting the previous Wei Dai answer to “What are the open problems in Human Rationality?” here for easy reference, since it seemed like it might contain relevant stuff.)
AI doing philosophy = AI generating hands, plus the fact that philosophy is heavily corrupted by postmodernism, to the point where two authors wrote books dedicated to criticizing postmodernism PRECISELY because their parodies got published.
I think I meant a more practical / next-steps-generating answer.
I don’t think “academia is corrupted” is a bottleneck for a rationalist Get Gud At Philosophy project. We can just route around academia.
The sorts of things I was imagining might be things like “figure out how to teach a particular skill” (or “identify particular skills that need teaching”, or “figure out how to test whether someone has a particular skill”), or “solve some particular unsolved conceptual problem(s) that you expect to unlock much easier progress.”