I’d definitely recommend running some empirical experiments. I have been in a similar boat where I’d much rather theorize and feel an ugh field around doing the hard work of actually implementing things.
Fortunately, we now have Claude Code, so in theory running these experiments can be a lot easier. However, you still have to pay attention and verify its outputs carefully enough to be very sure there isn’t some bug or misspecification that will trip you up. This goes for other AI outputs as well.
The thing is, people are unlikely to read your work unless you show something clearly surprising or impressive. Like, if you claim to have a solution to an outstanding mathematical problem or a method that is SotA on some benchmark, people are much more likely to pay attention. Even then, people will apply a lot of scrutiny to make sure your claims are true, and you really have to do your homework ruling out every other possibility.
The demand for a high degree of rigor, especially from unproven researchers, is a reality of research. If people didn’t apply strict heuristics for what papers to read, whether via the researcher’s credentials or the strength of their claims, they’d be spending tons of time trying to understand papers that turn out to be pretty insubstantial.
I don’t know enough to properly evaluate the math in your paper—like, I don’t know what the Kantorovich-Rubinstein duality is. (Figuring out whether it makes sense to use it here would probably take a nontrivial amount of effort even for a more mathematically-inclined reader. I think getting ML people to read through a highly math-heavy paper may be especially difficult.)
The lack of citations is concerning to me, since it implies you don’t really know what other people have had to say on this topic, and what your paper contributes beyond that baseline. Using AI is sometimes fine, but you really have to do the cognitive legwork yourself—citing ChatGPT as a coauthor implies you’re using it for a lot more than copyediting or lit review. And reading the text of the Croissant Principle itself, it seems pretty obvious? “Good learning requires generalization, don’t overfit to individual data points.” It does not make me optimistic about the rest of the paper.
I’m hoping this can be a constructive comment rather than just critical—I guess my first advice would be to start by reading a lot of papers that excite you, seeing how they structure their arguments, and getting a sense of what the important questions in your chosen subfield are. Maybe you have been reading a lot of papers already, but I recommend reading more. Then do those empirical experiments—make a falsifiable prediction you’re genuinely unsure about (even after doing a thorough literature review) and go find out whether it’s true!
Ideally you could get mentorship too, which is sort of a chicken-and-egg problem since you generally have to have some legible credentials in order for a mentor to want to spend their time helping you. I think SPAR is pretty good for early-career researchers though.
Ultimately I think much of this boils down to “put in a lot of (sometimes unpleasant) work to get better at research.” This recent thread also summarizes some good research advice. Best of luck!
Yeah I was originally envisioning this as an ML theory paper which is why it’s math-heavy and doesn’t have experiments. Tbh, as far as I understand, my paper is far more useful than most ML theory papers because it actually engages with empirical phenomena people care about and provides reasonable testable explanations.
Ha, I think some rando saying “hey I have plausible explanations for two mysterious regularities in ML via this theoretical framework but I could be wrong” is way more attention-worthy than another “I proved RH in 1 page!” or “I built ASI in my garage!”
Mmm, I know how to do “good” research. I just don’t think it’s a “good” use of my time. I honestly don’t think adding citations and a lit review will help anybody nearly as much as working on other ideas.
PS: Just because someone doesn’t flash their credentials doesn’t mean they don’t have stellar credentials ;)
Rereading your LessWrong summary, it does feel like it’s written in your own voice, which makes me a bit more confident that you do in fact know math. Tbh I didn’t get a good impression from skimming the paper, but it’s possible you actually discovered something real and did in fact use ChatGPT mainly for editing. Apologies if I’m just making unfounded criticisms from the peanut gallery.
Oh yes I do know math lol. Yeah the summary above hits most of the main ideas if you’re not too familiar with pure math.