It’s possible to frame this content in a relevant way: to study forecasting of algorithmic improvements (where most of the technical details of the improvements themselves aren’t relevant). Similarly, if it were already published elsewhere (at a given level of accessibility) and well-known to ~saturation, it would’ve been neutral to discuss it.
I think LW shouldn’t be taking steps to advance (open) capability research for its own sake, however trivially. A post being actually good then makes it proportionally worse.
It’s possible to frame this content in a relevant way
I appreciate that this post is clearly (rather than covertly) capabilities. Too many posts pretend to be alignment when they aren’t. I wouldn’t want OP to dress it up in lies in order to fit it in.
OP, I’m curious about your views on alignment.
My current view is that alignment of advanced future AI systems will need to be approached from a large number of angles simultaneously: how public perception of AI is managed, how regulatory bodies set incentives for research, how investors direct funds, and how researchers build thoughtful systems and anticipate changes to model behavior. I believe I can best contribute by gaining a deep technical understanding of AI systems, such that I can better anticipate how changes to data/architecture/compute will impact behavior. Right now I find that exploring capabilities gives the strongest feedback signal for building this intuition, because the system immediately tells you when your next idea sucks, or when your intuition is off.
I appreciate your willingness to explain your view. Replying with how mine responds to that viewpoint, as someone who was doing something quite similar about 10 years ago and came by my capabilities knowledge that way:
The fact that capabilities gives a good feedback signal and alignment does not seems to me to be much of why we’re finding it difficult to solve alignment. If we knew of a metric whose line-going-up would solve alignment, we could just make that line go up, and alignment would be solved! compare to a hypothetical research field starting three thousand years ago, “machine motion”. machine motion studies making machines cause motion. machine motion papers can push forward “make things go fast”, and eventually, a few decades ago, someone figured out how to make machines cause a lot of motion all at once and tried it out in the New Mexico desert. but the sister field, machine aim, has less progress. aiming a machine requires making it go in a specific direction, and it turns out that, at least for simple machines, making it go at all is easy to measure, but … the metaphor breaks down because the space one aims through for literal throwing is so low-dimensional (and has a relatively low effective Lyapunov exponent) compared to the one we need to aim through (which includes all forms of throwing, as well as every other physical interaction downstream of the starkly superintelligent model we eventually build and align).
I agree that understanding capabilities is very important for having plausible alignment ideas. I don’t agree that trying to push the frontier of a problem, especially when focused on others’ understanding, is a necessary way to do that. over the years I did a lot of keeping up to date of the kind you’re doing here. but even though doing that is normal and important in order to contribute to alignment seriously, I’ve been very careful not to narrate my thoughts on it online, so as not to be a marginal contribution to the frontier unless I can tell whether I’m improving the ratio of alignment outcomes. if everyone did this, there would be little progress on AI except when it was a good idea to do so. the flip side of this is, I don’t think you’re doing a very good job of understanding capabilities, in a similar way to how most people in the field aren’t; but see above for why that’s all I’ll say on that.
It seems to me that in order to matter, alignment work has to be able to work at the frontier. so working with the frontier is important and not a mistake. but I’m not a fan of anything that pushes that frontier. I want to know how to push it in some directions, but those directions involve figuring out how to make loss functions and learning algorithms that quickly and asymptotically organize an AI into a thing that actually works towards indefinite-term good.
I’m optimistic we can define that; likely many of the tools of capabilities will matter, but I think we’ll want to be on the pretty math-heavy end of capabilities to do it right, where you derive your training setup from a theoretical insight. and I’m optimistic that scaling has already put us close to being able to figure out inhumanly hard math questions with the help of an AI that is only locally aligned to solving problems, and have it help us figure out the math of training a successor that is asymptotically aligned to control the world into states where no other AI breaks humans’ autonomy as we enter this new era.
I think we have different viewpoints of what the frontier is. The majority of the 20% improvements mentioned in this post are things I came up with, and they’re pretty surface level. I have only been looking at LLMs for 6 months, in free time outside work, as something to tinker with, and I don’t consider myself an expert, obviously. I would anticipate that the actual research frontier at labs is substantially ahead, such that any moral discussions around this post are akin to debating whether an 11th grade Chemistry lab will encourage the creation of nuclear weapons.
I don’t think you’re doing a very good job of understanding capabilities
Part of my hope in posting was to get technical feedback from a crowd that is knowledgeable on AI systems. Curious if you can be more specific on why you believe this.
AI development feels more similar to biology than to chemistry. Bright 11th graders shouldn’t be doing experiments on culturing some previously unculturable pathogen which would be a good bioweapon target and discussing their results, since the field is wide and shallow and it’s not entirely impossible that their experiments are novel. On the other hand, if they’re running basic experiments on culturing some specific common bacterium (e.g. E. coli) better, they probably don’t need to worry about accelerating bioweapon development, even if there is a chance of them making a slight advancement to the field of biology as a whole.
The nanogpt speedrun feels more like developing better methods to culture E. coli at a hobbyist level, and quite unlikely to lead to any substantial advancement applicable to the operational efficiency of well-funded companies at the frontier. Still, it probably is worth keeping track of when the work you’re doing approaches the “this is actually something novel the frontier labs might use” mark, particularly if it’s something more substantial than “here’s how to use the hardware more efficiently to train this particular model”.
Framing isn’t about being covert; it’s about a particular emphasis on what kinds of considerations are in scope, naturally resulting in almost complete omission of obviously irrelevant (technical) details (and occasionally in comical misunderstanding of content produced from a mutually unintelligible framing).