I take this comment as evidence that John would fail an intellectual Turing test for people who have different views than he does about how valuable incremental empiricism is.
I don’t want to pour a ton of effort into this, but here’s my 5-paragraph ITT attempt.
“As an analogy for alignment, consider processor manufacturing. We didn’t get to gigahertz clock speed and ten nanometer feature size by trying to tackle all the problems of 10 nm manufacturing processes right out the gate. That would never have worked; too many things independently go wrong to isolate and solve them all without iteration. We can’t get many useful bits out of empirical feedback if the result is always failure, and always for a long list of reasons.
And of course, if you know anything about modern fabs, you know there’d have been no hope whatsoever of identifying all the key problems in advance just based on theory. (Side note: I remember a good post or thread from the past year on crazy shit fabs need to do, but can’t find it; anyone remember that and have a link?)
The way we actually did it was to start with gigantic millimeter-size features, which were relatively easy to manufacture. And then we scaled down slowly. At each new size, new problems came up, but those problems came up just a few at a time as we only scaled down a little bit at each step. We could carry over most of our insights from earlier stages, and isolate new problems empirically.
The analogy, in AI, is to slowly ramp up the capabilities/size/optimization pressure of our systems. Start with low capability, and use whatever simple tricks will help in that regime. Then slowly ramp up, see what new problems come up at each stage, just like we did for chip manufacturing. And to complete the analogy: just like with chips, at each step we can use the products of the previous step to help design the next step.
That’s the sort of plan which has a track record of actually handling the messiness of reality, even when scaling things over many orders of magnitude.”
There, let me know how plausible that was as an ITT attempt for “people who have different views [than I do] about how valuable incremental empiricism is”.
Forgot to reply to this at the time, but I think this is a pretty good ITT. (I think there’s probably some additional argument people would make about why this isn’t just an isolated analogy, but rather an instance of a more generally-applicable argument; that said, the chip fab case does seem to be a fairly central example of that general argument.)
I think people who value empirical alignment work now probably think that (to some extent) we can predict at a high level what future problems we might face (contrasting with “there’d have been no hope whatsoever of identifying all the key problems in advance just based on theory”). Obviously this is a spectrum, but I think the chip fab analogy leans further towards believing there are unknown unknowns in the problem space than people at OpenAI do (e.g. OpenAI people possibly think outer alignment and inner alignment capture all of the kinds of problems we’ll face).
However, they probably don’t believe you can work on solutions to those problems without being able to empirically demonstrate those problems and hence iterate on them (and again one could probably appeal to a track record here: most proposed solutions to problems haven’t worked unless they were developed by iterating on the actual problem). We can maybe vaguely postulate what the solutions could look like (they would say), but it’s going to be much better to actually try to implement solutions on versions of the problem we can demonstrate, and iterate from there. (Note that they probably also try to produce demonstrations of the problems so that they can then work on those solutions, but this is still all empirical.)
Otherwise I do think your ITT does seem reasonable to me, although I don’t think I’d put myself in the class of people you’re trying to ITT, so that’s not much evidence.