My response to Xuan would be that I don’t expect us to ‘just add data and compute’ in the next seven years, or four years. I expect us to do many other things as well. If you are doing the thought experiment expecting no such progress, you are doing the wrong thought experiment. Also my understanding is we already have a pattern of transformers looking unable to do something, then we scale and suddenly that changes for reasons we don’t fully understand.
I copied Xuan’s twitter thread into the comments section of the LessWrong post of Jacob Steinhardt’s GPT 2030 predictions. In defense of Xuan, she also says that she does not expect us to ‘just add data and compute.’ She said that Jacob’s predictions seem unlikely only if you assume we will do nothing but add data and compute, a scenario she herself considers unlikely. Thus, she agrees with you in criticizing Jacob’s assumption of ‘only data and compute added.’
Where you seem to differ from her is that you think additional data and compute alone could indeed lead to novel and surprising emergent capabilities. She seems to think we’ve found about all there is to find there.
I agree with you that there are more novel emergent capabilities to be found through adding data and compute alone. I do, however, think we have reached a regime of diminishing returns. I believe that much greater efficiency in making forward progress will come from algorithmic progress, so the fact that more data and compute alone would technically be sufficient will become irrelevant, since that path will be exorbitantly expensive compared to discovering and utilizing algorithmic improvements.
It is my understanding that this is also what you think, and perhaps also what Xuan thinks, but I haven’t read enough from her to know. I’ve only read the one Twitter thread of hers that I copied.
<rant> This is a prime example of why I dislike Twitter so much as a medium of debate. The fractured tree of threads and replies makes it too hard to see a cohesive discussion between people and get a comprehensive understanding of others’ viewpoints. The nuance gets lost. Had this discussion taken place in a comment thread on a forum, such as this one, it would have been much easier to tell whether you’d read Xuan’s entire thread and gotten her full viewpoint. </rant>
I hear that rant. I do my best now to reconcile things into a cohesive whole in text, but that means the timestamps are gone, so if I miss something later I can’t tell.
I am skeptical that we’re near the end of the useful road on data/compute, although I agree that in the 2030 timeframe that’s not where the low hanging fruit is mostly going to be. My prediction of ‘this 2030 arrives by 2027’ is based largely on other ways of improving.
To the extent compute/data help, I think of them more as enabling other things.
Yes, in particular, more compute means it’s easier to automate searches for algorithmic improvements....