Joseph Bloom comments on Proposal: Using Monte Carlo tree search instead of RLHF for alignment research

Joseph Bloom 30 Apr 2023 3:02 UTC
3 points
0
I don’t buy that argument at all. “Text space” seems to have been adequate to get to GPT-3, which is incredibly impressive and useful in a variety of ways. Furthermore, what proof do you have that the resulting insights wouldn’t transfer to multi-modal systems like GPT-4 (which can see) or PaLM-E, which is embodied and can both see and operate in “text space”? Moreover, I’m not the first to point out that text space seems to incentivize models to develop highly sophisticated thinking abilities, which seem like the more important thing to focus on.
- Charlie Steiner 30 Apr 2023 7:21 UTC
  4 points
  2
  You seem to be making a very general cloud of claims about the impressiveness of transformers. I was making a very specific claim about the system described in the post, and about the sense in which it’s not myopic.