Seems possibly true. More generally, an important underexplored tool seems to be something like shaping value exploration and self-perception during RL. Here are some thoughts: Shaping Value Exploration During RL Training
Several people are interested in working on that. Let’s coordinate if you are planning to research this.
I appreciate the invitation. I am very interested in persona research, but I hadn’t intended to research this specific application of it: I simply proposed it in the hope that someone (most likely at a foundation lab) might pick it up, if they’re not already doing so. However, if someone else is taking this on, I’d be interested in being involved.
Thanks for the link to your doc: it’s thought-provoking and closely related, and I have added some comments. Feel free to shift this to PMs; I am also on the AI Alignment, MATS, and Meridian Slacks.