ryan_greenblatt comments on Does reducing the amount of RL for a given capability level make AI safer?

ryan_greenblatt 7 May 2024 17:33 UTC
5 points
0
In brief: large amounts of high-quality, process-based RL might make AIs more useful earlier (before they become much smarter). This might be expensive and annoying (e.g. it might require huge amounts of high-quality human labor), so that by default labs do less of it, relative to just scaling up models, than would be optimal from a safety perspective.