Where does the money go? Is it being sold at cost, or is there surplus?
If money is being made, will it support: 1. The authors? 2. LW hosting costs? 3. LW-adjacent charities like MIRI? 4. The editors/compilers/LW moderators?
EDIT: Was answered over on /r/slatestarcodex. tldr: one print run has been paid for at a loss, any (unlikely) profits go to supporting the Lesswrong nonprofit organization.
So.… they held the door open to see if it’d escape or not? I predict this testing method may go poorly with more capable models, to put it lightly.
And then OpenAI deployed a more capable version than was tested!
This defeats the entire point of testing.
I am slightly worried that posts like veedrac’s Optimality is the Tiger may have given them ideas. “Hey, if you run it in this specific way, a LLM might become an agent! If it gives you code for recursively calling itself, don’t run it”… so they write that code themselves and run it.
I really don’t know how to feel about this. On one hand, this is taking ideas around alignment seriously and testing for them, right? On the other hand, I wonder what the testers would have done if the answer was “yep, it’s dangerously spreading and increasing it’s capabilities oh wait no nevermind it’s stopped that and looks fine now”.