It depends on what you’re trying to do, right? Like, if you build a great eval to detect agent autonomy, but nobody adopts it, you haven’t accomplished anything. You need to know how to work with AI labs. In that case, selling widgets (your eval) is highly aligned with AI safety.
IME there are an extremely large number of NGOs with passionate people who do not remotely move the needle on whatever problem they are trying to solve. I think it’s the modal outcome for a new nonprofit. I’m not 100% sure that the feedback loop aspect is the reason but I think it plays a very substantial role.
Fair enough. There are some for-profits where profit and impact are more closely related than others.
But it’s also quite likely your evals are not actually evaluating anything to do with x-risks or s-risks, so it just feels like you’re making progress when you aren’t.
I’m assuming here people are trying to prevent AI from killing everyone. If you have other goals, this doesn’t apply.
there are an extremely large number of NGOs with passionate people who do not remotely move the needle on whatever problem they are trying to solve. I think it’s the modal outcome for a new nonprofit
I’d say the same thing applies to AI for-profits from the perspective of AI notkilleveryoneism. Probably, the modal outcome is slightly increasing the odds that AI kills everyone. At least for the non-profits, the modal outcome is doing nothing rather than making things worse.
Sure, but you get feedback for whether it helps customers with their immediate problems. You don’t get feedback on whether it helps with AI safety.
It’s the direction vs speed thing again. You’ll get good at building widgets that sell. You won’t get good at AI notkilleveryoneism.