A general point is that going from “no human cares at all” to “a small group of people with limited resources cares” might be a big difference, especially given the potential leverage of using a bunch of AI labor and importing cheap measures developed elsewhere.
To clarify what I think is Ryan’s point:
In D-labs, both the safety faction and the non-safety faction are leveraging AI labour.
AI labour makes D-labs seem more like C-labs and less like E-labs, directionally.
This is because the effectiveness ratio between (990 humans) and (10 humans) is greater than the effectiveness ratio between (990 humans and 990M AIs) and (10 humans and 1M AIs).
This is because of diminishing returns to cognitive labour: the cheapest, highest-value interventions get done first, so the first increments of labour matter disproportionately (see the sketch below).
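To make that arithmetic concrete, here is a minimal sketch. It assumes, purely for illustration, logarithmic returns to total cognitive labour and a 1:1 weighting of AI and human labour; neither modelling choice is claimed in the dialogue, and any strongly diminishing-returns curve would show the same qualitative compression.

```python
import math

# Toy model (illustrative assumption): a faction's effectiveness scales
# logarithmically in its total cognitive labour, so returns diminish sharply
# and the cheap interventions absorbed by the first units of labour dominate.

def effectiveness(humans: int, ais: int = 0) -> float:
    """Log returns to total labour; one AI counted as one unit of labour."""
    return math.log(humans + ais)

# Humans only: a 990-person faction vs a 10-person faction.
ratio_humans_only = effectiveness(990) / effectiveness(10)

# Both factions amplified by AI labour: (990 humans + 990M AIs) vs (10 humans + 1M AIs).
ratio_with_ai = effectiveness(990, 990_000_000) / effectiveness(10, 1_000_000)

print(f"humans only:    {ratio_humans_only:.2f}x")  # ~3.00x
print(f"with AI labour: {ratio_with_ai:.2f}x")      # ~1.50x
```

Even though the larger faction's AI labour is 990x the smaller faction's (a bigger raw gap than the 99x gap in humans), the effectiveness ratio shrinks, which is the sense in which AI labour moves D-labs toward C-labs.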
(Yes, and also I think that a small number of employees working on safety might get proportionally more compute than the average company employee, as currently seems to be the case.)
Yeah, that partly makes sense to me. I guess my intuition is that if 95% of the company is focused on racing as hard as possible (and using AI leverage for that too, with AIs coming up with new unsafe tricks and all that), then the 5% who care about safety probably won't have that much impact.