DeepMind in general: wdyt?
Non-safety DeepMind seems like one of the worst places in the world to work: it is one of the few companies aiming directly at AGI, and it has one of the most substantial track records of progress on AGI capabilities.
It seems like you are confident that the delta in capabilities would outweigh any delta in general alignment sympathy. Is this what you think?
May I ask what you mean by “general alignment sympathy”? Could you put it in other words or give some examples?
I was thinking of the possibility of affecting decision-making, either directly by rising through the ranks (not very likely) or indirectly by being an advocate for safety at an important time and pushing things into the Overton window within the organization.
I imagine Habryka would say that a significant possibility here is that joining an AGI lab will wrongly turn you into an AGI enthusiast. I think biasing effects like that are real, though in cases like that it’s hard to tell how much you are being biased vs. updating correctly on new information. One could make similar bias claims about the AI x-risk community (e.g. there is social pressure to be doomy, and being exposed mostly to heuristic arguments for doom and few heuristic arguments for optimism will bias you toward being doomier than you would be given more information).
DeepMind’s safety team specifically: wdyt?
DeepMind’s safety team actually seems like a pretty good place to work. They don’t have a history of contributing much to commercialization, and the people working there seem to have quite a bit of freedom in what they choose to work on, while also having access to DeepMind’s resources.
The biggest risk from working there is just that making the safety team bigger makes more people think that DeepMind’s AI development will be safe, which seems very far from the truth, but I don’t think this effect is that large.
It is the capability researchers in particular, and their managers and funders, that I worry will be lulled into a false sense of security by the presence of the safety team, not onlookers in general. When you make driving safer (e.g. by putting guardrails on a road), or merely make driving appear safer to the driver, drivers react by taking more risks; this is the familiar phenomenon of risk compensation.