Pournelle’s Iron Law of Bureaucracy states that in any bureaucratic organization there will be two kinds of people:
First, there will be those who are devoted to the goals of the organization. Examples are dedicated classroom teachers in an educational bureaucracy, many of the engineers and launch technicians and scientists at NASA, even some agricultural scientists and advisors in the former Soviet Union collective farming administration.
Secondly, there will be those dedicated to the organization itself. Examples are many of the administrators in the education system, many professors of education, many teachers union officials, much of the NASA headquarters staff, etc.
The Iron Law states that in every case the second group will gain and keep control of the organization. It will write the rules, and control promotions within the organization.
Source.
I’m concerned that this “law” may apply to Anthropic. People devoted to Anthropic as an organization will have more power than people devoted to the goal of creating aligned AI.
I would encourage people at Anthropic to leave a line of retreat and consider the “least convenient possible world” where alignment is too hard. What’s the contingency plan for Anthropic in that scenario?
Next, devise a collective decision-making procedure for activating that contingency plan. For example, the plan might be activated if X% of the technical staff votes to activate it, perhaps after a week of discussion. What would trigger that week of discussion? Answer these questions and turn the answers into a formal procedure.
If you had both a contingency plan and a formal means to activate it, I would feel a lot better about Anthropic as an organization.