I think that in the current political climate, mitigating race dynamics hinges on cross-lab collaboration (with safety collaboration being the goal).
To do this, you need to create as many interaction points as possible between the labs, direct and indirect, and then steer towards safety collaboration being beneficial. Use general incentives as levers and tie them to safety purposes.
Think of intracellular signaling pathways that initially evolve independently, with outcomes that compete with or support each other. Because their components may interact, and because their outcomes affect each other, they end up regulating one another. Over time the signaling network optimizes towards a shared outcome.
Note: You can only steer if you are also involved in or enable these interactions. “You” could be a government, a UN function, a safety group, etc.
Strategy: Create more and better opportunities for cross-lab safety work.
This already happens in the open through the publication of papers; platforms that promote safety papers serve this purpose.
Guest aligners(!), i.e. alignment researchers embedded as guests at other labs, and shared alignment standards would be two other ways. Sharing anonymized safety data between labs would be a third. I suspect the last of these is the most doable at the moment.
Example: As a third party (TP), build and enable a safety data submission mechanism that integrates into a shared safety standard.
>> Each lab can submit standardized data to the TP, and in return they get aggregated, anonymized, and standardized safety data: a joint report. They see overall risk levels rise and drop across the group over time. They may see incident frequencies. They may see data on collective threats and bottlenecks.
This data is incredibly valuable to all players.
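To make the mechanism concrete, here is a minimal sketch of what standardized submissions and TP-side aggregation could look like. All field names, types, and the risk scale are hypothetical illustrations, not an existing standard.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class LabSafetyReport:
    """One lab's standardized submission (fields are hypothetical)."""
    period: str             # reporting window, e.g. "2025-Q1"
    risk_score: float       # 0.0-1.0 per the shared safety standard
    incident_count: int     # safety incidents during the period
    bottlenecks: list[str]  # tags for collective threats, e.g. "eval capacity"

def aggregate(reports: list[LabSafetyReport]) -> dict:
    """TP-side step: drop lab identities, return only group-level statistics."""
    return {
        "labs_reporting": len(reports),
        "mean_risk": mean(r.risk_score for r in reports),
        "max_risk": max(r.risk_score for r in reports),
        "total_incidents": sum(r.incident_count for r in reports),
        # Overlapping bottleneck tags surface collective threats without attribution.
        "common_bottlenecks": sorted({b for r in reports for b in r.bottlenecks}),
    }
```

Each lab only ever sees the output of aggregate(), so no lab can tie a risk score or incident back to a specific competitor.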
They can now make better predictions and understand timelines better. They can also agree (under TP mediation) on set danger levels indicating that they should slow down and/or release more data, etc. This enables coordination without direct collaboration.
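Continuing the sketch above (reusing LabSafetyReport and aggregate), the mediated danger levels could be encoded as simple thresholds over the joint report; the threshold values and action names below are hypothetical placeholders for whatever the labs actually agree on.

```python
# Danger levels agreed under TP mediation, checked against the joint report.
# Ordered from most to least severe; values are illustrative only.
DANGER_LEVELS = [
    (0.8, "slow-down"),          # pause or throttle scaling, convene joint review
    (0.6, "release-more-data"),  # report more often and in more detail
    (0.0, "proceed"),            # business as usual
]

def coordination_signal(mean_risk: float) -> str:
    """Map aggregate risk from the joint report to the pre-agreed action."""
    for threshold, action in DANGER_LEVELS:
        if mean_risk >= threshold:
            return action
    return "proceed"

# Example: two submissions with mean risk 0.625 trigger "release-more-data".
reports = [
    LabSafetyReport("2025-Q1", 0.55, 2, ["eval capacity"]),
    LabSafetyReport("2025-Q1", 0.70, 1, ["red-team staffing"]),
]
print(coordination_signal(aggregate(reports)["mean_risk"]))
```

Because each lab reacts to the same anonymized signal rather than to each other, this is coordination without direct collaboration.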
My interest in contributing to AI safety lies in working on strategic coordination problems, with racing dynamics being the main challenge right now.
I work daily with international stakeholder management and coordination inside a major corporation. I always have to consider local legislation and culture, company policy, and real corporate politics. (Oh yes, and the business strategy!) This provides valuable insight into how coordination bottlenecks arise and how they can be overcome.