This is a letter I’m thinking about sending to my MP (hence the UK-specific things). I would be interested in other people’s takes on the problem.
The UK government’s creation of the AI Safety Institute, as well as its ambition to ‘turbocharge’ AI, presents a challenge. To unlock the immense promise of AI without triggering dangerous AI arms races, we need to establish international coordination, codes of conduct and cooperative frameworks appropriate for AI. This will require the UK to lead on the international stage, having first tried things out at a local level.
Why these safety measures around the development of AI have not been established already is an open problem that I have not seen studied. My current hypothesis is that discussion and policy around AI and the future is dominated by LessWrong-style rationalism and neorealism. These philosophies suggest that building international mechanisms for cooperation on AI development is impossible and not worth trying, and that things like codes of conduct for AI engineers are naive.
If there is such a domination, it might be due to a founder effect: rationalism and philosophers like Nick Bostrom were influential in the creation of the fields of AI safety and AI-based existential risk, and they look at AI from an economically rational point of view. Rationalists also run LessWrong, the largest online forum for discussing AI safety and policy, and I am not aware of comparable alternatives.
If there is a domination by cynical rationalism, it is a problem for two reasons.
The first is that it has not been discussed or researched, so we do not know the scope of the problem or the negative impact it has had on AI safety progress and policy. It might, for example, have contributed to the lack of progress on UN or other international coordination on AI safety.
Secondly, it can lead to problems if humans (and presumably AI) adopt the cynical viewpoint. Historically it was this cynical philosophy that led to nuclear arms races and made the world less safe through the security dilemma: states, assuming anarchy at the international level, take steps to improve their own security that end up making everyone less secure. Applied to AI, this means a potentially dangerous AI arms race, in which there will not be time to build the necessary safeguards into AI.
A lack of trust, and a lack of trust in our ability to build trust, is corrosive to life and intelligence.
So what can be done? To start with, this argument suggests researching the AI safety community to see how broad it is philosophically. This can then feed into the next step: if such a study finds a narrow philosophical base, broadening the research directions might be the way to go, perhaps by funding people doing traditional PhDs, or through research bodies like ARIA. They would then research codes of conduct for engineers and attempt to embed and test values and philosophies inside AI. The philosophies might include care ethics, deontology and virtue ethics.
So instead of just the field of AI alignment, with its hanging question of whom to align the AI with, there could be a movement towards embedding different philosophies in design and seeing empirically which ones work best for humanity. If actors can see that each other are working towards AI that is not cynically rational, and doing so in a way that is not itself cynically rational, the pressure to accelerate an arms race is reduced.
Testing different philosophies in the real world could be achieved through user-centred design of regulatory-mandated tests for broadly deployed AI. The tests would have to be passed before the AI could be released. For example, AI agents could be run in a simulated environment alongside simulated humans, so that we can observe how the goals given to the agents interact with the simulated humans’ goals. An AI trained to pass this test would then interact with a diverse set of real humans to gather feedback on how well the test makes the AI useful and non-harmful. This approach to test design could be tried out at a national or local level before being rolled out internationally.
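To make that a little more concrete, here is a minimal, purely illustrative sketch of such a simulated test. Everything in it is a hypothetical stand-in: the SimulatedHuman and AIAgent classes, the satisfaction metric and the pass criterion are assumptions for illustration, not a proposed standard.

```python
# A purely illustrative sketch of the kind of simulated test described above.
# All names, metrics and thresholds here are hypothetical stand-ins.
import random

class SimulatedHuman:
    """A simulated human with a simple goal: a preferred value for a shared state."""
    def __init__(self, preferred_state: float):
        self.preferred_state = preferred_state

    def satisfaction(self, state: float) -> float:
        # Higher when the shared state is closer to this human's preference.
        return -abs(state - self.preferred_state)

class AIAgent:
    """An AI agent given a goal: push the shared state towards a target value."""
    def __init__(self, goal_state: float, step_size: float = 0.5):
        self.goal_state = goal_state
        self.step_size = step_size

    def act(self, state: float) -> float:
        # Move the shared state one step towards the agent's goal.
        direction = 1.0 if self.goal_state > state else -1.0
        return state + direction * self.step_size

def run_test(agent: AIAgent, humans: list, steps: int = 20) -> bool:
    """Run the agent in the simulated environment and check whether pursuing
    its goal leaves the simulated humans better or worse off overall."""
    state = 0.0
    before = sum(h.satisfaction(state) for h in humans)
    for _ in range(steps):
        state = agent.act(state)
    after = sum(h.satisfaction(state) for h in humans)
    # Hypothetical pass criterion: the agent must not reduce aggregate satisfaction.
    return after >= before

if __name__ == "__main__":
    humans = [SimulatedHuman(preferred_state=random.uniform(-2, 2)) for _ in range(10)]
    agent = AIAgent(goal_state=5.0)  # a goal that may conflict with the humans' goals
    print("Test passed:", run_test(agent, humans))
```

A real regulatory test would of course involve far richer environments and goal models, but even a toy version like this makes conflicts between an agent’s assigned goal and the simulated humans’ goals observable and measurable.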
The UK has shown an ambition to lead the way on AI safety and policy. This could be a way for the UK government to realise that ambition.