A Case for AI Safety via Law
This post is intended to make the case in question more widely available and open to comment. A paper with the above title is currently languishing in arXiv limbo but is available in Google Docs. I was surprised to see that my last (and only) postings to LessWrong were on this very subject, made in 2010 as comments on the thread “Why not just write failsafe rules into the superintelligent machine?”
Unfortunately (IMO), this approach to AI alignment doesn’t seem to have gained much traction in the past 13 years. The paper does, however, cite 15 precedents showing there is some support. (Anthropic’s Constitutional AI and John Nay’s Law Informs Code are two good recent examples.)
The following reproduces the Summary of Argument and Conclusion sections from the paper.
The claim being argued is “Effective legal systems are the best way to address AI safety.”
4 Summary of Argument
4.1 Law is the standard, time-tested, best practice for maintaining order in societies of intelligent agents.
Law has been the primary way of maintaining functional, cohesive societies for thousands of years. It is how humans establish, communicate, and understand what actions are required, permissible, and prohibited in social spheres. Substantial experience exists in drafting, enacting, enforcing, litigating, and maintaining rules in contexts that include public law, private contracts, and the many others noted in this brief. Law will naturally apply to new species of intelligent systems and facilitate safety and value alignment for all.
4.2 Law is scrutable to humans and other intelligent agents.
Unlike AI safety proposals where rules are learned via examples and encoded in artificial (or biological) neural networks, laws are intended to be understood by humans and machines. Although laws can be quite complex, such codified rules are significantly more scrutable than rules learned through induction. The transparent (white box) nature of law provides a critical advantage over opaque (black box) neural network alternatives.
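The contrast between codified and learned rules can be illustrated with a minimal sketch. The `Rule` class and the speed-limit example below are hypothetical illustrations, not from the paper; the point is only that a codified rule’s terms are directly inspectable, whereas a rule encoded in network weights is not.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    """A codified rule: its condition and threshold can be read directly,
    unlike a behavior implicit in learned neural-network weights."""
    name: str
    condition: str   # human-readable description of when the rule applies
    limit_kmh: int

# Hypothetical example rule, legible to humans and machines alike.
SPEED_RULE = Rule(name="urban-speed-limit",
                  condition="vehicle in urban zone",
                  limit_kmh=50)

def complies(speed_kmh: int, rule: Rule) -> bool:
    """Every decision is traceable to the rule's explicit terms."""
    return speed_kmh <= rule.limit_kmh

print(complies(45, SPEED_RULE))  # True
print(complies(70, SPEED_RULE))  # False
```

A violation here can be explained by pointing at the rule itself, which is the “white box” property the paragraph above describes.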
4.3 Law reflects consensus values.
Democratically developed law is intimately linked and essentially equivalent to consensus ethics. Both are human inventions intended to facilitate the wellbeing of individuals and the collective. They represent shared values culturally determined through rational consideration and negotiation. They reflect the wisdom of crowds accumulated over time—not preferences that vary from person to person and are often based on emotion, irrational ideologies, confusion, or psychopathy. Ethical values provide the virtue core of legal systems and reflect the “spirit of the law.” Consequentialist shells surround such cores and specify the “letter of the law.” This relationship between law and ethics makes law a natural solution for human-AI value alignment. A minority of AIs and people, however powerful, cannot game laws to achieve selfish ends.
4.4 Legal systems are responsive to changes in the environment and changes in moral values.
By utilizing legal mechanisms to consolidate values and update them over time, human and AI values can remain aligned indefinitely as values, technologies, and environmental conditions change. Thus law provides a practical implementation of Yudkowsky’s (2004) Coherent Extrapolated Volition by allowing values to evolve that are wise, aspirational, convergent, coherent, suitably extrapolated, and properly interpreted.
4.5 Legal systems restrict overly rapid change.
Legal processes provide checks and balances against overly rapid change to values and laws. Such checks are particularly important when legal change can occur at AI speeds. Legal systems and laws must adapt quickly enough to address the urgency of issues that arise but not so quickly as to risk dire consequences. Laws should be based on careful analysis and effective simulation, and the system should be able to quickly detect and correct problems found after implementation. New technologies and methods should be introduced to make legal processing as efficient as possible without removing critical checks and balances.
4.6 Laws are context sensitive, hierarchical, and scalable.
Laws apply to contexts ranging from international, national, state, and local governance to all manner of other social contracts. Contexts can overlap, be hierarchical, or have other relationships. Humans have lived under this regime for millennia and are able to understand which laws apply and take precedence over others based on contexts (e.g., jurisdictions, organization affiliations, contracts in force). Artificial intelligent systems will be able to manage the multitude of contexts and applicable laws by identifying, loading, and applying appropriate legal corpora for applicable contexts. For example, AIs (like humans) will understand that crosschecking is permitted in hockey games but not outside the arena. They will know when to apply rules of the road versus rules of the sea. They will know when the laws of chess apply versus the rules of Go. They will know their rights relative to every software agent, tool, and service they interface with.
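The context lookup described above can be sketched as follows. The corpora, context paths, and rule labels are all hypothetical illustrations (not part of the paper); the sketch only shows one plausible precedence scheme in which the most specific applicable context wins.

```python
# Hypothetical legal corpora keyed by context path; more segments = more specific.
CORPORA = {
    ("international",): {"assault": "prohibited"},
    ("international", "us"): {"assault": "prohibited",
                              "jaywalking": "infraction"},
    ("international", "us", "hockey_arena"): {"crosscheck": "permitted (penalty only)"},
}

def applicable_rule(action, context):
    """Walk from the most specific context outward; the first corpus
    containing the action takes precedence over broader ones."""
    for depth in range(len(context), 0, -1):
        corpus = CORPORA.get(context[:depth], {})
        if action in corpus:
            return corpus[action]
    return None  # no applicable rule found in any enclosing context

# Crosschecking is treated as an in-game, penalized act inside the arena...
print(applicable_rule("crosscheck", ("international", "us", "hockey_arena")))
# ...while assault has no arena-specific rule, so the query falls back
# to the broader jurisdiction's corpus.
print(applicable_rule("assault", ("international", "us", "hockey_arena")))
```

This mirrors how a human knows which rules are in force by asking which jurisdiction, contract, or game context currently applies.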
4.7 AI Safety via Law can address the full range of AI safety risks, from systems that are narrowly focused to those having general intelligence or even superintelligence.
Enacting and enforcing appropriate laws, and instilling law-abiding values in AIs and humans, can mitigate risks spanning all levels of AI capability—from narrow AI to AGI and ASI. If intelligent agents stray from the law, effective detection and enforcement must occur.
Even the catastrophic vision of smarter-than-human intelligence articulated by Yudkowsky (2022, 2023) and others (Bostrom, 2014; Russell, 2019) can be avoided by effective implementation of AISVL. This may require that the strongest version of the instrumental convergence thesis (on which they rely) is not correct. Appendix A suggests some reasons why AI convergence to dangerous values is not inevitable.
AISVL applies to all intelligent systems regardless of their underlying design, cognitive architecture, and technology. It is immaterial whether an AI is implemented using biology, deep learning, constructivist AI (Johnston, 2023), semantic networks, quantum computers, positronics, or other methods. All intelligent systems must comply with applicable laws regardless of their particular values, preferences, beliefs, and how they are wired.
Although its practice has often been flawed, law is a natural solution for maintaining social safety and value alignment. All intelligent agents—biological and mechanical—must know the law, strive to abide by it, and be subject to effective intervention when violations occur. The essential equivalence and intimate link between consensus ethics and democratic law provide a philosophical and practical basis for legal systems that marry values and norms (“virtue cores”) with rules that address real-world situations (“consequentialist shells”). In contrast to other AI safety proposals, AISVL requires AIs to “do as we legislate, not as we do.”
Advantages of AISVL include its leveraging of time-tested standard practice; scrutability to all intelligent agents; reflection of consensus values; responsiveness to changes in the environment and in moral values; restrictiveness of overly rapid change; context sensitivity, hierarchical structure, and scalability; and applicability to safety risks posed by narrow, general, and even superintelligent AIs.
For the future safety and wellbeing of all sentient systems, work should occur in earnest to improve legal processes and laws so they are more robust, fair, nimble, efficient, consistent, understandable, accepted, and complied with. (Legal frameworks outside of public law may be effective to this end.) Humans are in dire need of such improvements to counter the dangers that we pose to the biosphere and to each other. It is not clear if advanced AI will be more or less dangerous than humans. Law is critical for both.