Align­ment Tax

TagLast edit: 22 Dec 2022 5:40 UTC by Multicore

An alignment tax (sometimes called a safety tax) is the additional cost incurred when making an AI aligned, relative to unaligned AI.

Approaches to the alignment tax

Paul Christiano distinguishes two main approaches for dealing with the alignment tax.[1][2] One approach seeks to find ways to pay the tax, such as persuading individual actors to pay it or facilitating coordination of the sort that would allow groups to pay it. The other approach tries to reduce the tax, by differentially advancing existing alignable algorithms or by making existing algorithms more alignable.

Further reading

Copied over from corresponding tag on the EA Forum.

[Linkpost] Jan Leike on three kinds of al­ign­ment taxes

Akash6 Jan 2023 23:57 UTC
27 points
2 comments3 min readLW link

Against ubiquitous al­ign­ment taxes

beren6 Mar 2023 19:50 UTC
52 points
10 comments2 min readLW link

Se­cu­rity Mind­set and the Lo­gis­tic Suc­cess Curve

Eliezer Yudkowsky26 Nov 2017 15:58 UTC
83 points
48 comments20 min readLW link

The com­mer­cial in­cen­tive to in­ten­tion­ally train AI to de­ceive us

Derek M. Jones29 Dec 2022 11:30 UTC
5 points
1 comment4 min readLW link

On the Im­por­tance of Open Sourc­ing Re­ward Models

elandgre2 Jan 2023 19:01 UTC
17 points
5 comments6 min readLW link
No comments.