AI assisted Alignment

Tag · Last edit: 14 Mar 2023 21:20 UTC by Raemon

AI assisted Alignment is a cluster of alignment plans in which AI significantly helps with alignment research. These plans can involve weak tool AI or more advanced AGI doing original research.

There has been a lot of debate about how practical this alignment approach is.

[Link] A minimal viable product for alignment

janleike · 6 Apr 2022 15:38 UTC
51 points
38 comments · 1 min read · LW link

[Link] Why I’m optimistic about OpenAI’s alignment approach

janleike · 5 Dec 2022 22:51 UTC
96 points
13 comments · 1 min read · LW link

“Carefully Bootstrapped Alignment” is organizationally hard

Raemon · 17 Mar 2023 18:00 UTC
199 points
9 comments · 11 min read · LW link

Why Not Just… Build Weak AI Tools For AI Alignment Research?

johnswentworth · 5 Mar 2023 0:12 UTC
139 points
17 comments · 6 min read · LW link

Why Not Just Outsource Alignment Research To An AI?

johnswentworth · 9 Mar 2023 21:49 UTC
112 points
45 comments · 9 min read · LW link

Discussion with Nate Soares on a key alignment difficulty

HoldenKarnofsky · 13 Mar 2023 21:20 UTC
203 points
25 comments · 22 min read · LW link

Alignment with argument-networks and assessment-predictions

Tor Økland Barstad · 13 Dec 2022 2:17 UTC
7 points
5 comments · 45 min read · LW link
