Con­sti­tu­tional AI

TagLast edit: 11 Jul 2023 12:52 UTC by Benaya Koren

Constitutional AI is a method for fine-tuning language models, used in Anthropic’s Claude. The main conceptual difference from RLHF is that instead of human feedback on specific behaviors it relies on the model’s ability to apply general principles (stated in natural language) to specific situations.

Re­view of Align­ment Plan Cri­tiques- De­cem­ber AI-Plans Cri­tique-a-Thon Re­sults

Iknownothing15 Jan 2024 19:37 UTC
24 points
0 comments25 min readLW link

Con­tin­u­ous Ad­ver­sar­ial Qual­ity As­surance: Ex­tend­ing RLHF and Con­sti­tu­tional AI

Benaya Koren8 Jul 2023 17:32 UTC
6 points
0 comments9 min readLW link
No comments.