Update on Ought’s experiments on factored evaluation of arguments

Ought has written a detailed update and analysis of recent experiments on factored cognition. These are experiments with human participants and don't involve any machine learning. The goal is to learn about the viability of IDA, Debate, and related approaches to AI alignment. For background, here are some prior LW posts on Ought: "Ought: Why it Matters and How to Help" and "Factored Cognition presentation".

Here is the opening of the research update:

Evaluating Arguments One Step at a Time
We're studying factored cognition: under what conditions can a group of people accomplish complex cognitive tasks if each person only has minimal context?
In a recent experiment, we focused on dividing up the task of evaluating arguments. We created short, structured arguments for claims about movie reviews. We then tried to distinguish valid from invalid arguments by showing each participant only one step of the argument, not the review or the other steps.
In this experiment, we found that:
1. Factored evaluation of arguments can distinguish some valid from invalid arguments by identifying implausible steps in arguments for false claims.
2. However, experiment participants disagreed a lot about whether steps were valid or invalid. This method is therefore brittle in its current form, even for arguments which only have 1–5 steps.
3. More diverse argument and evidence types (besides direct quotes from the text), larger trees, and different participant guidelines should improve results.
In this technical progress update, we describe these findings in depth.
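To make the setup in the excerpt concrete, here is a minimal sketch of a factored-evaluation scheme in Python. It assumes a simplified model: an argument is a flat list of steps, each judge sees only one step in isolation, a step passes by majority vote, and the claim is accepted only if every step passes. The names, example text, and aggregation rule below are illustrative assumptions, not Ought's actual interface or analysis.

```python
# Minimal sketch of factored evaluation of arguments (illustrative only).
# Assumptions: flat list of steps, per-step majority vote, argument is
# accepted only if every step passes. Not Ought's actual setup.

from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Argument:
    claim: str
    steps: List[str]  # a few self-contained steps, as in the excerpt above


# A judge sees only one step's text, not the review or the other steps.
Judge = Callable[[str], bool]


def factored_evaluate(argument: Argument, judges: List[Judge]) -> bool:
    """Accept the claim only if a majority of judges finds every step
    plausible; a single implausible step rejects the whole argument."""
    for step in argument.steps:
        votes = sum(judge(step) for judge in judges)
        if votes * 2 <= len(judges):  # no strict majority: reject
            return False
    return True


# Toy usage: in the real experiment, each judge is a human participant.
arg = Argument(
    claim="The reviewer liked the movie.",
    steps=[
        'The review contains the quote "a triumph of direction and acting".',
        "Calling a movie a triumph implies the reviewer liked it.",
    ],
)
credulous: Judge = lambda step: True
skeptical: Judge = lambda step: "quote" in step
print(factored_evaluate(arg, [credulous, credulous, skeptical]))  # True
```

Under this model, the second finding above corresponds to noise in the per-step votes: because one failed step rejects the whole argument, even modest disagreement between judges makes the overall verdict brittle, and more so as arguments grow longer.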

The rest of the post is here.
