Agent Foundations

Why Agent Foundations? An Overly Abstract Explanation

johnswentworth · 25 Mar 2022 23:17 UTC
257 points
54 comments · 8 min read · LW link

The Rocket Alignment Problem

Eliezer Yudkowsky · 4 Oct 2018 0:38 UTC
199 points
42 comments · 15 min read · LW link · 2 reviews

Challenges with Breaking into MIRI-Style Research

Chris_Leong · 17 Jan 2022 9:23 UTC
74 points
15 comments · 3 min read · LW link

Some AI research areas and their relevance to existential safety

Andrew_Critch · 19 Nov 2020 3:18 UTC
198 points
40 comments · 50 min read · LW link · 2 reviews

AXRP Episode 15 - Natural Abstractions with John Wentworth

DanielFilan · 23 May 2022 5:40 UTC
32 points
1 comment · 57 min read · LW link

[Question] Does agent foundations cover all future ML systems?

Jonas Hallgren · 25 Jul 2022 1:17 UTC
2 points
0 comments · 1 min read · LW link

Understanding Infra-Bayesianism: A Beginner-Friendly Video Series

22 Sep 2022 13:25 UTC
136 points
6 comments · 2 min read · LW link

Prize and fast track to alignment research at ALTER

Vanessa Kosoy · 17 Sep 2022 16:58 UTC
67 points
6 comments · 3 min read · LW link

Clarifying the Agent-Like Structure Problem

johnswentworth · 29 Sep 2022 21:28 UTC
54 points
15 comments · 6 min read · LW link

You won’t solve alignment without agent foundations

Mikhail Samin · 6 Nov 2022 8:07 UTC
21 points
3 comments · 8 min read · LW link

Contra “Strong Coherence”

DragonGod · 4 Mar 2023 20:05 UTC
38 points
24 comments · 1 min read · LW link

Compositional language for hypotheses about computations

Vanessa Kosoy · 11 Mar 2023 19:43 UTC
24 points
2 comments · 12 min read · LW link

Fixed points in mortal population games

ViktoriaMalyasova · 14 Mar 2023 7:10 UTC
22 points
0 comments · 12 min read · LW link
(www.lesswrong.com)

[Question] Critiques of the Agent Foundations agenda?

Jsevillamol · 24 Nov 2020 16:11 UTC
16 points
3 comments · 1 min read · LW link

My take on agent foundations: formalizing metaphilosophical competence

zhukeepa · 1 Apr 2018 6:33 UTC
20 points
6 comments · 1 min read · LW link

Another take on agent foundations: formalizing zero-shot reasoning

zhukeepa · 1 Jul 2018 6:12 UTC
59 points
20 comments · 12 min read · LW link

Arguments about Highly Reliable Agent Designs as a Useful Path to Artificial Intelligence Safety

27 Jan 2022 13:13 UTC
27 points
0 comments · 1 min read · LW link
(arxiv.org)

[Question] Choice := Anthropics uncertainty? And potential implications for agency

Antoine de Scorraille · 21 Apr 2022 16:38 UTC
6 points
1 comment · 1 min read · LW link

Understanding Selection Theorems

adamk · 28 May 2022 1:49 UTC
41 points
3 comments · 7 min read · LW link

Bridging Expected Utility Maximization and Optimization

Whispermute · 5 Aug 2022 8:18 UTC
24 points
5 comments · 14 min read · LW link

Discovering Agents

zac_kenton · 18 Aug 2022 17:33 UTC
61 points
10 comments · 6 min read · LW link

Reward is not Necessary: How to Create a Compositional Self-Preserving Agent for Life-Long Learning

Roman Leventov · 12 Jan 2023 16:43 UTC
17 points
2 comments · 2 min read · LW link
(arxiv.org)

Normative vs Descriptive Models of Agency

mattmacdermott · 2 Feb 2023 20:28 UTC
22 points
5 comments · 4 min read · LW link

A mostly critical review of infra-Bayesianism

matolcsid · 28 Feb 2023 18:37 UTC
86 points
7 comments · 28 min read · LW link

Performance guarantees in classical learning theory and infra-Bayesianism

matolcsid · 28 Feb 2023 18:37 UTC
8 points
4 comments · 31 min read · LW link

100 Dinners And A Workshop: Information Preservation And Goals

Stephen Fowler · 28 Mar 2023 3:13 UTC
7 points
0 comments · 7 min read · LW link