Rubi J. Hudson

Karma: 805

Defining Monitorable and Useful Goals

Rubi J. Hudson, 15 Jul 2025 23:06 UTC
10 points
0 comments, 16 min read

Defining Corrigible and Useful Goals

Rubi J. Hudson, 25 Jun 2025 3:51 UTC
33 points
2 comments, 24 min read

Safe Predictive Agents with Joint Scoring Rules

Rubi J. Hudson, 9 Oct 2024 16:38 UTC
55 points
10 comments, 17 min read

Simplifying Corrigibility – Subagent Corrigibility Is Not Anti-Natural

Rubi J. Hudson, 16 Jul 2024 22:44 UTC
44 points
27 comments, 5 min read

A Basic Economics-Style Model of AI Existential Risk

Rubi J. Hudson, 24 Jun 2024 20:26 UTC
24 points
3 comments, 7 min read

The Case for Predictive Models

Rubi J. Hudson, 3 Apr 2024 18:22 UTC
43 points
7 comments, 8 min read

Searching for Searching for Search

Rubi J. Hudson, 14 Feb 2024 23:51 UTC
21 points
4 comments, 7 min read

Conditional Prediction with Zero-Sum Training Solves Self-Fulfilling Prophecies

26 May 2023 17:44 UTC
88 points
13 comments, 24 min read

Conditioning Predictive Models: Open problems, Conclusion, and Appendix

10 Feb 2023 19:21 UTC
36 points
3 comments, 11 min read

Mechanism Design for AI Safety - Agenda Creation Retreat

Rubi J. Hudson, 10 Feb 2023 3:05 UTC
24 points
2 comments, 1 min read

Conditioning Predictive Models: Deployment strategy

9 Feb 2023 20:59 UTC
28 points
0 comments, 10 min read

Conditioning Predictive Models: Interactions with other approaches

8 Feb 2023 18:19 UTC
32 points
2 comments, 11 min read

Conditioning Predictive Models: Making inner alignment as easy as possible

7 Feb 2023 20:04 UTC
27 points
2 comments, 19 min read

Conditioning Predictive Models: The case for competitiveness

6 Feb 2023 20:08 UTC
20 points
3 comments, 11 min read

Conditioning Predictive Models: Outer alignment via careful conditioning

2 Feb 2023 20:28 UTC
72 points
15 comments, 57 min read

Conditioning Predictive Models: Large language models as predictors

2 Feb 2023 20:28 UTC
89 points
4 comments, 13 min read

Stop-gradients lead to fixed point predictions

28 Jan 2023 22:47 UTC
37 points
2 comments, 24 min read

Underspecification of Oracle AI

15 Jan 2023 20:10 UTC
30 points
12 comments, 19 min read

Proper scoring rules don’t guarantee predicting fixed points

16 Dec 2022 18:22 UTC
80 points
8 comments, 21 min read

Mechanism Design for AI Safety - Reading Group Curriculum

Rubi J. Hudson, 25 Oct 2022 3:54 UTC
15 points
3 comments, 4 min read