Dissent Collusion

Screwtape 10 Aug 2022 2:43 UTC
7 points
0 comments · 3 min read · LW link

Proposal: Consider not using distance-direction-dimension words in abstract discussions

moridinamael 9 Aug 2022 20:44 UTC
42 points
7 comments · 5 min read · LW link

[Question] How would two superintelligent AIs interact, if they are unaligned with each other?

Nathan1123 9 Aug 2022 18:58 UTC
4 points
6 comments · 1 min read · LW link

Disagreements about Alignment: Why, and how, we should try to solve them

ojorgensen 9 Aug 2022 18:49 UTC
7 points
0 comments · 16 min read · LW link

Progress links and tweets, 2022-08-09

jasoncrawford 9 Aug 2022 17:35 UTC
7 points
2 comments · 1 min read · LW link
(rootsofprogress.org)

Content generation. Where do we draw the line?

Q Home 9 Aug 2022 10:51 UTC
6 points
4 comments · 2 min read · LW link

Introduction to Constructor Theory

AlfredHarwood 9 Aug 2022 10:28 UTC
1 point
0 comments · 9 min read · LW link

[Question] What are some alternatives to Shapley values which drop additivity?

eapi 9 Aug 2022 9:16 UTC
6 points
0 comments · 1 min read · LW link
(math.stackexchange.com)

Team Shard Status Report

David Udell 9 Aug 2022 5:33 UTC
35 points
2 comments · 3 min read · LW link

Announcing: Mechanism Design for AI Safety - Reading Group

Rubi 9 Aug 2022 4:21 UTC
13 points
0 comments · 4 min read · LW link

[Question] What are some Works that might be useful but are difficult, so forgotten?

TekhneMakre 9 Aug 2022 2:22 UTC
8 points
2 comments · 1 min read · LW link

Project proposal: Testing the IBP definition of agent

9 Aug 2022 1:09 UTC
9 points
3 comments · 2 min read · LW link

How (not) to choose a research project

9 Aug 2022 0:26 UTC
67 points
11 comments · 7 min read · LW link

[Question] Are ya winning, son?

Nathan1123 9 Aug 2022 0:06 UTC
13 points
13 comments · 2 min read · LW link

General alignment properties

TurnTrout 8 Aug 2022 23:40 UTC
35 points
1 comment · 1 min read · LW link

Encultured AI, Part 1 Appendix: Relevant Research Examples

8 Aug 2022 22:44 UTC
9 points
1 comment · 7 min read · LW link

Encultured AI, Part 1: Enabling New Benchmarks

8 Aug 2022 22:44 UTC
56 points
2 comments · 6 min read · LW link

Broad Basins and Data Compression

8 Aug 2022 20:33 UTC
17 points
5 comments · 7 min read · LW link

Interpretability/Tool-ness/Alignment/Corrigibility are not Composable

johnswentworth 8 Aug 2022 18:05 UTC
86 points
6 comments · 3 min read · LW link

A sufficiently paranoid paperclip maximizer

RomanS 8 Aug 2022 11:17 UTC
15 points
10 comments · 2 min read · LW link