
Remmelt

Karma: 459

Research Coordinator of the “Do Not Build Uncontrollable AI” area at AI Safety Camp.

See this explainer on why AGI could not be controlled enough to stay safe:
https://www.lesswrong.com/posts/xp6n2MG5vQkPpFEBH/the-control-problem-unsolved-or-unsolvable

The first AI Safety Camp & onwards

Remmelt · 7 Jun 2018 20:13 UTC
46 points
0 comments · 8 min read · LW link

The Values-to-Actions Decision Chain

Remmelt · 30 Jun 2018 21:52 UTC
29 points
6 comments · 10 min read · LW link

Delegated agents in practice: How companies might end up selling AI services that act on behalf of consumers and coalitions, and what this implies for safety research

Remmelt · 26 Nov 2020 11:17 UTC
7 points
3 comments · 4 min read · LW link

Some blindspots in rationality and effective altruism

Remmelt · 19 Mar 2021 11:40 UTC
37 points
44 comments · 14 min read · LW link

A parable of brightspots and blindspots

Remmelt · 21 Mar 2021 18:18 UTC
4 points
0 comments · 3 min read · LW link

How teams went about their research at AI Safety Camp edition 5

Remmelt · 28 Jun 2021 15:15 UTC
24 points
0 comments · 6 min read · LW link

Exploring Democratic Dialogue between Rationality, Silicon Valley, and the Wider World

Remmelt · 20 Aug 2021 16:04 UTC
−5 points
19 comments · 13 min read · LW link

Why mechanistic interpretability does not and cannot contribute to long-term AGI safety (from messages with a friend)

Remmelt · 19 Dec 2022 12:02 UTC
−3 points
9 comments · 31 min read · LW link

List #1: Why stopping the development of AGI is hard but doable

Remmelt · 24 Dec 2022 9:52 UTC
6 points
11 comments · 5 min read · LW link

List #2: Why coordinating to align as humans to not develop AGI is a lot easier than, well… coordinating as humans with AGI coordinating to be aligned with humans

Remmelt · 24 Dec 2022 9:53 UTC
1 point
0 comments · 3 min read · LW link

List #3: Why not to assume on prior that AGI-alignment workarounds are available

Remmelt · 24 Dec 2022 9:54 UTC
4 points
1 comment · 3 min read · LW link

Nine Points of Collective Insanity

27 Dec 2022 3:14 UTC
−2 points
3 comments · 1 min read · LW link
(mflb.com)

How ‘Human-Human’ dynamics give way to ‘Human-AI’ and then ‘AI-AI’ dynamics

27 Dec 2022 3:16 UTC
−2 points
5 comments · 2 min read · LW link
(mflb.com)

Introduction: Bias in Evaluating AGI X-Risks

27 Dec 2022 10:27 UTC
1 point
0 comments · 3 min read · LW link

Institutions Cannot Restrain Dark-Triad AI Exploitation

27 Dec 2022 10:34 UTC
5 points
0 comments · 5 min read · LW link
(mflb.com)

Mere exposure effect: Bias in Evaluating AGI X-Risks

27 Dec 2022 14:05 UTC
0 points
2 comments · 1 min read · LW link

Presumptive Listening: sticking to familiar concepts and missing the outer reasoning paths

Remmelt · 27 Dec 2022 15:40 UTC
−14 points
8 comments · 2 min read · LW link
(mflb.com)

Bandwagon effect: Bias in Evaluating AGI X-Risks

28 Dec 2022 7:54 UTC
−1 points
0 comments · 1 min read · LW link

Reactive devaluation: Bias in Evaluating AGI X-Risks

30 Dec 2022 9:02 UTC
−15 points
9 comments · 1 min read · LW link

Curse of knowledge and Naive realism: Bias in Evaluating AGI X-Risks

31 Dec 2022 13:33 UTC
−7 points
1 comment · 1 min read · LW link
(www.lesswrong.com)