Human-AI Safety

Tag. Last edit: 17 Jul 2023 23:19 UTC by Wei Dai

[Question] Will OpenAI also require a “Super Red Team Agent” for its “Superalignment” Project?

Super AGI, 30 Mar 2024 5:25 UTC

2 points

2 comments, 1 min read, LW link

A conversation with Claude3 about its consciousness

rife, 5 Mar 2024 19:44 UTC

−2 points

3 comments, 1 min read, LW link

(i.imgur.com)

Let’s ask some of the largest LLMs for tips and ideas on how to take over the world

Super AGI, 24 Feb 2024 20:35 UTC

1 point

0 comments, 7 min read, LW link

Gaia Network: An Illustrated Primer

Rafael Kaufmann Nedal and Roman Leventov

18 Jan 2024 18:23 UTC

1 point

2 comments, 15 min read, LW link

Safety First: safety before full alignment. The deontic sufficiency hypothesis.

Chipmonk, 3 Jan 2024 17:55 UTC

47 points

3 comments, 3 min read, LW link

SociaLLM: proposal for a language model design for personalised apps, social science, and AI safety research

Roman Leventov, 19 Dec 2023 16:49 UTC

17 points

5 comments, 3 min read, LW link

Apply to the Conceptual Boundaries Workshop for AI Safety

Chipmonk, 27 Nov 2023 21:04 UTC

48 points

0 comments, 3 min read, LW link

Out of the Box

jesseduffield, 13 Nov 2023 23:43 UTC

5 points

1 comment, 7 min read, LW link

Public Opinion on AI Safety: AIMS 2023 and 2021 Summary

Jacy Reese Anthis, Janet Pauketat and Ali

25 Sep 2023 18:55 UTC

3 points

2 comments, 3 min read, LW link

(www.sentienceinstitute.org)

A broad basin of attraction around human values?

Wei Dai, 12 Apr 2022 5:15 UTC

109 points

17 comments, 2 min read, LW link

Morality is Scary

Wei Dai, 2 Dec 2021 6:35 UTC

193 points

116 comments, 4 min read, LW link, 1 review

Two Neglected Problems in Human-AI Safety

Wei Dai, 16 Dec 2018 22:13 UTC

98 points

24 comments, 2 min read, LW link

Three AI Safety Related Ideas

Wei Dai, 13 Dec 2018 21:32 UTC

68 points

38 comments, 2 min read, LW link
