Situational Awareness

Tag. Last edit: 16 Jun 2023 14:42 UTC by Mateusz Bagiński

Ajeya Cotra uses the term “situational awareness” to refer to a cluster of skills including “being able to refer to and make predictions about yourself as distinct from the rest of the world,” “understanding the forces out in the world that shaped you and how the things that happen to you continue to be influenced by outside forces,” “understanding your position in the world relative to other actors who may have power over you,” and “understanding how your actions can affect the outside world including other actors.”

Alternatively, from an ML perspective, situational awareness can be characterized as a strong form of out-of-context meta-learning applied to situationally relevant statements.

The Zeroth Skillset

katydee

30 Jan 2013 12:46 UTC

74 points

109 comments, 2 min read, LW link

Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover

Ajeya Cotra

18 Jul 2022 19:06 UTC

364 points

94 comments, 75 min read, LW link, 1 review

Refining the Sharp Left Turn threat model, part 2: applying alignment techniques

Vika, Vikrant Varma, Ramana Kumar and Rohin Shah

25 Nov 2022 14:36 UTC

39 points

9 comments, 6 min read, LW link

(vkrakovna.wordpress.com)

Early situational awareness and its implications, a story

Jacob Pfau

6 Feb 2023 20:45 UTC

29 points

6 comments, 3 min read, LW link

Situational awareness in Large Language Models

Simon Möller

3 Mar 2023 18:59 UTC

28 points

2 comments, 7 min read, LW link

LM Situational Awareness, Evaluation Proposal: Violating Imitation

Jacob Pfau

26 Apr 2023 22:53 UTC

13 points

2 comments, 2 min read, LW link

Contingency: A Conceptual Tool from Evolutionary Biology for Alignment

clem_acs

12 Jun 2023 20:54 UTC

51 points

2 comments, 14 min read, LW link

(acsresearch.org)

The intelligence-sentience orthogonality thesis

Ben Smith

13 Jul 2023 6:55 UTC

18 points

9 comments, 9 min read, LW link

Paper: On measuring situational awareness in LLMs

Owain_Evans, Daniel Kokotajlo, Mikita Balesni, Tomek Korbak, lberglund, Asa Cooper Stickland, Meg and Maximilian Kaufmann

4 Sep 2023 12:54 UTC

106 points

16 comments, 5 min read, LW link

(arxiv.org)

Some Quick Follow-Up Experiments to “Taken out of context: On measuring situational awareness in LLMs”

miles

3 Oct 2023 2:22 UTC

31 points

0 comments, 9 min read, LW link

Revealing Intentionality In Language Models Through AdaVAE Guided Sampling

jdp

20 Oct 2023 7:32 UTC

117 points

14 comments, 22 min read, LW link

Results from the Turing Seminar hackathon

Charbel-Raphaël, jeanne_ and WCargo

7 Dec 2023 14:50 UTC

29 points

1 comment, 6 min read, LW link

Facts vs Interpretations

Declan Molony

27 Feb 2024 7:57 UTC

5 points

0 comments, 3 min read, LW link

Perceptual Blindspots

Declan Molony

26 Mar 2024 5:37 UTC

9 points

2 comments, 2 min read, LW link
