Situational Awareness

Tag · Last edit: 16 Jun 2023 14:42 UTC by Mateusz Bagiński

Ajeya Cotra uses the term “situational awareness” to refer to a cluster of skills including “being able to refer to and make predictions about yourself as distinct from the rest of the world,” “understanding the forces out in the world that shaped you and how the things that happen to you continue to be influenced by outside forces,” “understanding your position in the world relative to other actors who may have power over you,” and “understanding how your actions can affect the outside world including other actors.”

Alternatively, from an ML perspective, situational awareness can be characterized as a strong form of out-of-context meta-learning applied to situationally relevant statements.

Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover

Ajeya Cotra

18 Jul 2022 19:06 UTC

364 points

94 comments · 75 min read · LW link · 1 review

Results from the Turing Seminar hackathon

Charbel-Raphaël, jeanne_ and WCargo

7 Dec 2023 14:50 UTC

29 points

1 comment · 6 min read · LW link

Paper: On measuring situational awareness in LLMs

Owain_Evans, Daniel Kokotajlo, Mikita Balesni, Tomek Korbak, lberglund, Asa Cooper Stickland, Meg and Maximilian Kaufmann

4 Sep 2023 12:54 UTC

107 points

16 comments · 5 min read · LW link

(arxiv.org)

Some Quick Follow-Up Experiments to “Taken out of context: On measuring situational awareness in LLMs”

Miles Turpin

3 Oct 2023 2:22 UTC

31 points

0 comments · 9 min read · LW link

Me, Myself, and AI: the Situational Awareness Dataset (SAD) for LLMs

L Rudolf L, bilalchughtai, jan betley, kaivu, Jérémy Scheurer, Mikita Balesni, AlexMeinke, Owain_Evans and Marius Hobbhahn

8 Jul 2024 22:24 UTC

99 points

26 comments · 5 min read · LW link

Situational awareness in Large Language Models

Simon Möller

3 Mar 2023 18:59 UTC

30 points

2 comments · 7 min read · LW link

Refining the Sharp Left Turn threat model, part 2: applying alignment techniques

Vika, Vikrant Varma, Ramana Kumar and Rohin Shah

25 Nov 2022 14:36 UTC

39 points

9 comments · 6 min read · LW link

(vkrakovna.wordpress.com)

LM Situational Awareness, Evaluation Proposal: Violating Imitation

Jacob Pfau

26 Apr 2023 22:53 UTC

16 points

2 comments · 2 min read · LW link

Contingency: A Conceptual Tool from Evolutionary Biology for Alignment

clem_acs

12 Jun 2023 20:54 UTC

57 points

2 comments · 14 min read · LW link

(acsresearch.org)

The intelligence-sentience orthogonality thesis

Ben Smith

13 Jul 2023 6:55 UTC

18 points

9 comments · 9 min read · LW link

The Zeroth Skillset

katydee

30 Jan 2013 12:46 UTC

74 points

109 comments · 2 min read · LW link

Facts vs Interpretations—An Exercise in Cognitive Reframing

Declan Molony

27 Feb 2024 7:57 UTC

6 points

0 comments · 3 min read · LW link

Revealing Intentionality In Language Models Through AdaVAE Guided Sampling

jdp

20 Oct 2023 7:32 UTC

119 points

15 comments · 22 min read · LW link

Perceptual Blindspots: How to Increase Self-Awareness

Declan Molony

26 Mar 2024 5:37 UTC

10 points

4 comments · 2 min read · LW link

LLM Evaluators Recognize and Favor Their Own Generations

Arjun Panickssery, Sam Bowman and Shi Feng

17 Apr 2024 21:09 UTC

44 points

1 comment · 3 min read · LW link

(tiny.cc)

Early situational awareness and its implications, a story

Jacob Pfau

6 Feb 2023 20:45 UTC

29 points

6 comments · 3 min read · LW link
