Situational Awareness

Tag. Last edit: 16 Jun 2023 14:42 UTC by Mateusz Bagiński

Ajeya Cotra uses the term “situational awareness” to refer to a cluster of skills including “being able to refer to and make predictions about yourself as distinct from the rest of the world,” “understanding the forces out in the world that shaped you and how the things that happen to you continue to be influenced by outside forces,” “understanding your position in the world relative to other actors who may have power over you,” “understanding how your actions can affect the outside world including other actors,” etc.

Alternatively, from an ML perspective, situational awareness can be characterized as a strong form of out-of-context meta-learning applied to situationally relevant statements.

Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover

Ajeya Cotra, 18 Jul 2022 19:06 UTC
364 points
94 comments, 75 min read, 1 review

Paper: On measuring situational awareness in LLMs

4 Sep 2023 12:54 UTC
106 points
16 comments, 5 min read

Some Quick Follow-Up Experiments to “Taken out of context: On measuring situational awareness in LLMs”

miles, 3 Oct 2023 2:22 UTC
31 points
0 comments, 9 min read

Results from the Turing Seminar hackathon

7 Dec 2023 14:50 UTC
29 points
1 comment, 6 min read

Early situational awareness and its implications, a story

Jacob Pfau, 6 Feb 2023 20:45 UTC
29 points
6 comments, 3 min read

Situational awareness in Large Language Models

Simon Möller, 3 Mar 2023 18:59 UTC
28 points
2 comments, 7 min read

Refining the Sharp Left Turn threat model, part 2: applying alignment techniques

25 Nov 2022 14:36 UTC
39 points
9 comments, 6 min read

LM Situational Awareness, Evaluation Proposal: Violating Imitation

Jacob Pfau, 26 Apr 2023 22:53 UTC
13 points
2 comments, 2 min read

Contingency: A Conceptual Tool from Evolutionary Biology for Alignment

clem_acs, 12 Jun 2023 20:54 UTC
51 points
2 comments, 14 min read

The intelligence-sentience orthogonality thesis

Ben Smith, 13 Jul 2023 6:55 UTC
18 points
9 comments, 9 min read

The Zeroth Skillset

katydee, 30 Jan 2013 12:46 UTC
74 points
109 comments, 2 min read

Reinterpret the Past for a Better Future

Declan Molony, 27 Feb 2024 7:57 UTC
5 points
0 comments, 3 min read

Revealing Intentionality In Language Models Through AdaVAE Guided Sampling

jdp, 20 Oct 2023 7:32 UTC
119 points
15 comments, 22 min read

Perceptual Blindspots: How to Increase Self-Awareness

Declan Molony, 26 Mar 2024 5:37 UTC
9 points
2 comments, 2 min read

LLM Evaluators Recognize and Favor Their Own Generations

17 Apr 2024 21:09 UTC
44 points
1 comment, 3 min read