
Simulator Theory

Last edit: 14 Feb 2023 17:08 UTC by Jozdien

Simulator theory, in the context of AI, is an ontology or frame for understanding how large generative models, such as the GPT series from OpenAI, work. Broadly, it views these models as simulating a learned distribution with varying degrees of fidelity; for language models trained on a large corpus of text, that distribution reflects the mechanics underlying our world.

It can also refer to an alignment research agenda that deals with better understanding simulator conditionals, the effects of downstream training, alignment-relevant properties such as myopia and agency in the context of language models, and using these models to accelerate alignment research. See also: Cyborgism.
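
As a rough illustration of the frame, the sketch below conditions a pretrained language model on a prompt and samples several independent continuations; under the simulator lens, each continuation is one rollout of the learned distribution from that initial condition. This is a minimal sketch, assuming the Hugging Face transformers library and using GPT-2 purely as a small stand-in for larger models.

```python
# Minimal illustrative sketch: treating a pretrained LM as a "simulator"
# that rolls out continuations of a prompt (the initial condition).
# GPT-2 is used here only as a small stand-in for larger generative models.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The physicist stared at the readout and said,"
inputs = tokenizer(prompt, return_tensors="pt")

# Sample several independent rollouts from the same initial condition;
# each one can be read as a different trajectory the simulator runs.
outputs = model.generate(
    **inputs,
    do_sample=True,
    temperature=0.8,
    max_new_tokens=40,
    num_return_sequences=3,
    pad_token_id=tokenizer.eos_token_id,
)

for i, seq in enumerate(outputs):
    print(f"--- rollout {i} ---")
    print(tokenizer.decode(seq, skip_special_tokens=True))
```

Nothing in this sketch is specific to simulator theory; it only makes concrete the sense in which sampling from a generative model can be read as running a simulation conditioned on its prompt.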

Simulators

janus, 2 Sep 2022 12:45 UTC
572 points
110 comments, 41 min read, LW link
(generative.ink)

Conditioning Predictive Models: Large language models as predictors

2 Feb 2023 20:28 UTC
82 points
4 comments, 13 min read, LW link

The Waluigi Effect (mega-post)

Cleo Nardo, 3 Mar 2023 3:22 UTC
568 points
164 comments, 16 min read, LW link

Conditioning Generative Models for Alignment

Jozdien, 18 Jul 2022 7:11 UTC
52 points
8 comments, 20 min read, LW link

‘simulator’ framing and confusions about LLMs

Beth Barnes, 31 Dec 2022 23:38 UTC
97 points
11 comments, 4 min read, LW link

[Simulators seminar sequence] #1 Background & shared assumptions

2 Jan 2023 23:48 UTC
46 points
4 comments, 3 min read, LW link

Agents vs. Predictors: Concrete differentiating factors

evhub, 24 Feb 2023 23:50 UTC
32 points
3 comments, 4 min read, LW link

Implications of simulators

ThomasW, 7 Jan 2023 0:37 UTC
14 points
0 comments, 12 min read, LW link

Simulacra are Things

janus, 8 Jan 2023 23:03 UTC
52 points
7 comments, 2 min read, LW link

Inner Misalignment in “Simulator” LLMs

Adam Scherlis, 31 Jan 2023 8:33 UTC
84 points
11 comments, 4 min read, LW link

Conditioning Predictive Models: Outer alignment via careful conditioning

2 Feb 2023 20:28 UTC
72 points
13 comments, 53 min read, LW link

Conditioning Predictive Models: Deployment strategy

9 Feb 2023 20:59 UTC
25 points
0 comments, 10 min read, LW link

Two problems with ‘Simulators’ as a frame

ryan_greenblatt, 17 Feb 2023 23:34 UTC
76 points
13 comments, 5 min read, LW link

You’re not a simulation, ’cause you’re hallucinating

Stuart_Armstrong, 21 Feb 2023 12:12 UTC
24 points
6 comments, 1 min read, LW link

Why do we assume there is a “real” shoggoth behind the LLM? Why not masks all the way down?

Robert_AIZI, 9 Mar 2023 17:28 UTC
58 points
41 comments, 2 min read, LW link

The algorithm isn’t doing X, it’s just doing Y.

Cleo Nardo, 16 Mar 2023 23:28 UTC
53 points
42 comments, 5 min read, LW link

Super-Luigi = Luigi + (Luigi - Waluigi)

Alexei, 17 Mar 2023 15:27 UTC
15 points
8 comments, 1 min read, LW link

Remarks 1–18 on GPT (compressed)

Cleo Nardo, 20 Mar 2023 22:27 UTC
115 points
23 comments, 31 min read, LW link

[ASoT] Finetuning, RL, and GPT’s world prior

Jozdien, 2 Dec 2022 16:33 UTC
40 points
8 comments, 5 min read, LW link

When can a mimic surprise you? Why generative models handle seemingly ill-posed problems

David Johnston, 5 Nov 2022 13:19 UTC
8 points
4 comments, 16 min read, LW link

Conditioning Generative Models

Adam Jermyn, 25 Jun 2022 22:15 UTC
23 points
18 comments, 10 min read, LW link

AGI-level reasoner will appear sooner than an agent; what the humanity will do with this reasoner is critical

Roman Leventov, 30 Jul 2022 20:56 UTC
24 points
10 comments, 1 min read, LW link

Simulators, constraints, and goal agnosticism: porbynotes vol. 1

porby, 23 Nov 2022 4:22 UTC
37 points
2 comments, 35 min read, LW link

Prosaic misalignment from the Solomonoff Predictor

Cleo Nardo, 9 Dec 2022 17:53 UTC
35 points
2 comments, 5 min read, LW link

Steering Behaviour: Testing for (Non-)Myopia in Language Models

5 Dec 2022 20:28 UTC
38 points
17 comments, 10 min read, LW link

The Limit of Language Models

DragonGod, 6 Jan 2023 23:53 UTC
40 points
26 comments, 4 min read, LW link

[Simulators seminar sequence] #2 Semiotic physics—revamped

27 Feb 2023 0:25 UTC
20 points
22 comments, 13 min read, LW link

[Question] Could Simulating an AGI Taking Over the World Actually Lead to a LLM Taking Over the World?

simeon_c, 13 Jan 2023 6:33 UTC
15 points
1 comment, 1 min read, LW link

[ASoT] Simulators show us behavioural properties by default

Jozdien, 13 Jan 2023 18:42 UTC
26 points
1 comment, 3 min read, LW link

Underspecification of Oracle AI

15 Jan 2023 20:10 UTC
30 points
12 comments, 19 min read, LW link

Gradient Filtering

18 Jan 2023 20:09 UTC
50 points
16 comments, 13 min read, LW link

Conditioning Predictive Models: The case for competitiveness

6 Feb 2023 20:08 UTC
20 points
0 comments, 11 min read, LW link

Conditioning Predictive Models: Making inner alignment as easy as possible

7 Feb 2023 20:04 UTC
29 points
2 comments, 19 min read, LW link

Conditioning Predictive Models: Interactions with other approaches

8 Feb 2023 18:19 UTC
29 points
2 comments, 11 min read, LW link

Cyborgism

10 Feb 2023 14:47 UTC
296 points
41 comments, 35 min read, LW link

A note on ‘semiotic physics’

metasemi, 11 Feb 2023 5:12 UTC
11 points
12 comments, 6 min read, LW link

Pretraining Language Models with Human Preferences

21 Feb 2023 17:57 UTC
129 points
16 comments, 11 min read, LW link

Implied “utilities” of simulators are broad, dense, and shallow

porby, 1 Mar 2023 3:23 UTC
36 points
2 comments, 3 min read, LW link

Instrumentality makes agents agenty

porby, 21 Feb 2023 4:28 UTC
14 points
4 comments, 6 min read, LW link

Situational awareness in Large Language Models

Simon Möller, 3 Mar 2023 18:59 UTC
21 points
1 comment, 7 min read, LW link