
Language Models

Last edit: 24 Sep 2021 14:16 UTC by plex

Language models are a class of AI trained on text, usually to predict the next word or a word that has been masked out. Given an initial prompt, they can generate novel prose or code, which gives rise to a kind of natural-language programming called prompt engineering. The most popular architecture for very large language models is the transformer, which follows consistent scaling laws: as model size, dataset size, and training compute increase, the model’s loss falls by a predictable amount, where loss is measured as ‘perplexity’, i.e. how surprised the model is by a held-out test set of human-generated text.
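A minimal sketch of both ideas, assuming the Hugging Face transformers library and the small public GPT-2 checkpoint (illustrative choices, not anything specified on this page): it samples a continuation from a prompt, then computes perplexity as the exponential of the model's mean next-token cross-entropy on a test sentence.

```python
# Illustrative sketch only: assumes `torch`, `transformers`, and the public
# "gpt2" checkpoint are available.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

# 1. Next-word prediction turned into generation: feed a prompt and repeatedly
#    sample the predicted next token. This is the behavior prompt engineering exploits.
prompt = "The transformer architecture is popular because"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))

# 2. Perplexity on held-out text: the model's loss is the mean cross-entropy of
#    its next-token predictions, and perplexity is exp(loss). Lower means the
#    model is less "surprised" by the text.
text = "Language models are trained to predict the next word."
input_ids = tokenizer(text, return_tensors="pt").input_ids
with torch.no_grad():
    loss = model(input_ids, labels=input_ids).loss
print(f"Perplexity: {torch.exp(loss).item():.2f}")
```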


Inverse Scaling Prize: Round 1 Winners

26 Sep 2022 19:57 UTC
90 points
16 comments, 4 min read, LW link
(irmckenzie.co.uk)

Simulators

janus, 2 Sep 2022 12:45 UTC
572 points
111 comments, 41 min read, LW link
(generative.ink)

Transformer Circuits

evhub, 22 Dec 2021 21:09 UTC
143 points
4 comments, 3 min read, LW link
(transformer-circuits.pub)

Truthful LMs as a warm-up for aligned AGI

Jacob_Hilton, 17 Jan 2022 16:49 UTC
65 points
14 comments, 13 min read, LW link

Testing PaLM prompts on GPT3

Yitz, 6 Apr 2022 5:21 UTC
103 points
15 comments, 8 min read, LW link

Results from the language model hackathon

Esben Kran, 10 Oct 2022 8:29 UTC
22 points
1 comment, 4 min read, LW link

LLMs may capture key components of human agency

catubc, 17 Nov 2022 20:14 UTC
25 points
0 comments, 4 min read, LW link

Inverse Scaling Prize: Second Round Winners

24 Jan 2023 20:12 UTC
56 points
16 comments, 15 min read, LW link

Cognitive Biases in Large Language Models

Jan, 25 Sep 2021 20:59 UTC
18 points
3 comments, 12 min read, LW link
(universalprior.substack.com)

NVIDIA and Microsoft releases 530B parameter transformer model, Megatron-Turing NLG

Ozyrus, 11 Oct 2021 15:28 UTC
51 points
36 comments, 1 min read, LW link
(developer.nvidia.com)

NLP Position Paper: When Combatting Hype, Proceed with Caution

Sam Bowman, 15 Oct 2021 20:57 UTC
46 points
15 comments, 1 min read, LW link

Forecasting progress in language models

28 Oct 2021 20:40 UTC
62 points
6 comments, 11 min read, LW link
(www.metaculus.com)

Deepmind’s Gopher—more powerful than GPT-3

hath, 8 Dec 2021 17:06 UTC
86 points
27 comments, 1 min read, LW link
(deepmind.com)

Teaser: Hard-coding Transformer Models

MadHatter, 12 Dec 2021 22:04 UTC
73 points
19 comments, 1 min read, LW link

Language Model Alignment Research Internships

Ethan Perez, 13 Dec 2021 19:53 UTC
74 points
1 comment, 1 min read, LW link

Understanding the tensor product formulation in Transformer Circuits

Tom Lieberum, 24 Dec 2021 18:05 UTC
16 points
2 comments, 3 min read, LW link

A one-question Turing test for GPT-3

22 Jan 2022 18:17 UTC
84 points
25 comments, 5 min read, LW link

[ASoT] Some thoughts about LM monologue limitations and ELK

leogao, 30 Mar 2022 14:26 UTC
10 points
0 comments, 2 min read, LW link

Procedurally evaluating factual accuracy: a request for research

Jacob_Hilton, 30 Mar 2022 16:37 UTC
24 points
2 comments, 6 min read, LW link

[Link] Training Compute-Optimal Large Language Models

nostalgebraist, 31 Mar 2022 18:01 UTC
51 points
23 comments, 1 min read, LW link
(arxiv.org)

Inflection AI: New startup related to language models

Nisan, 2 Apr 2022 5:35 UTC
21 points
1 comment, 1 min read, LW link

New Scaling Laws for Large Language Models

1a3orn, 1 Apr 2022 20:41 UTC
239 points
21 comments, 5 min read, LW link

How to train your transformer

p.b., 7 Apr 2022 9:34 UTC
6 points
0 comments, 8 min read, LW link

Language Model Tools for Alignment Research

Logan Riggs, 8 Apr 2022 17:32 UTC
28 points
0 comments, 2 min read, LW link

AMA Conjecture, A New Alignment Startup

adamShimi, 9 Apr 2022 9:43 UTC
47 points
42 comments, 1 min read, LW link

[Linkpost] New multi-modal Deepmind model fusing Chinchilla with images and videos

p.b., 30 Apr 2022 3:47 UTC
53 points
18 comments, 1 min read, LW link

Paper: Teaching GPT3 to express uncertainty in words

Owain_Evans, 31 May 2022 13:27 UTC
96 points
7 comments, 4 min read, LW link

Bootstrapping Language Models

harsimony, 27 May 2022 19:43 UTC
7 points
5 comments, 2 min read, LW link

Lamda is not an LLM

Kevin, 19 Jun 2022 11:13 UTC
7 points
10 comments, 1 min read, LW link
(www.wired.com)

Conditioning Generative Models

Adam Jermyn, 25 Jun 2022 22:15 UTC
23 points
18 comments, 10 min read, LW link

Assessing AlephAlphas Multimodal Model

p.b., 28 Jun 2022 9:28 UTC
30 points
5 comments, 3 min read, LW link

[Linkpost] Solving Quantitative Reasoning Problems with Language Models

Yitz, 30 Jun 2022 18:58 UTC
76 points
15 comments, 2 min read, LW link
(storage.googleapis.com)

Minerva

Algon, 1 Jul 2022 20:06 UTC
35 points
6 comments, 2 min read, LW link
(ai.googleblog.com)

Deep learning curriculum for large language model alignment

Jacob_Hilton, 13 Jul 2022 21:58 UTC
53 points
3 comments, 1 min read, LW link
(github.com)

Conditioning Generative Models for Alignment

Jozdien, 18 Jul 2022 7:11 UTC
52 points
8 comments, 20 min read, LW link

[Question] Impact of “ ‘Let’s think step by step’ is all you need”?

yrimon, 24 Jul 2022 20:59 UTC
20 points
2 comments, 1 min read, LW link

chinchilla’s wild implications

nostalgebraist, 31 Jul 2022 1:18 UTC
393 points
116 comments, 11 min read, LW link

Emergent Abilities of Large Language Models [Linkpost]

aogara, 10 Aug 2022 18:02 UTC
25 points
2 comments, 1 min read, LW link
(arxiv.org)

Language models seem to be much better than humans at next-token prediction

11 Aug 2022 17:45 UTC
164 points
57 comments, 13 min read, LW link

A little playing around with Blenderbot3

Nathan Helm-Burger, 12 Aug 2022 16:06 UTC
9 points
0 comments, 1 min read, LW link

[Question] Are language models close to the superhuman level in philosophy?

Roman Leventov, 19 Aug 2022 4:43 UTC
5 points
2 comments, 2 min read, LW link

A Test for Language Model Consciousness

Ethan Perez, 25 Aug 2022 19:41 UTC
17 points
14 comments, 10 min read, LW link

Strategy For Conditioning Generative Models

1 Sep 2022 4:34 UTC
31 points
4 comments, 18 min read, LW link

AlexaTM − 20 Billion Parameter Model With Impressive Performance

ViktorThink, 9 Sep 2022 21:46 UTC
5 points
0 comments, 1 min read, LW link

Sparse trinary weighted RNNs as a path to better language model interpretability

Am8ryllis, 17 Sep 2022 19:48 UTC
19 points
13 comments, 3 min read, LW link

Steering Behaviour: Testing for (Non-)Myopia in Language Models

5 Dec 2022 20:28 UTC
38 points
17 comments, 10 min read, LW link

Did ChatGPT just gaslight me?

ThomasW, 1 Dec 2022 5:41 UTC
124 points
45 comments, 9 min read, LW link
(aiwatchtower.substack.com)

Chat GPT’s views on Metaphysics and Ethics

Cole Killian, 3 Dec 2022 18:12 UTC
5 points
3 comments, 1 min read, LW link
(twitter.com)

[Question] Does a LLM have a utility function?

Dagon, 9 Dec 2022 17:19 UTC
16 points
11 comments, 1 min read, LW link

Discovering Latent Knowledge in Language Models Without Supervision

Xodarap, 14 Dec 2022 12:32 UTC
45 points
1 comment, 1 min read, LW link
(arxiv.org)

Take 11: “Aligning language models” should be weirder.

Charlie Steiner, 18 Dec 2022 14:14 UTC
31 points
0 comments, 2 min read, LW link

Discovering Language Model Behaviors with Model-Written Evaluations

20 Dec 2022 20:08 UTC
91 points
33 comments, 1 min read, LW link
(www.anthropic.com)

Podcast: Tamera Lanham on AI risk, threat models, alignment proposals, externalized reasoning oversight, and working at Anthropic

Akash, 20 Dec 2022 21:39 UTC
18 points
2 comments, 11 min read, LW link

Mlyyrczo

lsusr, 26 Dec 2022 7:58 UTC
38 points
14 comments, 3 min read, LW link

‘simulator’ framing and confusions about LLMs

Beth Barnes, 31 Dec 2022 23:38 UTC
97 points
11 comments, 4 min read, LW link

Proposal for Inducing Steganography in LMs

Logan Riggs, 12 Jan 2023 22:15 UTC
18 points
2 comments, 2 min read, LW link

[Linkpost] Scaling Laws for Generative Mixed-Modal Language Models

Amal, 12 Jan 2023 14:24 UTC
15 points
2 comments, 1 min read, LW link
(arxiv.org)

[Question] Basic Question about LLMs: how do they know what task to perform

Garak, 14 Jan 2023 13:13 UTC
1 point
3 comments, 1 min read, LW link

Understanding the diffusion of large language models: summary

Ben Cottier, 16 Jan 2023 1:37 UTC
26 points
1 comment, 1 min read, LW link

Language models can generate superior text compared to their input

ChristianKl, 17 Jan 2023 10:57 UTC
46 points
28 comments, 1 min read, LW link

Thoughts on refusing harmful requests to large language models

William_S, 19 Jan 2023 19:49 UTC
30 points
4 comments, 2 min read, LW link

Conditioning Predictive Models: Large language models as predictors

2 Feb 2023 20:28 UTC
82 points
4 comments, 13 min read, LW link

Conditioning Predictive Models: Outer alignment via careful conditioning

2 Feb 2023 20:28 UTC
72 points
13 comments, 53 min read, LW link

Conditioning Predictive Models: The case for competitiveness

6 Feb 2023 20:08 UTC
20 points
0 comments, 11 min read, LW link

SolidGoldMagikarp II: technical details and more recent findings

6 Feb 2023 19:09 UTC
101 points
44 comments, 13 min read, LW link

LLM Basics: Embedding Spaces—Transformer Token Vectors Are Not Points in Space

NickyP, 13 Feb 2023 18:52 UTC
42 points
9 comments, 15 min read, LW link

Conditioning Predictive Models: Interactions with other approaches

8 Feb 2023 18:19 UTC
29 points
2 comments, 11 min read, LW link

Notes on the Mathematics of LLM Architectures

Spencer Becker-Kahn, 9 Feb 2023 1:45 UTC
10 points
2 comments, 1 min read, LW link
(drive.google.com)

SolidGoldMagikarp (plus, prompt generation)

5 Feb 2023 22:02 UTC
646 points
194 comments, 12 min read, LW link

Conditioning Predictive Models: Deployment strategy

9 Feb 2023 20:59 UTC
25 points
0 comments, 10 min read, LW link

In Defense of Chatbot Romance

Kaj_Sotala, 11 Feb 2023 14:30 UTC
116 points
40 comments, 11 min read, LW link
(kajsotala.fi)

[Question] Is InstructGPT Following Instructions in Other Languages Surprising?

DragonGod, 13 Feb 2023 23:26 UTC
39 points
15 comments, 1 min read, LW link

Bing Chat is blatantly, aggressively misaligned

evhub, 15 Feb 2023 5:29 UTC
390 points
164 comments, 2 min read, LW link

What do language models know about fictional characters?

skybrian, 22 Feb 2023 5:58 UTC
6 points
0 comments, 4 min read, LW link

Meta “open sources” LMs competitive with Chinchilla, PaLM, and code-davinci-002 (Paper)

LawrenceC, 24 Feb 2023 19:57 UTC
38 points
19 comments, 1 min read, LW link
(research.facebook.com)

A Proposed Test to Determine the Extent to Which Large Language Models Understand the Real World

Bruce G, 24 Feb 2023 20:20 UTC
4 points
7 comments, 8 min read, LW link

Evil autocomplete: Existential Risk and Next-Token Predictors

Yitz, 28 Feb 2023 8:47 UTC
9 points
4 comments, 5 min read, LW link

Philosophy of language should be used to address LLM intelligence

sun_harmonics, 1 Mar 2023 18:38 UTC
2 points
0 comments, 1 min read, LW link

The Waluigi Effect (mega-post)

Cleo Nardo, 3 Mar 2023 3:22 UTC
568 points
164 comments, 16 min read, LW link

Google’s PaLM-E: An Embodied Multimodal Language Model

SandXbox, 7 Mar 2023 4:11 UTC
86 points
7 comments, 1 min read, LW link
(palm-e.github.io)

Language models are not inherently safe

Loppukilpailija, 7 Mar 2023 21:15 UTC
11 points
1 comment, 3 min read, LW link

[Question] Can Lesswrong test a LLM translation API for Japanese in time?

trevor, 13 Mar 2023 4:15 UTC
5 points
3 comments, 1 min read, LW link

GPT can write Quines now (GPT-4)

Andrew_Critch, 14 Mar 2023 19:18 UTC
97 points
29 comments, 1 min read, LW link

Nokens: A potential method of investigating glitch tokens

Hoagy, 15 Mar 2023 16:23 UTC
16 points
0 comments, 4 min read, LW link

[Question] Will 2023 be the last year you can write short stories and receive most of the intellectual credit for writing them?

lc, 16 Mar 2023 21:36 UTC
17 points
10 comments, 1 min read, LW link

Super-Luigi = Luigi + (Luigi—Waluigi)

Alexei, 17 Mar 2023 15:27 UTC
16 points
8 comments, 1 min read, LW link

What does it mean for an LLM such as GPT to be aligned / good / positive impact?

PashaKamyshev, 20 Mar 2023 9:21 UTC
4 points
3 comments, 10 min read, LW link

RLHF does not appear to differentially cause mode-collapse

20 Mar 2023 15:39 UTC
90 points
8 comments, 3 min read, LW link

Thoughts on the Alignment Implications of Scaling Language Models

leogao, 2 Jun 2021 21:32 UTC
82 points
11 comments, 17 min read, LW link

[AN #144]: How language models can also be finetuned for non-language tasks

Rohin Shah, 2 Apr 2021 17:20 UTC
19 points
0 comments, 6 min read, LW link
(mailchi.mp)

How truthful is GPT-3? A benchmark for language models

Owain_Evans, 16 Sep 2021 10:09 UTC
56 points
24 comments, 6 min read, LW link

[Question] How does OpenAI’s language model affect our AI timeline estimates?

jimrandomh, 15 Feb 2019 3:11 UTC
50 points
7 comments, 1 min read, LW link

Building AGI Using Language Models

leogao, 9 Nov 2020 16:33 UTC
11 points
1 comment, 1 min read, LW link
(leogao.dev)

Sufficiently Advanced Language Models Can Do Reinforcement Learning

Zachary Robertson, 2 Aug 2020 15:32 UTC
21 points
7 comments, 7 min read, LW link

The case for aligning narrowly superhuman models

Ajeya Cotra, 5 Mar 2021 22:29 UTC
190 points
75 comments, 38 min read, LW link, 1 review

The Codex Skeptic FAQ

Michaël Trazzi, 24 Aug 2021 16:01 UTC
49 points
24 comments, 2 min read, LW link

On language modeling and future abstract reasoning research

alexlyzhov, 25 Mar 2021 17:43 UTC
3 points
1 comment, 1 min read, LW link
(docs.google.com)

Agentic Language Model Memes

FactorialCode, 1 Aug 2020 18:03 UTC
16 points
1 comment, 2 min read, LW link

Structured Tasks for Language Models

Zachary Robertson, 29 Jul 2020 14:17 UTC
5 points
0 comments, 1 min read, LW link

[AN #164]: How well can language models write code?

Rohin Shah, 15 Sep 2021 17:20 UTC
13 points
7 comments, 9 min read, LW link
(mailchi.mp)

[AN #113]: Checking the ethical intuitions of large language models

Rohin Shah, 19 Aug 2020 17:10 UTC
23 points
0 comments, 9 min read, LW link
(mailchi.mp)

New GPT-3 competitor

Quintin Pope, 12 Aug 2021 7:05 UTC
32 points
10 comments, 1 min read, LW link

OpenAI Codex: First Impressions

specbug, 13 Aug 2021 16:52 UTC
49 points
8 comments, 4 min read, LW link
(sixeleven.in)

AMA on Truthful AI: Owen Cotton-Barratt, Owain Evans & co-authors

Owain_Evans, 22 Oct 2021 16:23 UTC
31 points
15 comments, 1 min read, LW link

Truthful and honest AI

29 Oct 2021 7:28 UTC
41 points
1 comment, 13 min read, LW link

larger language models may disappoint you [or, an eternally unfinished draft]

nostalgebraist, 26 Nov 2021 23:08 UTC
251 points
31 comments, 31 min read, LW link, 2 reviews

Hard-Coding Neural Computation

MadHatter, 13 Dec 2021 4:35 UTC
32 points
8 comments, 27 min read, LW link

Evidence Sets: Towards Inductive-Biases based Analysis of Prosaic AGI

bayesian_kitten, 16 Dec 2021 22:41 UTC
22 points
10 comments, 21 min read, LW link

GPT-3: a disappointing paper

nostalgebraist, 29 May 2020 19:06 UTC
65 points
44 comments, 8 min read, LW link, 1 review

A Summary Of Anthropic’s First Paper

Sam Ringer, 30 Dec 2021 0:48 UTC
82 points
1 comment, 8 min read, LW link

How I’m thinking about GPT-N

delton137, 17 Jan 2022 17:11 UTC
47 points
21 comments, 18 min read, LW link

Extrapolating GPT-N performance

Lanrian, 18 Dec 2020 21:41 UTC
103 points
31 comments, 25 min read, LW link, 1 review

2+2: Ontological Framework

Lyrialtus, 1 Feb 2022 1:07 UTC
−15 points
2 comments, 12 min read, LW link

EleutherAI’s GPT-NeoX-20B release

leogao, 10 Feb 2022 6:56 UTC
30 points
3 comments, 1 min read, LW link
(eaidata.bmk.sh)

New GPT3 Impressive Capabilities—InstructGPT3 [1/2]

simeon_c, 13 Mar 2022 10:58 UTC
72 points
10 comments, 7 min read, LW link

Gears-Level Mental Models of Transformer Interpretability

KevinRoWang, 29 Mar 2022 20:09 UTC
60 points
4 comments, 6 min read, LW link

My agenda for research into transformer capabilities—Introduction

p.b., 5 Apr 2022 21:23 UTC
11 points
1 comment, 3 min read, LW link

Research agenda: Can transformers do system 2 thinking?

p.b., 6 Apr 2022 13:31 UTC
20 points
0 comments, 2 min read, LW link

PaLM in “Extrapolating GPT-N performance”

Lanrian, 6 Apr 2022 13:05 UTC
83 points
19 comments, 2 min read, LW link

Research agenda—Building a multi-modal chess-language model

p.b., 7 Apr 2022 12:25 UTC
8 points
2 comments, 2 min read, LW link

Is GPT3 a Good Rationalist? - InstructGPT3 [2/2]

simeon_c, 7 Apr 2022 13:46 UTC
11 points
0 comments, 7 min read, LW link

Elicit: Language Models as Research Assistants

9 Apr 2022 14:56 UTC
70 points
7 comments, 13 min read, LW link

[Question] “Fragility of Value” vs. LLMs

Not Relevant, 13 Apr 2022 2:02 UTC
32 points
32 comments, 1 min read, LW link

Why Copilot Accelerates Timelines

Michaël Trazzi, 26 Apr 2022 22:06 UTC
35 points
14 comments, 7 min read, LW link

A possible check against motivated reasoning using elicit.org

david reinstein, 18 May 2022 20:52 UTC
4 points
0 comments, 1 min read, LW link

RL with KL penalties is better seen as Bayesian inference

25 May 2022 9:23 UTC
94 points
15 comments, 12 min read, LW link

QNR prospects are important for AI alignment research

Eric Drexler, 3 Feb 2022 15:20 UTC
82 points
10 comments, 11 min read, LW link

Who models the models that model models? An exploration of GPT-3’s in-context model fitting ability

Lovre, 7 Jun 2022 19:37 UTC
112 points
15 comments, 9 min read, LW link

[linkpost] The final AI benchmark: BIG-bench

RomanS, 10 Jun 2022 8:53 UTC
25 points
21 comments, 1 min read, LW link

Investigating causal understanding in LLMs

14 Jun 2022 13:57 UTC
28 points
6 comments, 13 min read, LW link

Contra Hofstadter on GPT-3 Nonsense

rictic, 15 Jun 2022 21:53 UTC
235 points
22 comments, 2 min read, LW link

Causal confusion as an argument against the scaling hypothesis

20 Jun 2022 10:54 UTC
84 points
30 comments, 18 min read, LW link

Yann LeCun, A Path Towards Autonomous Machine Intelligence [link]

Bill Benzon, 27 Jun 2022 23:29 UTC
5 points
1 comment, 1 min read, LW link

Announcing the Inverse Scaling Prize ($250k Prize Pool)

27 Jun 2022 15:58 UTC
168 points
14 comments, 7 min read, LW link

GPT-3 Catching Fish in Morse Code

Megan Kinniment, 30 Jun 2022 21:22 UTC
114 points
27 comments, 8 min read, LW link

Training goals for large language models

Johannes Treutlein, 18 Jul 2022 7:09 UTC
28 points
5 comments, 19 min read, LW link

Help ARC evaluate capabilities of current language models (still need people)

Beth Barnes, 19 Jul 2022 4:55 UTC
96 points
6 comments, 2 min read, LW link

Conditioning Generative Models with Restrictions

Adam Jermyn, 21 Jul 2022 20:33 UTC
17 points
4 comments, 8 min read, LW link

Externalized reasoning oversight: a research direction for language model alignment

tamera, 3 Aug 2022 12:03 UTC
106 points
22 comments, 6 min read, LW link

Transformer language models are doing something more general

Numendil, 3 Aug 2022 21:13 UTC
53 points
6 comments, 2 min read, LW link

Conditioning, Prompts, and Fine-Tuning

Adam Jermyn, 17 Aug 2022 20:52 UTC
37 points
9 comments, 4 min read, LW link

Google AI integrates PaLM with robotics: SayCan update [Linkpost]

Evan R. Murphy, 24 Aug 2022 20:54 UTC
25 points
0 comments, 1 min read, LW link
(sites.research.google)

Is training data going to be diluted by AI-generated content?

Hannes Thurnherr, 7 Sep 2022 18:13 UTC
10 points
7 comments, 1 min read, LW link

How should DeepMind’s Chinchilla revise our AI forecasts?

Cleo Nardo, 15 Sep 2022 17:54 UTC
35 points
12 comments, 13 min read, LW link

Takeaways from our robust injury classifier project [Redwood Research]

dmz, 17 Sep 2022 3:55 UTC
137 points
10 comments, 6 min read, LW link

[Question] If we have Human-level chatbots, won’t we end up being ruled by possible people?

Erlja Jkdf., 20 Sep 2022 13:59 UTC
5 points
13 comments, 1 min read, LW link

An Unexpected GPT-3 Decision in a Simple Gamble

hatta_afiq, 25 Sep 2022 16:46 UTC
8 points
4 comments, 1 min read, LW link

Recall and Regurgitation in GPT2

Megan Kinniment, 3 Oct 2022 19:35 UTC
41 points
1 comment, 26 min read, LW link

Brief Notes on Transformers

Adam Jermyn, 26 Sep 2022 14:46 UTC
33 points
2 comments, 2 min read, LW link

Paper: Large Language Models Can Self-improve [Linkpost]

Evan R. Murphy, 2 Oct 2022 1:29 UTC
52 points
14 comments, 1 min read, LW link
(openreview.net)

Smoke without fire is scary

Adam Jermyn, 4 Oct 2022 21:08 UTC
49 points
22 comments, 4 min read, LW link

They gave LLMs access to physics simulators

ryan_b, 17 Oct 2022 21:21 UTC
50 points
18 comments, 1 min read, LW link
(arxiv.org)

Is GPT-N bounded by human capabilities? No.

Cleo Nardo, 17 Oct 2022 23:26 UTC
38 points
7 comments, 2 min read, LW link

Learning societal values from law as part of an AGI alignment strategy

John Nay, 21 Oct 2022 2:03 UTC
3 points
18 comments, 54 min read, LW link

What will the scaled up GATO look like? (Updated with questions)

Amal, 25 Oct 2022 12:44 UTC
34 points
20 comments, 1 min read, LW link

[simulation] 4chan user claiming to be the attorney hired by Google’s sentient chatbot LaMDA shares wild details of encounter

janus, 10 Nov 2022 21:39 UTC
12 points
1 comment, 13 min read, LW link
(generative.ink)

Human-level Full-Press Diplomacy (some bare facts).

Cleo Nardo, 22 Nov 2022 20:59 UTC
50 points
7 comments, 3 min read, LW link

Gliders in Language Models

Alexandre Variengien, 25 Nov 2022 0:38 UTC
29 points
11 comments, 10 min read, LW link

[ASoT] Finetuning, RL, and GPT’s world prior

Jozdien, 2 Dec 2022 16:33 UTC
40 points
8 comments, 5 min read, LW link

[Question] Will the first AGI agent have been designed as an agent (in addition to an AGI)?

nahoj, 3 Dec 2022 20:32 UTC
1 point
8 comments, 1 min read, LW link

Is the “Valley of Confused Abstractions” real?

jacquesthibs, 5 Dec 2022 13:36 UTC
19 points
10 comments, 2 min read, LW link

Shh, don’t tell the AI it’s likely to be evil

naterush, 6 Dec 2022 3:35 UTC
19 points
9 comments, 1 min read, LW link

Prosaic misalignment from the Solomonoff Predictor

Cleo Nardo, 9 Dec 2022 17:53 UTC
35 points
2 comments, 5 min read, LW link

A brainteaser for language models

Adam Scherlis, 12 Dec 2022 2:43 UTC
46 points
3 comments, 2 min read, LW link

An exploration of GPT-2’s embedding weights

Adam Scherlis, 13 Dec 2022 0:46 UTC
38 points
2 comments, 10 min read, LW link

Extracting and Evaluating Causal Direction in LLMs’ Activations

14 Dec 2022 14:33 UTC
28 points
5 comments, 11 min read, LW link

Properties of current AIs and some predictions of the evolution of AI from the perspective of scale-free theories of agency and regulative development

Roman Leventov, 20 Dec 2022 17:13 UTC
26 points
2 comments, 36 min read, LW link

Notes on Meta’s Diplomacy-Playing AI

Erich_Grunewald, 22 Dec 2022 11:34 UTC
7 points
2 comments, 14 min read, LW link
(www.erichgrunewald.com)

The Limit of Language Models

DragonGod, 6 Jan 2023 23:53 UTC
40 points
26 comments, 4 min read, LW link

How evolutionary lineages of LLMs can plan their own future and act on these plans

Roman Leventov, 25 Dec 2022 18:11 UTC
26 points
15 comments, 8 min read, LW link

Recent advances in Natural Language Processing—Some Woolly speculations (2019 essay on semantics and language models)

philosophybear, 27 Dec 2022 2:11 UTC
1 point
0 comments, 7 min read, LW link

Some Arguments Against Strong Scaling

Joar Skalse, 13 Jan 2023 12:04 UTC
25 points
21 comments, 16 min read, LW link

Large language models can provide “normative assumptions” for learning human preferences

Stuart_Armstrong, 2 Jan 2023 19:39 UTC
29 points
12 comments, 3 min read, LW link

MAKE IT BETTER (a poetic demonstration of the banality of GPT-3)

rogersbacon, 2 Jan 2023 20:47 UTC
6 points
2 comments, 5 min read, LW link

On the naturalistic study of the linguistic behavior of artificial intelligence

Bill Benzon, 3 Jan 2023 9:06 UTC
1 point
0 comments, 4 min read, LW link

Whisper’s Wild Implications

Ollie J, 3 Jan 2023 12:17 UTC
15 points
6 comments, 5 min read, LW link

How it feels to have your mind hacked by an AI

blaked, 12 Jan 2023 0:33 UTC
332 points
215 comments, 17 min read, LW link

Speculation on Path-Dependance in Large Language Models.

NickyP, 15 Jan 2023 20:42 UTC
15 points
2 comments, 7 min read, LW link

Critique of some recent philosophy of LLMs’ minds

Roman Leventov, 20 Jan 2023 12:53 UTC
49 points
8 comments, 20 min read, LW link

Emotional attachment to AIs opens doors to problems

Igor Ivanov, 22 Jan 2023 20:28 UTC
20 points
9 comments, 4 min read, LW link

ChatGPT intimates a tantalizing future; its core LLM is organized on multiple levels; and it has broken the idea of thinking.

Bill Benzon, 24 Jan 2023 19:05 UTC
5 points
0 comments, 5 min read, LW link

Inner Misalignment in “Simulator” LLMs

Adam Scherlis, 31 Jan 2023 8:33 UTC
84 points
11 comments, 4 min read, LW link

Early situational awareness and its implications, a story

Jacob Pfau, 6 Feb 2023 20:45 UTC
20 points
6 comments, 3 min read, LW link

Two very different experiences with ChatGPT

Sherrinford, 7 Feb 2023 13:09 UTC
38 points
15 comments, 5 min read, LW link

On The Current Status Of AI Dating

Nikita Brancatisano, 7 Feb 2023 20:00 UTC
52 points
7 comments, 6 min read, LW link

A note on ‘semiotic physics’

metasemi, 11 Feb 2023 5:12 UTC
11 points
12 comments, 6 min read, LW link

A poem co-written by ChatGPT

Sherrinford, 16 Feb 2023 10:17 UTC
13 points
0 comments, 7 min read, LW link

Powerful mesa-optimisation is already here

Roman Leventov, 17 Feb 2023 4:59 UTC
33 points
0 comments, 2 min read, LW link
(arxiv.org)

Bing chat is the AI fire alarm

Ratios, 17 Feb 2023 6:51 UTC
111 points
60 comments, 3 min read, LW link

Microsoft and OpenAI, stop telling chatbots to roleplay as AI

hold_my_fish, 17 Feb 2023 19:55 UTC
42 points
9 comments, 1 min read, LW link

GPT-4 Predictions

Stephen McAleese, 17 Feb 2023 23:20 UTC
107 points
25 comments, 11 min read, LW link

Stop posting prompt injections on Twitter and calling it “misalignment”

lc, 19 Feb 2023 2:21 UTC
135 points
9 comments, 1 min read, LW link

Sydney the Bingenator Can’t Think, But It Still Threatens People

Valentin Baltadzhiev, 20 Feb 2023 18:37 UTC
−3 points
2 comments, 8 min read, LW link

The idea that ChatGPT is simply “predicting” the next word is, at best, misleading

Bill Benzon, 20 Feb 2023 11:32 UTC
55 points
86 comments, 5 min read, LW link

Pretraining Language Models with Human Preferences

21 Feb 2023 17:57 UTC
129 points
16 comments, 11 min read, LW link

[Preprint] Pretraining Language Models with Human Preferences

thesofakillers, 21 Feb 2023 11:44 UTC
12 points
0 comments, 1 min read, LW link
(arxiv.org)

[Question] Injecting noise to GPT to get multiple answers

bipolo, 22 Feb 2023 20:02 UTC
1 point
1 comment, 1 min read, LW link

Hello, Elua.

carado, 23 Feb 2023 5:19 UTC
35 points
19 comments, 4 min read, LW link
(carado.moe)

How truthful can LLMs be: a theoretical perspective with a request for help from experts on Theoretical CS

sergia, 1 Mar 2023 18:39 UTC
3 points
7 comments, 3 min read, LW link

Reflection Mechanisms as an Alignment Target—Attitudes on “near-term” AI

2 Mar 2023 4:29 UTC
20 points
0 comments, 8 min read, LW link

Situational awareness in Large Language Models

Simon Möller, 3 Mar 2023 18:59 UTC
21 points
1 comment, 7 min read, LW link

The View from 30,000 Feet: Preface to the Second EleutherAI Retrospective

7 Mar 2023 16:22 UTC
14 points
0 comments, 4 min read, LW link
(blog.eleuther.ai)

Against LLM Reductionism

Erich_Grunewald, 8 Mar 2023 15:52 UTC
126 points
16 comments, 18 min read, LW link
(www.erichgrunewald.com)

Stop calling it “jailbreaking” ChatGPT

Templarrr, 10 Mar 2023 11:41 UTC
10 points
9 comments, 2 min read, LW link

The issue of meaning in large language models (LLMs)

Bill Benzon, 11 Mar 2023 23:00 UTC
2 points
8 comments, 8 min read, LW link

ChatGPT (and now GPT4) is very easily distracted from its rules

dmcs, 15 Mar 2023 17:55 UTC
174 points
37 comments, 1 min read, LW link

Gradual takeoff, fast failure

Max H, 16 Mar 2023 22:02 UTC
10 points
4 comments, 5 min read, LW link

[Question] Are nested jailbreaks inevitable?

judson, 17 Mar 2023 17:43 UTC
1 point
0 comments, 1 min read, LW link

Instantiating an agent with GPT-4 and text-davinci-003

Max H, 19 Mar 2023 23:57 UTC
10 points
3 comments, 32 min read, LW link

Emergent Analogical Reasoning in Large Language Models

Roman Leventov, 22 Mar 2023 5:18 UTC
13 points
2 comments, 1 min read, LW link
(arxiv.org)

A crazy hypothesis: GPT-4 already is agentic and is trying to take over the world!

Christopher King, 24 Mar 2023 1:19 UTC
2 points
11 comments, 9 min read, LW link

Does GPT-4 exhibit agency when summarizing articles?

Christopher King, 24 Mar 2023 15:49 UTC
16 points
1 comment, 5 min read, LW link

More experiments in GPT-4 agency: writing memos

Christopher King, 24 Mar 2023 17:51 UTC
10 points
2 comments, 10 min read, LW link

GPT-4 aligning with acasual decision theory when instructed to play games, but includes a CDT explanation that’s incorrect if they differ

Christopher King, 23 Mar 2023 16:16 UTC
6 points
2 comments, 8 min read, LW link

Hutter-Prize for Prompts

rokosbasilisk, 24 Mar 2023 21:26 UTC
2 points
2 comments, 1 min read, LW link