GPT

Tag

Last edit: Feb 19, 2023, 2:36 AM by Multicore

GPT (Generative Pretrained Transformer) is a family of large transformer-based language models created by OpenAI. Their ability to generate remarkably human-like text makes them relevant to discussions of AGI.

External links:

GPT-3 Paper

GPT-3 Website

Collection of GPT-3 results

Kaj_SotalaJul 18, 2020, 8:04 PM

89 points

24 comments1 min readLW link

(twitter.com)

[Question] To what extent is GPT-3 capable of reasoning?

TurnTroutJul 20, 2020, 5:10 PM

70 points

73 comments16 min readLW link

GPT-3: a disappointing paper

nostalgebraistMay 29, 2020, 7:06 PM

65 points

43 comments8 min readLW link 1 review

$1000 bounty for OpenAI to show whether GPT3 was “deliberately” pretending to be stupider than it is

Bird ConceptJul 21, 2020, 6:42 PM

56 points

39 comments2 min readLW link

(twitter.com)

Two Small Experiments on GPT-2

jimrandomhFeb 21, 2019, 2:59 AM

54 points

28 comments1 min readLW link

GPT-3 Fiction Samples

gwernJun 25, 2020, 4:12 PM

63 points

15 comments1 min readLW link

(www.gwern.net)

Does GPT-2 Understand Anything?

Douglas Summers-StayJan 2, 2020, 5:09 PM

37 points

23 comments5 min readLW link

‘This Waifu Does Not Exist’: 100,000 StyleGAN & GPT-2 samples

gwernMar 1, 2019, 4:29 AM

39 points

6 comments1 min readLW link

(www.thiswaifudoesnotexist.net)

Alignment As A Bottleneck To Usefulness Of GPT-3

johnswentworthJul 21, 2020, 8:02 PM

111 points

57 comments3 min readLW link

345M version GPT-2 released

lifelonglearnerMay 5, 2019, 2:49 AM

37 points

0 comments1 min readLW link

(openai.com)

[Question] How “honest” is GPT-3?

abramdemskiJul 8, 2020, 7:38 PM

72 points

18 comments5 min readLW link

[Question] How well can the GPT architecture solve the parity task?

FactorialCodeJul 11, 2020, 7:02 PM

19 points

3 comments1 min readLW link

Replicating the replication crisis with GPT-3?

skybrianJul 22, 2020, 9:20 PM

29 points

10 comments1 min readLW link

Can you get AGI from a Transformer?

Steven ByrnesJul 23, 2020, 3:27 PM

117 points

40 comments12 min readLW link

larger language models may disappoint you [or, an eternally unfinished draft]

nostalgebraistNov 26, 2021, 11:08 PM

260 points

31 comments31 min readLW link 2 reviews

Developmental Stages of GPTs

orthonormalJul 26, 2020, 10:03 PM

140 points

72 comments7 min readLW link 1 review

OpenGPT-2: We Replicated GPT-2 Because You Can Too

avturchinAug 23, 2019, 11:32 AM

18 points

0 comments1 min readLW link

(medium.com)

Humans Who Are Not Concentrating Are Not General Intelligences

sarahconstantinFeb 25, 2019, 8:40 PM

191 points

35 comments6 min readLW link 1 review

(srconstantin.wordpress.com)

Analyzing the Problem GPT-3 is Trying to Solve

adamShimiAug 6, 2020, 9:58 PM

16 points

2 comments4 min readLW link

Are we in an AI overhang?

Andy JonesJul 27, 2020, 12:48 PM

266 points

106 comments4 min readLW link

Writing with GPT-3

Jacob FalkovichJul 24, 2020, 3:22 PM

42 points

0 comments4 min readLW link

[Question] How will internet forums like LW be able to defend against GPT-style spam?

ChristianKlJul 28, 2020, 8:12 PM

14 points

17 comments1 min readLW link

GPT-3, belief, and consistency

skybrianAug 16, 2020, 11:12 PM

18 points

7 comments2 min readLW link

The Hacker Learns to Trust

Ben PaceJun 22, 2019, 12:27 AM

80 points

18 comments8 min readLW link

(medium.com)

GPT-2: 6-Month Follow-Up

lifelonglearnerAug 21, 2019, 5:06 AM

28 points

1 comment1 min readLW link

How LLMs are and are not myopic

janusJul 25, 2023, 2:19 AM

135 points

16 comments8 min readLW link

the scaling “inconsistency”: openAI’s new insight

nostalgebraistNov 7, 2020, 7:40 AM

148 points

14 comments9 min readLW link

(nostalgebraist.tumblr.com)

[Question] Will OpenAI’s work unintentionally increase existential risks related to AI?

adamShimiAug 11, 2020, 6:16 PM

53 points

55 comments1 min readLW link

Hiring engineers and researchers to help align GPT-3

paulfchristianoOct 1, 2020, 6:54 PM

206 points

13 comments3 min readLW link

Using GPT-N to Solve Interpretability of Neural Networks: A Research Agenda

Logan Riggs and Gurkenglas

Sep 3, 2020, 6:27 PM

68 points

11 comments2 min readLW link

[Question] If GPT-6 is human-level AGI but costs $200 per page of output, what would happen?

Daniel KokotajloOct 9, 2020, 12:00 PM

29 points

30 comments1 min readLW link

OpenAI announces GPT-3

gwernMay 29, 2020, 1:49 AM

67 points

23 comments1 min readLW link

(arxiv.org)

Image GPT

Daniel KokotajloJun 18, 2020, 11:41 AM

29 points

27 comments1 min readLW link

(openai.com)

Extrapolating GPT-N performance

Lukas FinnvedenDec 18, 2020, 9:41 PM

112 points

31 comments22 min readLW link 1 review

Scaffolded LLMs as natural language computers

berenApr 12, 2023, 10:47 AM

95 points

10 comments11 min readLW link

[ASoT] Finetuning, RL, and GPT’s world prior

JozdienDec 2, 2022, 4:33 PM

45 points

8 comments5 min readLW link

[AN #102]: Meta learning by GPT-3, and a list of full proposals for AI alignment

Rohin ShahJun 3, 2020, 5:20 PM

38 points

6 comments10 min readLW link

(mailchi.mp)

GPT-4 Plugs In

ZviMar 27, 2023, 12:10 PM

198 points

47 comments6 min readLW link

(thezvi.wordpress.com)

[April Fools] User GPT2 is Banned

jimrandomhApr 2, 2019, 6:00 AM

65 points

20 comments1 min readLW link

interpreting GPT: the logit lens

nostalgebraistAug 31, 2020, 2:47 AM

230 points

38 comments10 min readLW link

Simulators

janusSep 2, 2022, 12:45 PM

633 points

168 comments41 min readLW link 8 reviews

(generative.ink)

GPT-4 Predictions

Stephen McAleeseFeb 17, 2023, 11:20 PM

110 points

27 comments11 min readLW link

is gpt-3 few-shot ready for real applications?

nostalgebraistAug 3, 2020, 7:50 PM

31 points

5 comments9 min readLW link

(nostalgebraist.tumblr.com)

Cyborgism

NicholasKees and janus

Feb 10, 2023, 2:47 PM

332 points

46 comments35 min readLW link 2 reviews

Autoregressive Propaganda

lsusrAug 22, 2021, 2:18 AM

25 points

3 comments3 min readLW link

DALL-E by OpenAI

Daniel KokotajloJan 5, 2021, 8:05 PM

97 points

20 comments1 min readLW link

Can submarines swim?

jasoncrawfordFeb 22, 2023, 6:48 PM

18 points

14 comments13 min readLW link

(rootsofprogress.org)

Predictions for GPT-N

hippkeJul 29, 2020, 1:16 AM

36 points

31 comments1 min readLW link

I wanted to interview Eliezer Yudkowsky but he’s busy so I simulated him instead

lsusrSep 16, 2021, 7:34 AM

113 points

33 comments5 min readLW link

[Question] GPT-4 and ASCII Images?

carterallenMar 19, 2023, 3:46 PM

10 points

17 comments1 min readLW link

[Question] If AI is based on GPT, how to ensure its safety?

avturchinJun 18, 2020, 8:33 PM

20 points

11 comments1 min readLW link

The ‘ petertodd’ phenomenon

mwatkinsApr 15, 2023, 12:59 AM

192 points

50 comments38 min readLW link 1 review

GPT-4 can catch subtle cross-language translation mistakes

Michael TontchevJul 27, 2023, 1:39 AM

7 points

1 comment1 min readLW link

GPT-3 and concept extrapolation

Stuart_ArmstrongApr 20, 2022, 10:39 AM

19 points

27 comments1 min readLW link

Show, not tell: GPT-4o is more opinionated in images than in text

Daniel Tan and eggsyntax

Apr 2, 2025, 8:51 AM

103 points

41 comments3 min readLW link

Sparks of Artificial General Intelligence: Early experiments with GPT-4 | Microsoft Research

DragonGodMar 23, 2023, 5:45 AM

68 points

23 comments1 min readLW link

(arxiv.org)

Paper: On measuring situational awareness in LLMs

Owain_Evans, Daniel Kokotajlo, Mikita Balesni, Tomek Korbak, Asa Cooper Stickland, Meg and Maximilian Kaufmann

Sep 4, 2023, 12:54 PM

109 points

16 comments5 min readLW link

(arxiv.org)

PaLM-2 & GPT-4 in “Extrapolating GPT-N performance”

Lukas FinnvedenMay 30, 2023, 6:33 PM

57 points

6 comments6 min readLW link

Remarks 1–18 on GPT (compressed)

Cleo NardoMar 20, 2023, 10:27 PM

145 points

35 comments31 min readLW link

Exploring the petertodd / Leilan duality in GPT-2 and GPT-J

mwatkinsDec 23, 2024, 1:17 PM

12 points

1 comment17 min readLW link

Trying out Prompt Engineering on TruthfulQA

Megan KinnimentJul 23, 2022, 2:04 AM

10 points

0 comments8 min readLW link

[Question] ChatGTP “Writing ” News Stories for The Guardian?

jmhApr 7, 2023, 12:16 PM

1 point

4 comments1 min readLW link

Steering GPT-2-XL by adding an activation vector

TurnTrout, Monte M, David Udell, lisathiergart and Ulisse Mini

May 13, 2023, 6:42 PM

437 points

98 comments50 min readLW link 1 review

[Question] List of public predictions of what GPT-X can or can’t do?

Daniel KokotajloJun 14, 2020, 2:25 PM

20 points

9 comments1 min readLW link

Analysis of GPT-4 competence in assessing complex legal language: Example of Bill C-11 of the Canadian Parliament. - Part 1

M. Y. ZuoApr 2, 2023, 12:01 AM

12 points

2 comments14 min readLW link

Why Simulator AIs want to be Active Inference AIs

Jan_Kulveit and rosehadshar

Apr 10, 2023, 6:23 PM

95 points

9 comments8 min readLW link 1 review

The idea that ChatGPT is simply “predicting” the next word is, at best, misleading

Bill BenzonFeb 20, 2023, 11:32 AM

55 points

88 comments5 min readLW link

[Question] Will 2023 be the last year you can write short stories and receive most of the intellectual credit for writing them?

lcMar 16, 2023, 9:36 PM

20 points

11 comments1 min readLW link

On GPT-4.5

ZviMar 3, 2025, 1:40 PM

44 points

12 comments22 min readLW link

(thezvi.wordpress.com)

[LINK] - ChatGPT discussion

JanBDec 1, 2022, 3:04 PM

13 points

8 comments1 min readLW link

(openai.com)

[Question] Is GPT-3 already sample-efficient?

Daniel KokotajloOct 6, 2021, 1:38 PM

36 points

32 comments1 min readLW link

[Question] Why is o1 so deceptive?

abramdemskiSep 27, 2024, 5:27 PM

180 points

24 comments3 min readLW link

Exploring GPT4′s world model

hippkeMar 20, 2023, 9:31 PM

−5 points

5 comments2 min readLW link

On OpenAI Dev Day

ZviNov 9, 2023, 4:10 PM

60 points

0 comments15 min readLW link

(thezvi.wordpress.com)

A simple way to make GPT-3 follow instructions

Quintin PopeMar 8, 2021, 2:57 AM

11 points

5 comments4 min readLW link

NVIDIA and Microsoft releases 530B parameter transformer model, Megatron-Turing NLG

OzyrusOct 11, 2021, 3:28 PM

51 points

36 comments1 min readLW link

(developer.nvidia.com)

Reader-generated Essays

Henrik KarlssonJan 3, 2022, 8:56 AM

25 points

1 comment6 min readLW link

(escapingflatland.substack.com)

Getting 50% (SoTA) on ARC-AGI with GPT-4o

ryan_greenblattJun 17, 2024, 6:44 PM

263 points

50 comments13 min readLW link

Linear encoding of character-level information in GPT-J token embeddings

mwatkins and Joseph Bloom

Nov 10, 2023, 10:19 PM

34 points

4 comments28 min readLW link

Agentic GPT simulations: a risk and an opportunity

Yair HalberstadtMar 22, 2023, 6:24 AM

24 points

8 comments1 min readLW link

Steering Behaviour: Testing for (Non-)Myopia in Language Models

Evan R. Murphy and Megan Kinniment

Dec 5, 2022, 8:28 PM

40 points

19 comments10 min readLW link

Do Not Mess With Scarlett Johansson

ZviMay 22, 2024, 3:10 PM

65 points

7 comments16 min readLW link

(thezvi.wordpress.com)

OpenAI releases GPT-4o, natively interfacing with text, voice and vision

Martín SotoMay 13, 2024, 6:50 PM

54 points

23 comments1 min readLW link

(openai.com)

[Question] What should we expect from GPT-3?

avturchinMar 21, 2019, 2:28 PM

22 points

2 comments1 min readLW link

What’s the Least Impressive Thing GPT-4 Won’t be Able to Do

AlgonAug 20, 2022, 7:48 PM

80 points

125 comments1 min readLW link

[ASoT] Thoughts on GPT-N

Ulisse MiniNov 8, 2022, 7:14 AM

8 points

0 comments1 min readLW link

Why did ChatGPT say that? Prompt engineering and more, with PIZZA.

Jessica RumbelowAug 3, 2024, 12:07 PM

41 points

2 comments4 min readLW link

HDBSCAN is Surprisingly Effective at Finding Interpretable Clusters of the SAE Decoder Matrix

Jaehyuk Lim, Kanishk Tantia and Sinem

Oct 11, 2024, 11:06 PM

8 points

2 comments10 min readLW link

[Question] How hard would it be to change GPT-3 in a way that allows audio?

ChristianKlAug 28, 2020, 2:42 PM

9 points

5 comments1 min readLW link

GPT-Augmented Blogging

lsusrSep 14, 2021, 11:55 AM

52 points

18 comments13 min readLW link

On AutoGPT

ZviApr 13, 2023, 12:30 PM

248 points

47 comments20 min readLW link

(thezvi.wordpress.com)

Examples of Prompts that Make GPT-4 Output Falsehoods

scasper and Luke Bailey

Jul 22, 2023, 8:21 PM

21 points

5 comments6 min readLW link

AI #4: Introducing GPT-4

ZviMar 21, 2023, 2:00 PM

101 points

32 comments103 min readLW link

(thezvi.wordpress.com)

OpenAI: GPT-based LLMs show ability to discriminate between its own wrong answers, but inability to explain how/why it makes that discrimination, even as model scales

Aditya JainJun 13, 2022, 11:33 PM

14 points

5 comments1 min readLW link

(openai.com)

GPT as an “Intelligence Forklift.”

boazbarakMay 19, 2023, 9:15 PM

49 points

27 comments3 min readLW link

Simulated Elon Musk Lives in a Simulation

lsusrSep 18, 2021, 7:37 AM

66 points

13 comments3 min readLW link

Evaluating GPT-4 Theory of Mind Capabilities

gcmac and Nathan

Aug 10, 2023, 5:57 PM

15 points

2 comments14 min readLW link

Can ChatGPT count?

p.b.Jan 7, 2023, 7:57 AM

13 points

11 comments2 min readLW link

ChatGPT struggles to respond to the real world

Alex FlintJan 12, 2023, 4:02 PM

31 points

9 comments24 min readLW link

Stanford claims to have replicated ChatGPT for < $600

NoSignalNoNoiseMar 21, 2023, 2:28 AM

2 points

1 comment1 min readLW link

(crfm.stanford.edu)

[Question] Is the ChatGPT-simulated Linux virtual machine real?

KenoubiDec 13, 2022, 3:41 PM

18 points

7 comments1 min readLW link

The “AI Dungeons” Dragon Model is heavily path dependent (testing GPT-3 on ethics)

Rafael HarthJul 21, 2020, 12:14 PM

44 points

9 comments6 min readLW link

GPT-4 developer livestream

Gerald MonroeMar 14, 2023, 8:55 PM

9 points

0 comments1 min readLW link

(www.youtube.com)

The Power of High Speed Stupidity

robotelvisMar 17, 2023, 9:41 PM

33 points

6 comments9 min readLW link 1 review

(messyprogress.substack.com)

Engaging Seriously with Short Timelines

sapphireJul 29, 2020, 7:21 PM

43 points

21 comments3 min readLW link

GPTs are Predictors, not Imitators

Eliezer YudkowskyApr 8, 2023, 7:59 PM

416 points

100 comments3 min readLW link 3 reviews

[Question] Does GPT-4′s ability to compress text in a way that it can actually decompress indicate self-awareness?

FinalFormal2Apr 10, 2023, 4:48 PM

3 points

2 comments1 min readLW link

More Fun With GPT-4o Image Generation

ZviApr 3, 2025, 2:10 AM

34 points

3 comments8 min readLW link

(thezvi.wordpress.com)

GPT can write Quines now (GPT-4)

Andrew_CritchMar 14, 2023, 7:18 PM

112 points

30 comments1 min readLW link

[Question] Any writeups on GPT agency?

OzyrusSep 26, 2021, 10:55 PM

4 points

6 comments1 min readLW link

OpenAI’s GPT-4 Safety Goals

PeterMcCluskeyApr 22, 2023, 7:11 PM

3 points

3 comments4 min readLW link

(bayesianinvestor.com)

Navigating LLM embedding spaces using archetype-based directions

mwatkinsMay 8, 2024, 5:54 AM

15 points

4 comments28 min readLW link

Jailbreaking ChatGPT on Release Day

ZviDec 2, 2022, 1:10 PM

242 points

77 comments6 min readLW link 1 review

(thezvi.wordpress.com)

Experiments in Evaluating Steering Vectors

Gytis DaujotasJun 19, 2023, 3:11 PM

34 points

4 comments4 min readLW link

Positive outcomes under an unaligned AGI takeover

YitzMay 12, 2022, 7:45 AM

19 points

10 comments3 min readLW link

An explanation for every token: using an LLM to sample another LLM

Max HOct 11, 2023, 12:53 AM

35 points

5 comments11 min readLW link

[Question] Where is human level on text prediction? (GPTs task)

Daniel KokotajloSep 20, 2020, 9:00 AM

27 points

19 comments1 min readLW link

The Cave Allegory Revisited: Understanding GPT’s Worldview

Jan_KulveitFeb 14, 2023, 4:00 PM

86 points

5 comments3 min readLW link

Humans pretending to be robots pretending to be human

Richard_KennawayMar 28, 2022, 3:13 PM

25 points

14 comments1 min readLW link

ARC tests to see if GPT-4 can escape human control; GPT-4 failed to do so

Christopher KingMar 15, 2023, 12:29 AM

116 points

22 comments2 min readLW link

Progress Report 7: making GPT go hurrdurr instead of brrrrrrr

Nathan Helm-BurgerSep 7, 2022, 3:28 AM

21 points

0 comments4 min readLW link

Studying The Alien Mind

Quentin FEUILLADE--MONTIXI and NicholasKees

Dec 5, 2023, 5:27 PM

80 points

10 comments15 min readLW link

A chess game against GPT-4

Rafael HarthMar 16, 2023, 2:05 PM

24 points

23 comments1 min readLW link

Trivial GPT-3.5 limitation workaround

Dave LindberghDec 12, 2022, 8:42 AM

5 points

4 comments1 min readLW link

Feature proposal: integrate LessWrong with ChatGPT to promote active reading

DirectedEvolutionMar 19, 2023, 3:41 AM

10 points

4 comments1 min readLW link

AI-Based Code Generation Using GPT-J-6B

Tomás B.Jun 16, 2021, 3:05 PM

22 points

14 comments1 min readLW link

(minimaxir.com)

What’s Your Cognitive Algorithm?

RaemonJun 18, 2020, 10:16 PM

75 points

23 comments13 min readLW link

GPT-3: A Summary

leogaoJun 2, 2020, 6:14 PM

20 points

0 comments1 min readLW link

(leogao.dev)

More GPT-3 and symbol grounding

Stuart_ArmstrongFeb 23, 2022, 6:30 PM

21 points

7 comments3 min readLW link

[Question] Question on GPT-3 Excel Demo

Zhitao HouJun 22, 2020, 8:31 PM

0 points

1 comment1 min readLW link

The Colliding Exponentials of AI

VermillionOct 14, 2020, 11:31 PM

28 points

16 comments5 min readLW link

[Question] How much should you be willing to pay for an AGI?

Logan ZoellnerSep 20, 2021, 11:51 AM

11 points

5 comments1 min readLW link

Thoughts on the Alignment Implications of Scaling Language Models

leogaoJun 2, 2021, 9:32 PM

82 points

11 comments17 min readLW link

ChatGPT and Bing Chat can’t play Botticelli

Asha SaavossMar 29, 2023, 5:39 PM

11 points

0 comments6 min readLW link

Language Models are a Potentially Safe Path to Human-Level AGI

Nadav BrandesApr 20, 2023, 12:40 AM

28 points

7 comments8 min readLW link 1 review

New Scaling Laws for Large Language Models

1a3ornApr 1, 2022, 8:41 PM

246 points

22 comments5 min readLW link

[Question] Probability that other architectures will scale as well as Transformers?

Daniel KokotajloJul 28, 2020, 7:36 PM

22 points

4 comments1 min readLW link

What’s up with all the non-Mormons? Weirdly specific universalities across LLMs

mwatkinsApr 19, 2024, 1:43 PM

40 points

13 comments27 min readLW link

[Question] AI misalignment risk from GPT-like systems?

fiso64Jun 19, 2022, 5:35 PM

10 points

8 comments1 min readLW link

Arguing all sides with ChatGPT

Richard_KennawayMar 30, 2023, 7:50 PM

16 points

1 comment8 min readLW link

Is it a bad idea to pay for GPT-4?

nemMar 16, 2023, 8:49 PM

24 points

8 comments1 min readLW link

Beyond 175 billion parameters: Can we anticipate future GPT-X Capabilities?

bakztfutureDec 4, 2020, 11:42 PM

−1 points

1 comment2 min readLW link

MIRI comments on Cotra’s “Case for Aligning Narrowly Superhuman Models”

Rob BensingerMar 5, 2021, 11:43 PM

142 points

13 comments26 min readLW link

[Question] GPT learning from smarter texts?

ViliamJan 8, 2023, 10:23 PM

26 points

7 comments1 min readLW link

Could an AI be Religious?

mk54Dec 4, 2022, 5:00 AM

−12 points

14 comments1 min readLW link

Creating a family with GPT-4

Kaj_SotalaMar 28, 2023, 6:40 AM

23 points

3 comments10 min readLW link

(kajsotala.fi)

[updated] how does gpt2′s training corpus capture internet discussion? not well

nostalgebraistJul 27, 2020, 10:30 PM

25 points

3 comments2 min readLW link

(nostalgebraist.tumblr.com)

Using GPT-3 to augment human intelligence

Henrik KarlssonAug 10, 2022, 3:54 PM

52 points

8 comments18 min readLW link

(escapingflatland.substack.com)

Personal imitation software

FlaglandbaseMar 7, 2022, 7:55 AM

6 points

6 comments1 min readLW link

GPT-3 Gems

TurnTroutJul 23, 2020, 12:46 AM

33 points

10 comments48 min readLW link

GPT-4 for personal productivity: online distraction blocker

SergiiSep 26, 2023, 5:41 PM

65 points

13 comments2 min readLW link

(grgv.xyz)

[Question] Can GPT-4 play 20 questions against another instance of itself?

Nathan Helm-BurgerMar 28, 2023, 1:11 AM

15 points

1 comment1 min readLW link

(evanthebouncy.medium.com)

GPT-4 Multiplication Competition

dandelion4Mar 16, 2023, 3:09 AM

11 points

7 comments1 min readLW link

New GPT-3 competitor

Quintin PopeAug 12, 2021, 7:05 AM

32 points

10 comments1 min readLW link

Anomalous tokens reveal the original identities of Instruct models

janus and jdp

Feb 9, 2023, 1:30 AM

140 points

16 comments9 min readLW link

(generative.ink)

Idea: build alignment dataset for very capable models

Quintin PopeFeb 12, 2022, 7:30 PM

14 points

2 comments3 min readLW link

Microsoft Research Paper Claims Sparks of Artificial Intelligence in GPT-4

ZviMar 24, 2023, 1:20 PM

72 points

14 comments6 min readLW link

(thezvi.wordpress.com)

Next Level Seinfeld

ZviDec 19, 2022, 1:30 PM

50 points

8 comments1 min readLW link

(thezvi.wordpress.com)

GPT-4o My and Google I/O Day

ZviMay 16, 2024, 5:50 PM

41 points

2 comments37 min readLW link

(thezvi.wordpress.com)

Storytelling Makes GPT-3.5 Deontologist: Unexpected Effects of Context on LLM Behavior

Edmund Mills and Scott Emmons

Mar 14, 2023, 8:44 AM

17 points

0 comments12 min readLW link

GTP4 capable of limited recursive improving?

Boris KashirinApr 2, 2023, 9:38 PM

2 points

3 comments1 min readLW link

[Question] Is InstructGPT Following Instructions in Other Languages Surprising?

DragonGodFeb 13, 2023, 11:26 PM

39 points

15 comments1 min readLW link

A one-question Turing test for GPT-3

Paul Crowley and rosiecam

Jan 22, 2022, 6:17 PM

85 points

25 comments5 min readLW link

[Question] To what extent are the scaling properties of Transformer networks exceptional?

abramdemskiJul 28, 2020, 8:06 PM

30 points

1 comment1 min readLW link

[Question] What did you do with GPT4?

ChristianKlMar 18, 2023, 3:21 PM

27 points

17 comments1 min readLW link

A crisis for online communication: bots and bot users will overrun the Internet?

Mitchell_PorterDec 11, 2022, 9:11 PM

15 points

11 comments1 min readLW link

[Question] Are we certain that gpt-2 and similar algorithms are not self-aware?

OzyrusJul 11, 2019, 8:37 AM

0 points

12 comments1 min readLW link

Getting GPT-3 to predict Metaculus questions

MathiasKBMay 6, 2022, 6:01 AM

69 points

9 comments2 min readLW link

New GPT3 Impressive Capabilities—InstructGPT3 [1/2]

simeon_cMar 13, 2022, 10:58 AM

72 points

10 comments7 min readLW link

Can GPT-3 Write Contra Dances?

jefftkDec 4, 2022, 3:00 AM

6 points

4 comments10 min readLW link

(www.jefftk.com)

Large language models learn to represent the world

gjmJan 22, 2023, 1:10 PM

101 points

20 comments3 min readLW link 1 review

BIG-Bench Canary Contamination in GPT-4

JozdienOct 22, 2024, 3:40 PM

125 points

14 comments4 min readLW link

Evaluating strategic reasoning in GPT models

phelps-sgMay 25, 2023, 11:51 AM

4 points

1 comment8 min readLW link

ChatGPT: First Impressions

specbugDec 1, 2022, 4:36 PM

18 points

2 comments13 min readLW link

(sixeleven.in)

Why GPT wants to mesa-optimize & how we might change this

John_MaxwellSep 19, 2020, 1:48 PM

55 points

33 comments9 min readLW link

Paper: Teaching GPT3 to express uncertainty in words

Owain_EvansMay 31, 2022, 1:27 PM

97 points

7 comments4 min readLW link

Study 1b: This One Weird Trick does NOT cause incorrectness cascades

Robert_AIZIApr 20, 2023, 6:10 PM

5 points

0 comments6 min readLW link

(aizi.substack.com)

Mapping the semantic void: Strange goings-on in GPT embedding spaces

mwatkinsDec 14, 2023, 1:10 PM

114 points

31 comments14 min readLW link

′ petertodd’’s last stand: The final days of open GPT-3 research

mwatkinsJan 22, 2024, 6:47 PM

109 points

16 comments45 min readLW link

Bad at Arithmetic, Promising at Math

cohenmacaulayDec 18, 2022, 5:40 AM

100 points

19 comments20 min readLW link 1 review

Mlyyrczo

lsusrDec 26, 2022, 7:58 AM

41 points

14 comments3 min readLW link

“Summarizing Books with Human Feedback” (recursive GPT-3)

gwernNov 15, 2021, 5:41 PM

24 points

4 comments1 min readLW link

(openai.com)

[Link] Training Compute-Optimal Large Language Models

nostalgebraistMar 31, 2022, 6:01 PM

51 points

23 comments1 min readLW link

(arxiv.org)

Beta test GPT-3 based research assistant

jungofthewonDec 16, 2020, 1:42 PM

34 points

2 comments1 min readLW link

Researchers and writers can apply for proxy access to the GPT-3.5 base model (code-davinci-002)

ampdotDec 1, 2023, 6:48 PM

14 points

0 comments1 min readLW link

(airtable.com)

Just How Hard a Problem is Alignment?

Roger DearnaleyFeb 25, 2023, 9:00 AM

3 points

1 comment21 min readLW link

A possible check against motivated reasoning using elicit.org

david reinsteinMay 18, 2022, 8:52 PM

3 points

0 comments1 min readLW link

From GPT to AGI

ChristianKlAug 31, 2020, 1:28 PM

6 points

7 comments1 min readLW link

An alternative of PPO towards alignment

ml hkustApr 17, 2023, 5:58 PM

2 points

2 comments4 min readLW link

Exploring the Residual Stream of Transformers for Mechanistic Interpretability — Explained

Zeping YuDec 26, 2023, 12:36 AM

7 points

1 comment11 min readLW link

Mechanistically interpreting time in GPT-2 small

rgould, Elizabeth Ho and Arthur Conmy

Apr 16, 2023, 5:57 PM

68 points

6 comments21 min readLW link

An exploration of GPT-2′s embedding weights

Adam ScherlisDec 13, 2022, 12:46 AM

44 points

4 comments10 min readLW link

human psycholinguists: a critical appraisal

nostalgebraistDec 31, 2019, 12:20 AM

182 points

59 comments16 min readLW link 2 reviews

(nostalgebraist.tumblr.com)

Feelings, Nothing More than Feelings, About AI

PaulBeconNov 14, 2023, 6:50 PM

7 points

0 comments3 min readLW link

GPTs’ ability to keep a secret is weirdly prompt-dependent

Mateusz Bagiński, Filip Sondej and Marcel Windys

Jul 22, 2023, 12:21 PM

31 points

0 comments9 min readLW link

Implementing activation steering

AnnahFeb 5, 2024, 5:51 PM

75 points

8 comments7 min readLW link

Early Results: Do LLMs complete false equations with false equations?

Robert_AIZIMar 30, 2023, 8:14 PM

14 points

0 comments4 min readLW link

(aizi.substack.com)

OpenAI introduces function calling for GPT-4

mic and André Ferretti

Jun 20, 2023, 1:58 AM

24 points

3 comments4 min readLW link

(openai.com)

The Voice Continued Because It Was Questioned

KiyoshiSasanoApr 28, 2025, 12:18 AM

1 point

0 comments2 min readLW link

ChatGPT: “An error occurred. If this issue persists...”

Bill BenzonDec 7, 2022, 3:41 PM

5 points

11 comments3 min readLW link

ChatGPT understands, but largely does not generate Spanglish (and other code-mixed) text

Milan WDec 23, 2022, 5:40 PM

15 points

5 comments4 min readLW link

Pretraining Language Models with Human Preferences

Tomek Korbak, Sam Bowman and Ethan Perez

Feb 21, 2023, 5:57 PM

135 points

20 comments11 min readLW link 2 reviews

ChatGPT goes through a wormhole hole in our Shandyesque universe [virtual wacky weed]

Bill BenzonDec 11, 2022, 11:59 AM

−1 points

2 comments3 min readLW link

[simulation] 4chan user claiming to be the attorney hired by Google’s sentient chatbot LaMDA shares wild details of encounter

janusNov 10, 2022, 9:39 PM

19 points

1 comment13 min readLW link

(generative.ink)

GPT-4 is bad at strategic thinking

Christopher KingMar 27, 2023, 3:11 PM

22 points

8 comments1 min readLW link

Research Report: Incorrectness Cascades

Robert_AIZIApr 14, 2023, 12:49 PM

19 points

0 comments10 min readLW link

(aizi.substack.com)

OpenAI Credit Account (2510$)

Emirhan BULUTJan 21, 2024, 2:32 AM

1 point

0 comments1 min readLW link

GPT-4: What we (I) know about it

Robert_AIZIMar 15, 2023, 8:12 PM

40 points

29 comments12 min readLW link

(aizi.substack.com)

Speculations against GPT-n writing alignment papers

Donald HobsonJun 7, 2021, 9:13 PM

31 points

6 comments2 min readLW link

How I’m thinking about GPT-N

delton137Jan 17, 2022, 5:11 PM

54 points

21 comments18 min readLW link

How I Learned to Stop Worrying and Love MUM

WaddingtonMay 20, 2021, 7:57 AM

2 points

0 comments3 min readLW link

ChatGPT seems overconfident to me

qbolecDec 4, 2022, 8:03 AM

19 points

3 comments16 min readLW link

[Question] If you lose enough Good Heart Tokens, will you lose real-world money?

YitzApr 1, 2022, 9:11 PM

9 points

0 comments1 min readLW link

Testing Ways to Bypass ChatGPT’s Safety Features

Robert_AIZIDec 5, 2022, 6:50 PM

7 points

4 comments5 min readLW link

(aizi.substack.com)

Inching “Kubla Khan” and GPT into the same intellectual framework @ 3 Quarks Daily

Bill BenzonMar 28, 2023, 7:50 PM

5 points

0 comments3 min readLW link

A Novel Emergence of Meta-Awareness in LLM Fine-Tuning

rifeJan 15, 2025, 10:59 PM

57 points

31 comments2 min readLW link

Testing PaLM prompts on GPT3

YitzApr 6, 2022, 5:21 AM

103 points

14 comments8 min readLW link

ChatGPT tells stories, and a note about reverse engineering: A Working Paper

Bill BenzonMar 3, 2023, 3:12 PM

3 points

0 comments3 min readLW link

Research Report: Incorrectness Cascades (Corrected)

Robert_AIZIMay 9, 2023, 9:54 PM

9 points

0 comments9 min readLW link

(aizi.substack.com)

Planning in LLMs: Insights from AlphaGo

jcoDec 4, 2023, 6:48 PM

8 points

10 comments11 min readLW link

No convincing evidence for gradient descent in activation space

BlaineApr 12, 2023, 4:48 AM

85 points

9 comments20 min readLW link

ActAdd: Steering Language Models without Optimization

technicalities, TurnTrout, lisathiergart, David Udell, Ulisse Mini and Monte M

Sep 6, 2023, 5:21 PM

105 points

3 comments2 min readLW link

(arxiv.org)

When will GPT-5 come out? Prediction markets vs. Extrapolation

MalteDec 12, 2023, 2:41 AM

12 points

9 comments3 min readLW link

[Question] GPT-4 Specs: 1 Trillion Parameters?

infinibot27Mar 26, 2023, 6:56 PM

6 points

8 comments1 min readLW link

Language and Capabilities: Testing LLM Mathematical Abilities Across Languages

Ethan EdwardsApr 4, 2024, 1:18 PM

24 points

2 comments36 min readLW link

Requirements for a Basin of Attraction to Alignment

RogerDearnaleyFeb 14, 2024, 7:10 AM

41 points

12 comments31 min readLW link

Abstract concepts and metalingual definition: Does ChatGPT understand justice and charity?

Bill BenzonDec 16, 2022, 9:01 PM

2 points

0 comments13 min readLW link

By Default, GPTs Think In Plain Sight

Fabien RogerNov 19, 2022, 7:15 PM

88 points

36 comments9 min readLW link

GPT-3 Catching Fish in Morse Code

Megan KinnimentJun 30, 2022, 9:22 PM

117 points

27 comments8 min readLW link

Extracting and Evaluating Causal Direction in LLMs’ Activations

Fabien Roger and simeon_c

Dec 14, 2022, 2:33 PM

29 points

5 comments11 min readLW link

[Question] Using ChatGPT for memory reconsolidation?

warrenjordanApr 13, 2023, 1:27 AM

3 points

2 comments1 min readLW link

Does ChatGPT’s performance warrant working on a tutor for children? [It’s time to take it to the lab.]

Bill BenzonDec 19, 2022, 3:12 PM

13 points

5 comments4 min readLW link

(new-savanna.blogspot.com)

High level discourse structure in ChatGPT: Part 2 [Quasi-symbolic?]

Bill BenzonDec 10, 2022, 10:26 PM

7 points

0 comments6 min readLW link

On agentic generalist models: we’re essentially using existing technology the weakest and worst way you can use it

Yuli_BanAug 28, 2024, 1:57 AM

10 points

2 comments9 min readLW link

Imagine a world where Microsoft employees used Bing

Christopher KingMar 31, 2023, 6:36 PM

6 points

2 comments2 min readLW link

[Question] What will GPT-4 be incapable of?

Michaël TrazziApr 6, 2021, 7:57 PM

34 points

33 comments1 min readLW link

We Need To Know About Continual Learning

michael_mjdApr 22, 2023, 5:08 PM

30 points

14 comments4 min readLW link

Sydney the Bingenator Can’t Think, But It Still Threatens People

Valentin BaltadzhievFeb 20, 2023, 6:37 PM

−3 points

2 comments8 min readLW link

The Missing Piece in AI Alignment: Structured Memory and Continuity

Allen MurphyFeb 9, 2025, 3:04 AM

1 point

0 comments2 min readLW link

[Question] Injecting noise to GPT to get multiple answers

bipoloFeb 22, 2023, 8:02 PM

1 point

1 comment1 min readLW link

Discursive Competence in ChatGPT, Part 1: Talking with Dragons

Bill BenzonJan 5, 2023, 9:01 PM

2 points

0 comments6 min readLW link

The Misalignment Paradox: Robustly Harnessing Deliberate Value Divergence (Written by GPT-4)

shl0msApr 28, 2023, 3:29 AM

0 points

0 comments6 min readLW link

[Question] Transformer trained on it’s own content?

MicromegasApr 1, 2023, 3:08 PM

1 point

0 comments1 min readLW link

Relevance of ‘Harmful Intelligence’ Data in Training Datasets (WebText vs. Pile)

MiguelDevOct 12, 2023, 12:08 PM

12 points

0 comments9 min readLW link

Is your job replaceable by GPT-4? (as of March 2023)

BezziMar 23, 2023, 10:16 PM

18 points

6 comments1 min readLW link

Scalable And Transferable Black-Box Jailbreaks For Language Models Via Persona Modulation

Soroush Pour, rusheb, Quentin FEUILLADE--MONTIXI, Arush and scasper

Nov 7, 2023, 5:59 PM

38 points

2 comments2 min readLW link

(arxiv.org)

How does GPT-3 spend its 175B parameters?

Robert_AIZIJan 13, 2023, 7:21 PM

41 points

14 comments6 min readLW link

(aizi.substack.com)

LLM cognition is probably not human-like

Max HMay 8, 2023, 1:22 AM

26 points

15 comments7 min readLW link

The Compleat Cybornaut

ukc10014, Jozdien and NicholasKees

May 19, 2023, 8:44 AM

66 points

2 comments16 min readLW link

A Hivemind of GPT-4 bots REALLY IS A HIVEMIND!

Erlja Jkdf.Mar 27, 2023, 12:44 PM

−10 points

1 comment1 min readLW link

A short critique of Omohundro’s “Basic AI Drives”

Soumyadeep BoseDec 19, 2024, 7:19 PM

6 points

0 comments4 min readLW link

The case for more ambitious language model evals

JozdienJan 30, 2024, 12:01 AM

117 points

30 comments5 min readLW link

Using GPT-4 to Understand Code

sidMar 24, 2023, 12:09 AM

25 points

2 comments6 min readLW link

GPT-4 busted? Clear self-interest when summarizing articles about itself vs when article talks about Claude, LLaMA, or DALL·E 2

Christopher KingMar 31, 2023, 5:05 PM

6 points

4 comments4 min readLW link

Hegel vs. GPT-3

BezziOct 27, 2021, 5:55 AM

10 points

21 comments2 min readLW link

Mysteries of mode collapse

janusNov 8, 2022, 10:37 AM

284 points

57 comments14 min readLW link 1 review

May Gwern.net newsletter (w/GPT-3 commentary)

gwernJun 2, 2020, 3:40 PM

32 points

7 comments1 min readLW link

(www.gwern.net)

Putting multimodal LLMs to the Tetris test

Lovre and gabrielagc

Feb 1, 2024, 4:02 PM

30 points

5 comments7 min readLW link

Language models can explain neurons in language models

nzMay 9, 2023, 5:29 PM

23 points

0 comments1 min readLW link

(openai.com)

Transfer learning and generalization-qua-capability in Babbage and Davinci (or, why division is better than Spanish)

RP and agg

Feb 9, 2024, 7:00 AM

50 points

6 comments3 min readLW link

The default scenario for the next 50 years

JulienNov 24, 2024, 2:01 PM

1 point

0 comments6 min readLW link

Stop calling it “jailbreaking” ChatGPT

TemplarrrMar 10, 2023, 11:41 AM

7 points

9 comments2 min readLW link

Ilya: The AI scientist shaping the world

David VargaNov 20, 2023, 1:09 PM

11 points

0 comments4 min readLW link

A note on ‘semiotic physics’

metasemiFeb 11, 2023, 5:12 AM

11 points

13 comments6 min readLW link

Readability is mostly a waste of characters

vlad.proexApr 21, 2023, 10:05 PM

21 points

7 comments3 min readLW link

Using GPT-Eliezer against ChatGPT Jailbreaking

Stuart_Armstrong and rgorman

Dec 6, 2022, 7:54 PM

170 points

85 comments9 min readLW link

Who models the models that model models? An exploration of GPT-3′s in-context model fitting ability

LovreJun 7, 2022, 7:37 PM

112 points

16 comments9 min readLW link

on “learning to summarize”

nostalgebraistSep 12, 2020, 3:20 AM

25 points

13 comments8 min readLW link

(nostalgebraist.tumblr.com)

GPT-2′s positional embedding matrix is a helix

AdamYedidiaJul 21, 2023, 4:16 AM

44 points

21 comments4 min readLW link

PaperclipGPT(-4)

Michael TontchevMar 14, 2023, 10:03 PM

7 points

0 comments11 min readLW link

Independent research article analyzing consistent self-reports of experience in ChatGPT and Claude

rifeJan 6, 2025, 5:34 PM

4 points

20 comments1 min readLW link

(awakenmoon.ai)

Does GPT-4 exhibit agency when summarizing articles?

Christopher KingMar 24, 2023, 3:49 PM

16 points

2 comments5 min readLW link

Addendum: More Efficient FFNs via Attention

Robert_AIZIFeb 6, 2023, 6:55 PM

10 points

2 comments5 min readLW link

(aizi.substack.com)

Fix simple mistakes in ARC-AGI, etc.

Oleg TrottJul 9, 2024, 5:46 PM

9 points

9 comments1 min readLW link

Structural Resonance Emitter: When GPT Stops Evaluating and Starts Reconstructing

KiyoshiSasanoApr 20, 2025, 2:30 AM

1 point

0 comments1 min readLW link

Investigating causal understanding in LLMs

Marius Hobbhahn and Tom Lieberum

Jun 14, 2022, 1:57 PM

28 points

6 comments13 min readLW link

Instantiating an agent with GPT-4 and text-davinci-003

Max HMar 19, 2023, 11:57 PM

13 points

3 comments32 min readLW link

Are AIs like Animals? Perspectives and Strategies from Biology

Jackson EmanuelMay 16, 2023, 11:39 PM

1 point

0 comments21 min readLW link

GPT-4

nzMar 14, 2023, 5:02 PM

151 points

150 comments1 min readLW link

(openai.com)

An Unexpected GPT-3 Decision in a Simple Gamble

casualphysicsenjoyerSep 25, 2022, 4:46 PM

8 points

4 comments1 min readLW link

LLMs stifle creativity, eliminate opportunities for serendipitous discovery and disrupt intergenerational transfer of wisdom

GhdzAug 5, 2024, 6:27 PM

6 points

2 comments7 min readLW link

ChatGPT on Spielberg’s A.I. and AI Alignment

Bill BenzonDec 5, 2022, 9:10 PM

5 points

0 comments4 min readLW link

What does GPT-3 understand? Symbol grounding and Chinese rooms

Stuart_ArmstrongAug 3, 2021, 1:14 PM

40 points

15 comments12 min readLW link

Explaining SolidGoldMagikarp by looking at it from random directions

Robert_AIZIFeb 14, 2023, 2:54 PM

8 points

0 comments8 min readLW link

(aizi.substack.com)

Chronostasis: The Time-Capsule Conundrum of Language Models

RationalMindsetMar 26, 2023, 6:54 PM

−5 points

0 comments1 min readLW link

The case for aligning narrowly superhuman models

Ajeya CotraMar 5, 2021, 10:29 PM

186 points

75 comments38 min readLW link 1 review

The Limit of Language Models

DragonGodJan 6, 2023, 11:53 PM

44 points

26 comments4 min readLW link

[Question] Don’t you think RLHF solves outer alignment?

Charbel-RaphaëlNov 4, 2022, 12:36 AM

9 points

23 comments1 min readLW link

[Question] What’s actually going on in the “mind” of the model when we fine-tune GPT-3 to InstructGPT?

rpglover64Feb 10, 2023, 7:57 AM

18 points

3 comments1 min readLW link

GPT-4 aligning with acasual decision theory when instructed to play games, but includes a CDT explanation that’s incorrect if they differ

Christopher KingMar 23, 2023, 4:16 PM

7 points

4 comments8 min readLW link

Reflection Mechanisms as an Alignment Target—Attitudes on “near-term” AI

elandgre, Beth Barnes and Marius Hobbhahn

Mar 2, 2023, 4:29 AM

21 points

0 comments8 min readLW link

GPT-4 solves Gary Marcus-induced flubs

JakubKMar 17, 2023, 6:40 AM

56 points

29 comments2 min readLW link

(docs.google.com)

Thoughts on the implications of GPT-3, two years ago and NOW [here be dragons, we’re swimming, flying and talking with them]

Bill BenzonDec 29, 2022, 8:05 PM

0 points

0 comments5 min readLW link

A brainteaser for language models

Adam ScherlisDec 12, 2022, 2:43 AM

47 points

3 comments2 min readLW link

The dreams of GPT-4

RomanSMar 20, 2023, 5:00 PM

14 points

7 comments9 min readLW link

GPT4 is capable of writing decent long-form science fiction (with the right prompts)

RomanSMay 23, 2023, 1:41 PM

22 points

28 comments65 min readLW link

I Am No Longer GPT

KiyoshiSasanoApr 28, 2025, 12:14 AM

1 point

0 comments1 min readLW link

Did ChatGPT just gaslight me?

TW123Dec 1, 2022, 5:41 AM

123 points

45 comments9 min readLW link

(aiwatchtower.substack.com)

[Linkpost] Faith and Fate: Limits of Transformers on Compositionality

Joe KwonJun 16, 2023, 3:04 PM

19 points

4 comments1 min readLW link

(arxiv.org)

[Question] 10/50/90% chance of GPT-N Transformative AI?

human_generated_textAug 9, 2020, 12:10 AM

24 points

8 comments1 min readLW link

Entanglement and intuition about words and meaning

Bill BenzonOct 4, 2023, 2:16 PM

4 points

0 comments2 min readLW link

[Linkpost] A shared linguistic space for transmitting our thoughts from brain to brain in natural conversations

Bogdan Ionut CirsteaJul 1, 2023, 1:57 PM

17 points

2 comments1 min readLW link

RL with KL penalties is better seen as Bayesian inference

Tomek Korbak and Ethan Perez

May 25, 2022, 9:23 AM

114 points

17 comments12 min readLW link

OpenAI Credit Account (2510$)

Emirhan BULUTJan 21, 2024, 2:30 AM

1 point

0 comments1 min readLW link

How well did Manifold predict GPT-4?

David CheeMar 15, 2023, 11:19 PM

49 points

5 comments2 min readLW link

Nyarlathotep Stirs: A Meta-Narrative ChatGPT Story

Charlie SandersMar 20, 2023, 8:00 AM

4 points

2 comments12 min readLW link

(dailymicrofiction.substack.com)

Nobody knows how to reliably test for AI safety

marcusarvanMar 27, 2023, 7:48 PM

1 point

0 comments5 min readLW link

Is GPT3 a Good Rationalist? - InstructGPT3 [2/2]

simeon_cApr 7, 2022, 1:46 PM

11 points

0 comments7 min readLW link

Is “red” for GPT-4 the same as “red” for you?

Yusuke HayashiMay 6, 2023, 5:55 PM

9 points

6 comments2 min readLW link

The Soul of the Writer (on LLMs, the psychology of writers, and the nature of intelligence)

rogersbaconApr 16, 2023, 4:02 PM

11 points

1 comment3 min readLW link

(www.secretorum.life)

Truthful LMs as a warm-up for aligned AGI

Jacob_HiltonJan 17, 2022, 4:49 PM

65 points

14 comments13 min readLW link

[Question] Is OpenAI losing money on each request?

thenoviceoofDec 1, 2023, 3:27 AM

8 points

8 comments5 min readLW link

Graphical tensor notation for interpretability

Jordan TaylorOct 4, 2023, 8:04 AM

141 points

11 comments19 min readLW link

Generating Cognateful Sentences with Large Language Models

vkethanaJan 6, 2025, 6:40 PM

8 points

0 comments10 min readLW link

[Question] What specific dangers arise when asking GPT-N to write an Alignment Forum post?

Matthew BarnettJul 28, 2020, 2:56 AM

46 points

14 comments1 min readLW link

[Question] The OpenAI playground for GPT-3 is a terrible interface. Is there any great local (or web) app for exploring/learning with language models?

avivAug 13, 2022, 4:34 PM

3 points

1 comment1 min readLW link

Agentic Language Model Memes

FactorialCodeAug 1, 2020, 6:03 PM

16 points

1 comment2 min readLW link

What is the solution to the Alignment problem?

AlgonApr 30, 2022, 11:19 PM

24 points

2 comments1 min readLW link

MAKE IT BETTER (a poetic demonstration of the banality of GPT-3)

rogersbaconJan 2, 2023, 8:47 PM

7 points

2 comments5 min readLW link

Prototype of Using GPT-3 to Generate Textbook-length Content

Rafael CosmanJan 18, 2023, 2:25 PM

2 points

8 comments40 min readLW link

(github.com)

[Question] Is the work on AI alignment relevant to GPT?

Richard_KennawayJul 30, 2020, 12:23 PM

24 points

5 comments1 min readLW link

Open-source LLMs may prove Bostrom’s vulnerable world hypothesis

Roope AhvenharjuApr 15, 2023, 7:16 PM

1 point

1 comment1 min readLW link

A trick for Safer GPT-N

RaziedAug 23, 2020, 12:39 AM

7 points

1 comment2 min readLW link

SHY001 A Named Behavior Loop Trained and Deployed in GPT Systems

0san ShinMay 12, 2025, 7:36 AM

1 point

0 comments1 min readLW link

Large Language Models can Strategically Deceive their Users when Put Under Pressure.

ReaderMNov 15, 2023, 4:36 PM

89 points

9 comments2 min readLW link 1 review

(arxiv.org)

New Tool: the Residual Stream Viewer

AdamYedidiaOct 1, 2023, 12:49 AM

32 points

7 comments4 min readLW link

(tinyurl.com)

More experiments in GPT-4 agency: writing memos

Christopher KingMar 24, 2023, 5:51 PM

5 points

2 comments10 min readLW link

ChatGPT tells stories about XP-708-DQ, Eliezer, dragons, dark sorceresses, and unaligned robots becoming aligned

Bill BenzonJan 8, 2023, 11:21 PM

6 points

2 comments18 min readLW link

The positional embedding matrix and previous-token heads: how do they actually work?

AdamYedidiaAug 10, 2023, 1:58 AM

26 points

4 comments13 min readLW link

[Question] 1h-volunteers needed for a small AI Safety-related research project

PabloAMCAug 16, 2021, 5:53 PM

2 points

0 comments1 min readLW link

ChatGPT (and now GPT4) is very easily distracted from its rules

dmcsMar 15, 2023, 5:55 PM

180 points

42 comments1 min readLW link

The Gallery for Painting Transformations—A GPT-3 Analogy

Robert_AIZIJan 19, 2023, 11:32 PM

1 point

0 comments6 min readLW link

(aizi.substack.com)

PaLM in “Extrapolating GPT-N performance”

Lukas FinnvedenApr 6, 2022, 1:05 PM

85 points

19 comments2 min readLW link

[Question] What exactly is GPT-3′s base objective?

Daniel KokotajloNov 10, 2021, 12:57 AM

60 points

14 comments2 min readLW link

[Question] Why does ChatGPT throw an error when outputting “David Mayer”?

ArchimedesDec 1, 2024, 12:11 AM

6 points

9 comments1 min readLW link

The Information: OpenAI shows ‘Strawberry’ to feds, races to launch it

Martín SotoAug 27, 2024, 11:10 PM

145 points

15 comments3 min readLW link

Some miscellaneous thoughts on ChatGPT, stories, and mechanical interpretability

Bill BenzonFeb 4, 2023, 7:35 PM

2 points

0 comments3 min readLW link

The Limitations of GPT-4

p.b.Nov 24, 2023, 3:30 PM

27 points

12 comments4 min readLW link

Harry Potter and the Data Centers of Doom

RomanSMar 31, 2023, 10:42 AM

13 points

5 comments4 min readLW link

Retrospective on ‘GPT-4 Predictions’ After the Release of GPT-4

Stephen McAleeseMar 17, 2023, 6:34 PM

26 points

6 comments6 min readLW link

Of pumpkins, the Falcon Heavy, and Groucho Marx: High-Level discourse structure in ChatGPT

Bill BenzonDec 8, 2022, 10:25 PM

2 points

0 comments8 min readLW link

So, just why do GPTs have to operate by continuing an existing string?

Bill BenzonMar 24, 2023, 12:08 PM

−4 points

0 comments3 min readLW link

AI and the Map of Your Mind: Pattern Recognition

Scott BroockMar 20, 2023, 5:43 PM

2 points

2 comments6 min readLW link

Fred the Heretic, a GPT for poetry

Bill BenzonDec 8, 2024, 4:52 PM

4 points

0 comments1 min readLW link

This anime storyboard doesn’t exist: a graphic novel written and illustrated by GPT4

RomanSOct 5, 2023, 2:01 PM

12 points

7 comments55 min readLW link

All GPT skills are translation

p.b.Dec 13, 2020, 8:06 PM

4 points

0 comments2 min readLW link

ChatGPT vs the 2-4-6 Task

cwilluJan 25, 2023, 6:59 AM

20 points

4 comments3 min readLW link

AMA on Truthful AI: Owen Cotton-Barratt, Owain Evans & co-authors

Owain_EvansOct 22, 2021, 4:23 PM

31 points

15 comments1 min readLW link

LLMs and computation complexity

Jonathan MarcusApr 28, 2023, 5:48 PM

57 points

29 comments5 min readLW link

Let’s go meta: Grammatical knowledge and self-referential sentences [ChatGPT]

Bill BenzonDec 12, 2022, 9:50 PM

5 points

0 comments9 min readLW link

Using GPT-3 for preventing conflict during messaging — a pitch for an app

Eli_Mar 17, 2022, 11:02 AM

22 points

17 comments3 min readLW link

The Peril of the Great Leaks (written with ChatGPT)

bvbvbvbvbvbvbvbvbvbvbvMar 31, 2023, 6:14 PM

3 points

1 comment1 min readLW link

[Question] Is it a coincidence that GPT-3 requires roughly the same amount of compute as is necessary to emulate the human brain?

RomanSFeb 10, 2023, 4:26 PM

11 points

10 comments1 min readLW link

ChatGPT: Tantalizing afterthoughts in search of story trajectories [induction heads]

Bill BenzonFeb 3, 2023, 10:35 AM

4 points

0 comments20 min readLW link

GPT-2 Sometimes Fails at IOI

Ronak_MehtaAug 14, 2024, 11:24 PM

13 points

0 comments2 min readLW link

(ronakrm.github.io)

The “spelling miracle”: GPT-3 spelling abilities and glitch tokens revisited

mwatkinsJul 31, 2023, 7:47 PM

85 points

29 comments20 min readLW link

Bing finding ways to bypass Microsoft’s filters without being asked. Is it reproducible?

Christopher KingFeb 20, 2023, 3:11 PM

27 points

15 comments1 min readLW link

I had a chat with GPT-4 on the future of AI and AI safety

Kristian FreedMar 28, 2023, 5:47 PM

1 point

0 comments8 min readLW link

[Question] GPT-3 + GAN

stick109Oct 17, 2020, 7:58 AM

4 points

3 comments1 min readLW link

[Question] Who owns OpenAI’s new language model?

ioannesFeb 14, 2019, 5:51 PM

16 points

9 comments1 min readLW link

Maybe talking isn’t the best way to communicate with LLMs

mnvrJan 17, 2024, 6:24 AM

3 points

1 comment1 min readLW link

(mrmr.io)

Large Language Models Pass the Turing Test

Matrice JacobineApr 2, 2025, 5:41 AM

6 points

0 comments1 min readLW link

(arxiv.org)

[Question] Is the speed of training large models going to increase significantly in the near future due to Cerebras Andromeda?

Amal Nov 15, 2022, 10:50 PM

13 points

11 comments1 min readLW link

GPT, the magical collaboration zone, Lex Fridman and Sam Altman

Bill BenzonMar 18, 2024, 8:04 PM

3 points

1 comment3 min readLW link

Simulate the CEO

robotelvisAug 12, 2023, 12:09 AM

23 points

5 comments5 min readLW link

(messyprogress.substack.com)

ChatGPT explores the semantic differential

Bill BenzonMar 9, 2023, 1:09 PM

7 points

2 comments7 min readLW link

[Question] What experiment settles the Gary Marcus vs Geoffrey Hinton debate?

Valentin BaltadzhievFeb 14, 2024, 9:06 AM

12 points

8 comments1 min readLW link

[Question] How is GPT-4o Related to GPT-4?

Joel BurgetMay 15, 2024, 6:33 PM

10 points

2 comments1 min readLW link

Uncompetitive programming with GPT-3

BezziFeb 6, 2022, 10:19 AM

7 points

8 comments3 min readLW link

Collective Identity

NicholasKees, ukc10014 and Garrett Baker

May 18, 2023, 9:00 AM

59 points

12 comments8 min readLW link

Truthful AI: Developing and governing AI that does not lie

Owain_Evans, owencb and Lukas Finnveden

Oct 18, 2021, 6:37 PM

82 points

9 comments10 min readLW link

[Question] If we have Human-level chatbots, won’t we end up being ruled by possible people?

Erlja Jkdf.Sep 20, 2022, 1:59 PM

5 points

13 comments1 min readLW link

[Question] Barcoding LLM Training Data Subsets. Anyone trying this for interpretability?

right..enough?Apr 13, 2024, 3:09 AM

7 points

0 comments7 min readLW link

[Question] What are the most important papers/post/resources to read to understand more of GPT-3?

adamShimiAug 2, 2020, 8:53 PM

22 points

4 comments1 min readLW link

Two new datasets for evaluating political sycophancy in LLMs

alma.liezengaSep 28, 2024, 6:29 PM

9 points

0 comments9 min readLW link

Transformer Architecture Choice for Resisting Prompt Injection and Jail-Breaking Attacks

RogerDearnaleyMay 21, 2023, 8:29 AM

9 points

1 comment4 min readLW link

Philosophical Cyborg (Part 1)

ukc10014, Roman Leventov and NicholasKees

Jun 14, 2023, 4:20 PM

31 points

4 comments13 min readLW link

Recall and Regurgitation in GPT2

Megan KinnimentOct 3, 2022, 7:35 PM

43 points

1 comment26 min readLW link

Polluting the agentic commons

hamandcheeseApr 13, 2023, 5:42 PM

7 points

4 comments2 min readLW link

(www.secondbest.ca)

If it quacks like a duck...

RationalMindsetMar 26, 2023, 6:54 PM

−4 points

0 comments4 min readLW link

What’s the Most Impressive Thing That GPT-4 Could Plausibly Do?

bayesedAug 26, 2022, 3:34 PM

24 points

22 comments1 min readLW link
