LessWrong archive: posts from March 12–14, 2023
Shutting Down the Lightcone Offices · habryka and Ben Pace · Mar 14, 2023, 10:47 PM · 338 points · 103 comments · 17 min read · LW link · 2 reviews
[Question] What are some ideas that LessWrong has reinvented? · RomanHauksson · Mar 14, 2023, 10:27 PM · 4 points · 13 comments · 1 min read · LW link
Human preferences as RL critic values—implications for alignment · Seth Herd · Mar 14, 2023, 10:10 PM · 26 points · 6 comments · 6 min read · LW link
PaperclipGPT(-4) · Michael Tontchev · Mar 14, 2023, 10:03 PM · 7 points · 0 comments · 11 min read · LW link
GPT-4 developer livestream · Gerald Monroe · Mar 14, 2023, 8:55 PM · 9 points · 0 comments · 1 min read · LW link (www.youtube.com)
[Question] Main actors in the AI race · Marta · Mar 14, 2023, 8:50 PM · 3 points · 1 comment · 1 min read · LW link
Success without dignity: a nearcasting story of avoiding catastrophe by luck · HoldenKarnofsky · Mar 14, 2023, 7:23 PM · 76 points · 17 comments · 15 min read · LW link
GPT can write Quines now (GPT-4) · Andrew_Critch · Mar 14, 2023, 7:18 PM · 112 points · 30 comments · 1 min read · LW link
Vector semantics and the (in-context) construction of meaning in Coleridge’s “Kubla Khan” · Bill Benzon · Mar 14, 2023, 7:16 PM · 4 points · 0 comments · 7 min read · LW link
A better analogy and example for teaching AI takeover: the ML Inferno · Christopher King · Mar 14, 2023, 7:14 PM · 18 points · 0 comments · 5 min read · LW link
PaLM API & MakerSuite · Gabe M · Mar 14, 2023, 7:08 PM · 20 points · 1 comment · 1 min read · LW link (developers.googleblog.com)
What is a definition, how can it be extrapolated? · Stuart_Armstrong · Mar 14, 2023, 6:08 PM · 34 points · 5 comments · 7 min read · LW link
Cambridge LW: Rationality Practice: The Map is Not the Territory · Darmani · Mar 14, 2023, 5:56 PM · 6 points · 0 comments · 1 min read · LW link
[Question] Beneficial initial conditions for AGI · mikbp · Mar 14, 2023, 5:41 PM · 1 point · 3 comments · 1 min read · LW link
[Question] “The elephant in the room: the biggest risk of artificial intelligence may not be what we think” What to say about that? · Obladi Oblada · Mar 14, 2023, 5:37 PM · −5 points · 0 comments · 3 min read · LW link
GPT-4 · nz · Mar 14, 2023, 5:02 PM · 151 points · 150 comments · 1 min read · LW link (openai.com)
Storytelling Makes GPT-3.5 Deontologist: Unexpected Effects of Context on LLM Behavior · Edmund Mills and Scott Emmons · Mar 14, 2023, 8:44 AM · 17 points · 0 comments · 12 min read · LW link
Forecasting Authoritarian and Sovereign Power uses of Large Language Models · K. Liam Smith · Mar 14, 2023, 8:44 AM · 7 points · 0 comments · 8 min read · LW link (taboo.substack.com)
Fixed points in mortal population games · ViktoriaMalyasova · Mar 14, 2023, 7:10 AM · 31 points · 0 comments · 12 min read · LW link (www.lesswrong.com)
To determine alignment difficulty, we need to know the absolute difficulty of alignment generalization · Jeffrey Ladish · Mar 14, 2023, 3:52 AM · 12 points · 3 comments · 2 min read · LW link
EA & LW Forum Weekly Summary (6th − 12th March 2023) · Zoe Williams · Mar 14, 2023, 3:01 AM · 7 points · 0 comments · LW link
Alpaca: A Strong Open-Source Instruction-Following Model · sanxiyn · Mar 14, 2023, 2:41 AM · 26 points · 2 comments · 1 min read · LW link (crfm.stanford.edu)
Discussion with Nate Soares on a key alignment difficulty · HoldenKarnofsky · Mar 13, 2023, 9:20 PM · 267 points · 43 comments · 22 min read · LW link · 1 review
What Discovering Latent Knowledge Did and Did Not Find · Fabien Roger · Mar 13, 2023, 7:29 PM · 166 points · 17 comments · 11 min read · LW link
South Bay ACX/LW Meetup · IS · Mar 13, 2023, 6:25 PM · 2 points · 0 comments · 1 min read · LW link
Could Roko’s basilisk acausally bargain with a paperclip maximizer? · Christopher King · Mar 13, 2023, 6:21 PM · 1 point · 8 comments · 1 min read · LW link
Bayesian optimization to find molecules that bind to proteins · rotatingpaguro · Mar 13, 2023, 6:17 PM · 1 point · 0 comments · 1 min read · LW link (www.youtube.com)
Linkpost: ‘Dissolving’ AI Risk – Parameter Uncertainty in AI Future Forecasting · DavidW · Mar 13, 2023, 4:52 PM · 6 points · 0 comments · 1 min read · LW link (forum.effectivealtruism.org)
Decentralized Exclusion · jefftk · Mar 13, 2023, 3:50 PM · 26 points · 19 comments · 2 min read · LW link (www.jefftk.com)
Linkpost: A Contra AI FOOM Reading List · DavidW · Mar 13, 2023, 2:45 PM · 25 points · 4 comments · 1 min read · LW link (magnusvinding.com)
Linkpost: A tale of 2.5 orthogonality theses · DavidW · Mar 13, 2023, 2:19 PM · 9 points · 3 comments · 1 min read · LW link (forum.effectivealtruism.org)
Plan for mediocre alignment of brain-like [model-based RL] AGI · Steven Byrnes · Mar 13, 2023, 2:11 PM · 68 points · 25 comments · 12 min read · LW link
Against AGI Timelines · Jonathan Yan · Mar 13, 2023, 1:33 PM · 13 points · 3 comments · 1 min read · LW link (benlandautaylor.com)
What is calibration? · AlexMennen · Mar 13, 2023, 6:30 AM · 27 points · 1 comment · 4 min read · LW link
On taking AI risk seriously · Eleni Angelou · Mar 13, 2023, 5:50 AM · 6 points · 0 comments · 1 min read · LW link (www.nytimes.com)
Nose / throat treatments for respiratory infections · juliawise · Mar 13, 2023, 2:41 AM · 47 points · 6 comments · 8 min read · LW link
Gold, Silver, Red: A color scheme for understanding people · Michael Soareverix · Mar 13, 2023, 1:06 AM · 17 points · 2 comments · 4 min read · LW link
Yudkowsky on AGI risk on the Bankless podcast · Rob Bensinger · Mar 13, 2023, 12:42 AM · 83 points · 5 comments · LW link
Thoughts on self-inspecting neural networks · Deruwyn · Mar 12, 2023, 11:58 PM · 4 points · 2 comments · 5 min read · LW link
An AI risk argument that resonates with NYTimes readers · Julian Bradshaw · Mar 12, 2023, 11:09 PM · 212 points · 14 comments · 1 min read · LW link
Musicians and Mouths · jefftk · Mar 12, 2023, 10:50 PM · 13 points · 7 comments · 2 min read · LW link (www.jefftk.com)
Are there cognitive realms? · TsviBT · Mar 12, 2023, 7:28 PM · 34 points · 3 comments · 10 min read · LW link · 1 review
[Question] What happened on the Extropians message board? · politicalpersuasion · Mar 12, 2023, 7:22 PM · −53 points · 1 comment · 1 min read · LW link
Creating a Discord server for Mechanistic Interpretability Projects · Victor Levoso · Mar 12, 2023, 6:00 PM · 30 points · 6 comments · 2 min read · LW link
Paper Replication Walkthrough: Reverse-Engineering Modular Addition · Neel Nanda · Mar 12, 2023, 1:25 PM · 18 points · 0 comments · 1 min read · LW link (neelnanda.io)
What problems do African-Americans face? An initial investigation using Standpoint Epistemology and Surveys · tailcalled · Mar 12, 2023, 11:42 AM · 34 points · 26 comments · 15 min read · LW link
“Liquidity” vs “solvency” in bank runs (and some notes on Silicon Valley Bank) · rossry · Mar 12, 2023, 9:16 AM · 108 points · 27 comments · 12 min read · LW link
“You’ll Never Persuade People Like That” · Zack_M_Davis · Mar 12, 2023, 5:38 AM · 18 points · 31 comments · 2 min read · LW link
Parasitic Language Games: maintaining ambiguity to hide conflict while burning the commons · Hazard · Mar 12, 2023, 5:25 AM · 115 points · 17 comments · 13 min read · LW link
[Question] Is there a way to sort LW search results by date posted? · zeshen · Mar 12, 2023, 4:56 AM · 5 points · 1 comment · 1 min read · LW link