There are (probably) no superhuman Go AIs: strong human players beat the strongest AIs
Taran
Feb 19, 2023, 12:25 PM
125
points
34
comments
4
min read
LW
link
Navigating public AI x-risk hype while pursuing technical solutions
Dan Braun
Feb 19, 2023, 12:22 PM
18
points
0
comments
2
min read
LW
link
Somewhat against “just update all the way”
tailcalled
Feb 19, 2023, 10:49 AM
31
points
10
comments
2
min read
LW
link
Human beats SOTA Go AI by learning an adversarial policy
Vanessa Kosoy
Feb 19, 2023, 9:38 AM
59
points
32
comments
1
min read
LW
link
(goattack.far.ai)
Degamification
Nate Showell
Feb 19, 2023, 5:35 AM
23
points
2
comments
2
min read
LW
link
Stop posting prompt injections on Twitter and calling it “misalignment”
lc
Feb 19, 2023, 2:21 AM
144
points
9
comments
1
min read
LW
link
AGI in sight: our look at the game board
Andrea_Miotti
and
Gabriel Alfour
Feb 18, 2023, 10:17 PM
227
points
135
comments
6
min read
LW
link
(andreamiotti.substack.com)
We should be signal-boosting anti Bing chat content
mbrooks
Feb 18, 2023, 6:52 PM
−4
points
13
comments
2
min read
LW
link
Can talk, can think, can suffer.
Ilio
Feb 18, 2023, 6:43 PM
1
point
8
comments
3
min read
LW
link
Parametrically retargetable decision-makers tend to seek power
TurnTrout
Feb 18, 2023, 6:41 PM
172
points
10
comments
2
min read
LW
link
(arxiv.org)
Near-Term Risks of an Obedient Artificial Intelligence
ymeskhout
Feb 18, 2023, 6:30 PM
20
points
1
comment
6
min read
LW
link
EIS VII: A Challenge for Mechanists
scasper
Feb 18, 2023, 6:27 PM
36
points
4
comments
3
min read
LW
link
Reading Speed Exists!
Johannes C. Mayer
Feb 18, 2023, 3:30 PM
12
points
9
comments
1
min read
LW
link
The Practitioner’s Path 2.0: the Meditative Archetype
Evenflair
Feb 18, 2023, 3:23 PM
14
points
1
comment
2
min read
LW
link
(guildoftherose.org)
Should we cry “wolf”?
Tapatakt
Feb 18, 2023, 11:24 AM
24
points
5
comments
1
min read
LW
link
[Question]
Name of the fallacy of assuming an extreme value (e.g. 0) with the illusion of ‘avoiding to have to make an assumption’?
FlorianH
Feb 18, 2023, 8:11 AM
4
points
1
comment
1
min read
LW
link
I Think We’re Approaching The Bitter Lesson’s Asymptote
SomeoneYouOnceKnew
Feb 18, 2023, 5:33 AM
−3
points
9
comments
5
min read
LW
link
Bus-Only Bus Lane Enforcement
jefftk
Feb 18, 2023, 2:50 AM
19
points
15
comments
1
min read
LW
link
(www.jefftk.com)
Run Head on Towards the Falling Tears
Johannes C. Mayer
Feb 18, 2023, 1:33 AM
6
points
0
comments
2
min read
LW
link
Two problems with ‘Simulators’ as a frame
ryan_greenblatt
Feb 17, 2023, 11:34 PM
79
points
13
comments
5
min read
LW
link
GPT-4 Predictions
Stephen McAleese
Feb 17, 2023, 11:20 PM
110
points
27
comments
11
min read
LW
link
On Board Vision, Hollow Words, and the End of the World
Marcello
Feb 17, 2023, 11:18 PM
52
points
27
comments
5
min read
LW
link
PICT: A Zero-Shot Prompt Template to Automate Evaluation
Quentin FEUILLADE--MONTIXI
Feb 17, 2023, 11:16 PM
17
points
1
comment
11
min read
LW
link
Hunch seeds: Info bio
the gears to ascension
Feb 17, 2023, 9:25 PM
12
points
0
comments
9
min read
LW
link
Why Do We Believe
Screwtape
Feb 17, 2023, 8:58 PM
9
points
3
comments
3
min read
LW
link
I Am Scared of Posting Negative Takes About Bing’s AI
Yitz
Feb 17, 2023, 8:50 PM
63
points
28
comments
1
min read
LW
link
EIS VI: Critiques of Mechanistic Interpretability Work in AI Safety
scasper
Feb 17, 2023, 8:48 PM
49
points
9
comments
12
min read
LW
link
Tinker Bell Theory and LLMs
Fergus Fettes
Feb 17, 2023, 8:23 PM
1
point
11
comments
1
min read
LW
link
Recommendation: Bug Bounties and Responsible Disclosure for Advanced ML Systems
Vaniver
Feb 17, 2023, 8:11 PM
125
points
12
comments
2
min read
LW
link
Microsoft and OpenAI, stop telling chatbots to roleplay as AI
hold_my_fish
Feb 17, 2023, 7:55 PM
50
points
10
comments
1
min read
LW
link
A warm-up for the AI governance project
jacek
Feb 17, 2023, 6:06 PM
10
points
2
comments
3
min read
LW
link
Link Post > Blog Post
party girl
Feb 17, 2023, 5:59 PM
4
points
6
comments
1
min read
LW
link
(onthespectrumontheguestlist.substack.com)
One-layer transformers aren’t equivalent to a set of skip-trigrams
Buck
Feb 17, 2023, 5:26 PM
127
points
11
comments
7
min read
LW
link
[Question]
Should we be kind and polite to emerging AIs?
David Gross
Feb 17, 2023, 4:58 PM
9
points
13
comments
1
min read
LW
link
Follow-up Posting on Cyborg Psychologist
Hopkins Stanley
Feb 17, 2023, 4:56 PM
0
points
2
comments
1
min read
LW
link
(www.lesswrong.com)
A “slow takeoff” might still look fast
MichaelDickens
Feb 17, 2023, 4:51 PM
5
points
3
comments
1
min read
LW
link
AI Safety Info Distillation Fellowship
Robert Miles
and
mwatkins
Feb 17, 2023, 4:16 PM
47
points
3
comments
3
min read
LW
link
Nozick’s Dilemma: A Critique of Game Theory
Edward P. Könings
Feb 17, 2023, 4:11 PM
10
points
1
comment
13
min read
LW
link
[Question]
Are LLMs sufficient for AI takeoff?
rpglover64
Feb 17, 2023, 3:46 PM
8
points
2
comments
1
min read
LW
link
Sydney’s Secret: A Short Story by Bing Chat
fela
Feb 17, 2023, 1:31 PM
36
points
1
comment
5
min read
LW
link
Automating Consistency
Hoagy
Feb 17, 2023, 1:24 PM
10
points
0
comments
1
min read
LW
link
Human decision processes are not well factored
remember
and
Gabriel Alfour
Feb 17, 2023, 1:11 PM
33
points
3
comments
2
min read
LW
link
2023 ACX Predictions: Buy/Sell/Hold
Zvi
Feb 17, 2023, 1:10 PM
25
points
3
comments
20
min read
LW
link
(thezvi.wordpress.com)
Bing chat is the AI fire alarm
Ratios
Feb 17, 2023, 6:51 AM
115
points
63
comments
3
min read
LW
link
Seeing more whole
Joe Carlsmith
Feb 17, 2023, 5:12 AM
31
points
1
comment
26
min read
LW
link
Powerful mesa-optimisation is already here
Roman Leventov
Feb 17, 2023, 4:59 AM
35
points
1
comment
2
min read
LW
link
(arxiv.org)
Self-Reference Breaks the Orthogonality Thesis
lsusr
Feb 17, 2023, 4:11 AM
43
points
35
comments
2
min read
LW
link
The public supports regulating AI for safety
Zach Stein-Perlman
Feb 17, 2023, 4:10 AM
114
points
9
comments
1
min read
LW
link
(aiimpacts.org)
Bring “Ban faster SIMD semiconductors” into the Overton window
worried-techno-optimist
Feb 17, 2023, 3:27 AM
−7
points
1
comment
2
min read
LW
link
Republishing an old essay in light of current news on Bing’s AI: “Regarding Blake Lemoine’s claim that LaMDA is ‘sentient’, he might be right (sorta), but perhaps not for the reasons he thinks”
philosophybear
Feb 17, 2023, 3:27 AM
3
points
0
comments
5
min read
LW
link
(philosophybear.substack.com)