Update on the UK AI Summit and the UK’s Plans
Elliot Mckernon
Nov 10, 2023, 2:47 PM
11
points
0
comments
8
min read
LW
link
Liv Boeree Ted Talk Moloch & AI
Neil
Nov 10, 2023, 2:04 PM
10
points
2
comments
1
min read
LW
link
(m.youtube.com)
Picking Mentors For Research Programmes
Raymond Douglas
Nov 10, 2023, 1:01 PM
105
points
8
comments
4
min read
LW
link
GPT-2030 and Catastrophic Drives: Four Vignettes
jsteinhardt
Nov 10, 2023, 7:30 AM
50
points
5
comments
10
min read
LW
link
(bounded-regret.ghost.io)
Crock, Crocker, Crockiest
Screwtape
Nov 10, 2023, 6:14 AM
21
points
4
comments
6
min read
LW
link
AI Timelines
habryka
,
Daniel Kokotajlo
,
Ajeya Cotra
and
Ege Erdil
Nov 10, 2023, 5:28 AM
300
points
136
comments
51
min read
LW
link
2
reviews
ACI#6: A Non-Dualistic ACI Model
Akira Pyinya
Nov 9, 2023, 11:01 PM
10
points
2
comments
6
min read
LW
link
How I got so excited about HowTruthful
Bruce Lewis
Nov 9, 2023, 6:49 PM
17
points
3
comments
5
min read
LW
link
The case for “Generous Tit for Tat” as the ultimate game theory strategy
positivesum
Nov 9, 2023, 6:41 PM
2
points
3
comments
8
min read
LW
link
(tryingtruly.substack.com)
Text Posts from the Kids Group: 2021
jefftk
Nov 9, 2023, 5:50 PM
38
points
1
comment
8
min read
LW
link
(www.jefftk.com)
AI #37: Moving Too Fast
Zvi
Nov 9, 2023, 5:50 PM
53
points
5
comments
76
min read
LW
link
(thezvi.wordpress.com)
Learning-theoretic agenda reading list
Vanessa Kosoy
Nov 9, 2023, 5:25 PM
103
points
1
comment
2
min read
LW
link
1
review
Open-ended/Phenomenal Ethics (TLDR)
Ryo
Nov 9, 2023, 4:58 PM
3
points
0
comments
1
min read
LW
link
Polysemantic Attention Head in a 4-Layer Transformer
Jett Janiak
,
cmathw
and
StefanHex
Nov 9, 2023, 4:16 PM
51
points
0
comments
6
min read
LW
link
On OpenAI Dev Day
Zvi
Nov 9, 2023, 4:10 PM
60
points
0
comments
15
min read
LW
link
(thezvi.wordpress.com)
Anthropical Probabilities Are Fully Explained by Difference in Possible Outcomes
Ape in the coat
Nov 9, 2023, 3:34 PM
19
points
7
comments
5
min read
LW
link
A free to enter, 240 character, open-source iterated prisoner’s dilemma tournament
Isaac King
Nov 9, 2023, 8:24 AM
64
points
19
comments
1
min read
LW
link
(manifold.markets)
Into AI Safety Episodes 1 & 2
jacobhaimes
Nov 9, 2023, 4:36 AM
2
points
0
comments
1
min read
LW
link
(into-ai-safety.github.io)
Making Bad Decisions On Purpose
Screwtape
Nov 9, 2023, 3:36 AM
49
points
8
comments
5
min read
LW
link
Metaculus’s New Sidebar Helps You Find Forecasts Faster
ChristianWilliams
Nov 8, 2023, 8:56 PM
15
points
0
comments
LW
link
(www.metaculus.com)
Open-ended ethics of phenomena (a desiderata with universal morality)
Ryo
Nov 8, 2023, 8:10 PM
1
point
0
comments
8
min read
LW
link
Open Agency model can solve the AI regulation dilemma
Roman Leventov
Nov 8, 2023, 8:00 PM
22
points
1
comment
2
min read
LW
link
Gothenburg LW / ACX meetup
Stefan
Nov 8, 2023, 7:52 PM
1
point
0
comments
1
min read
LW
link
[Question]
Why is lesswrong blocking wget and curl (scrape)?
nick lacombe
Nov 8, 2023, 7:42 PM
21
points
15
comments
1
min read
LW
link
[Question]
Is there a lesswrong archive of all public posts?
nick lacombe
Nov 8, 2023, 7:26 PM
12
points
7
comments
1
min read
LW
link
Five projects from AI Safety Hub Labs 2023
charlie_griffin
Nov 8, 2023, 7:19 PM
47
points
1
comment
6
min read
LW
link
(www.aisafetyhub.org)
[Question]
Can a stupid person become intelligent?
A. T.
Nov 8, 2023, 7:01 PM
12
points
24
comments
2
min read
LW
link
Prosthetic Intelligence
Krantz
Nov 8, 2023, 7:01 PM
7
points
9
comments
2
min read
LW
link
[Question]
Do you have a satisfactory workflow for learning about a line of research using GPT4, Claude, etc?
ryan_b
Nov 8, 2023, 6:05 PM
9
points
3
comments
1
min read
LW
link
What’s going on? LLMs and IS-A sentences
Bill Benzon
Nov 8, 2023, 4:58 PM
6
points
15
comments
4
min read
LW
link
[Question]
What will happen with real estate prices during a slow takeoff?
Ricardo Meneghin
Nov 8, 2023, 11:58 AM
8
points
1
comment
1
min read
LW
link
Tall Tales at Different Scales: Evaluating Scaling Trends For Deception In Language Models
Felix Hofstätter
,
Francis Rhys Ward
,
HarrietW
,
LAThomson
,
Ollie J
,
Patrik Bartak
and
Sam F. Brown
Nov 8, 2023, 11:37 AM
49
points
0
comments
18
min read
LW
link
How well does your research address the theory-practice gap?
Jonas Hallgren
Nov 8, 2023, 11:27 AM
18
points
0
comments
10
min read
LW
link
Growth and Form in a Toy Model of Superposition
Liam Carroll
and
Edmund Lau
Nov 8, 2023, 11:08 AM
90
points
7
comments
14
min read
LW
link
Running your own workshop on handling hostile disagreements
Camille Berger
Nov 8, 2023, 10:28 AM
12
points
1
comment
7
min read
LW
link
Thinking By The Clock
Screwtape
Nov 8, 2023, 7:40 AM
197
points
29
comments
8
min read
LW
link
1
review
[Question]
Impressions from base-GPT-4?
mishka
Nov 8, 2023, 5:43 AM
25
points
25
comments
1
min read
LW
link
Quantopian contest, but for food intake and weight
Lucent
Nov 8, 2023, 5:41 AM
40
points
9
comments
3
min read
LW
link
How I Think, Part Two: Distrusting Individuals
Richard Henage
Nov 8, 2023, 4:06 AM
4
points
6
comments
3
min read
LW
link
How I Think, Part One: Investing in Fun
Richard Henage
Nov 8, 2023, 4:00 AM
5
points
2
comments
5
min read
LW
link
Concrete positive visions for a future without AGI
Max H
Nov 8, 2023, 3:12 AM
41
points
28
comments
8
min read
LW
link
South Bay ACX/LW/EA Meetup & Vegansgiving Potluck
IS
Nov 8, 2023, 2:30 AM
10
points
0
comments
1
min read
LW
link
Progress links digest, 2023-11-07: Techno-optimism and more
jasoncrawford
Nov 8, 2023, 2:05 AM
17
points
7
comments
11
min read
LW
link
(rootsofprogress.org)
Announcing Athena—Women in AI Alignment Research
Claire Short
Nov 7, 2023, 9:46 PM
80
points
2
comments
3
min read
LW
link
Vote on Interesting Disagreements
Ben Pace
Nov 7, 2023, 9:35 PM
159
points
131
comments
1
min read
LW
link
What is democracy for?
Johnstone
Nov 7, 2023, 6:17 PM
−5
points
10
comments
7
min read
LW
link
Scalable And Transferable Black-Box Jailbreaks For Language Models Via Persona Modulation
Soroush Pour
,
rusheb
,
Quentin FEUILLADE--MONTIXI
,
Arush
and
scasper
Nov 7, 2023, 5:59 PM
38
points
2
comments
2
min read
LW
link
(arxiv.org)
Implementing Decision Theory
justinpombrio
Nov 7, 2023, 5:55 PM
22
points
12
comments
3
min read
LW
link
Mirror, Mirror on the Wall: How Do Forecasters Fare by Their Own Call?
nikos
Nov 7, 2023, 5:39 PM
14
points
5
comments
14
min read
LW
link
Symbiotic self-alignment of AIs.
Spiritus Dei
Nov 7, 2023, 5:18 PM
1
point
0
comments
3
min read
LW
link