Betting and forecasting
CarlJ
Sep 9, 2023, 8:03 PM
2
points
0
comments
1
min read
LW
link
AI presidents discuss AI alignment agendas
TurnTrout
and
Garrett Baker
Sep 9, 2023, 6:55 PM
217
points
23
comments
1
min read
LW
link
(www.youtube.com)
Probabilistic argument relationships and an invitation to the argument mapping community
lunatic_at_large
Sep 9, 2023, 6:45 PM
13
points
4
comments
10
min read
LW
link
How teams went about their research at AI Safety Camp edition 8
Remmelt
,
Linda Linsefors
and
Kristi Uustalu
Sep 9, 2023, 4:34 PM
28
points
0
comments
13
min read
LW
link
Panel discussion on AI consciousness with Rob Long and Jeff Sebo
Aaron Bergman
Sep 9, 2023, 3:38 AM
10
points
0
comments
LW
link
(www.youtube.com)
Possible Divergence in AGI Risk Tolerance between Selfish and Altruistic agents
Brad West
Sep 9, 2023, 12:23 AM
1
point
1
comment
2
min read
LW
link
Capture the Flag Mechanistic Interpretability Challenges
Alejandro Acelas
and
Alexandre Variengien
Sep 8, 2023, 11:00 PM
24
points
0
comments
7
min read
LW
link
[Question]
What is to be done? (About the profit motive)
Connor Barber
Sep 8, 2023, 7:27 PM
1
point
21
comments
1
min read
LW
link
What is the optimal frontier for due diligence?
RobertM
and
Ruby
Sep 8, 2023, 6:20 PM
41
points
1
comment
1
min read
LW
link
Progress links digest, 2023-09-08: The Conservative Futurist, cargo airships, and more
jasoncrawford
Sep 8, 2023, 5:48 PM
14
points
7
comments
5
min read
LW
link
(rootsofprogress.org)
The AI apocalypse myth.
Spiritus Dei
Sep 8, 2023, 5:43 PM
−22
points
12
comments
2
min read
LW
link
Sum-threshold attacks
TsviBT
Sep 8, 2023, 5:13 PM
238
points
55
comments
10
min read
LW
link
(tsvibt.blogspot.com)
Debate series: should we push for a pause on the development of AI?
Xodarap
Sep 8, 2023, 4:29 PM
39
points
1
comment
LW
link
AI Probability Trees—Joe Carlsmith (2022)
Nathan Young
Sep 8, 2023, 3:40 PM
12
points
1
comment
8
min read
LW
link
Invading Australia (Endless Formerlies Most Beautiful, or What I Learned On My Holiday)
Oliver Sourbut
Sep 8, 2023, 3:33 PM
12
points
1
comment
8
min read
LW
link
(www.oliversourbut.net)
Explaining grokking through circuit efficiency
Vikrant Varma
and
Rohin Shah
Sep 8, 2023, 2:39 PM
101
points
11
comments
3
min read
LW
link
(arxiv.org)
Have Attention Spans Been Declining?
niplav
Sep 8, 2023, 2:11 PM
71
points
22
comments
17
min read
LW
link
1
review
Explained Simply: Quantilizers
brook
Sep 8, 2023, 12:54 PM
15
points
5
comments
LW
link
(aisafetyexplained.substack.com)
Crossing the Rubicon.
Spiritus Dei
Sep 8, 2023, 4:19 AM
−4
points
5
comments
13
min read
LW
link
[Question]
What EY and LessWrong meant when (fill in the blank) found them.
Bill Benzon
Sep 8, 2023, 1:42 AM
1
point
0
comments
1
min read
LW
link
Bring back the Colosseums
lc
Sep 8, 2023, 12:09 AM
18
points
28
comments
1
min read
LW
link
The Löbian Obstacle, And Why You Should Care
lukemarks
Sep 7, 2023, 11:59 PM
18
points
6
comments
2
min read
LW
link
Science to Be Done Internationally Using Blockchain
Victor Porton
Sep 7, 2023, 11:29 PM
−18
points
0
comments
2
min read
LW
link
(science-dao.org)
A quick update from Nonlinear
KatWoods
Sep 7, 2023, 9:28 PM
72
points
23
comments
2
min read
LW
link
[Linkpost] Frontier AI Taskforce: first progress report
Paul Colognese
Sep 7, 2023, 7:06 PM
21
points
0
comments
4
min read
LW
link
(www.gov.uk)
[Question]
How did you make your way back from meta?
matto
Sep 7, 2023, 5:23 PM
23
points
27
comments
1
min read
LW
link
AI#28: Watching and Waiting
Zvi
Sep 7, 2023, 5:20 PM
52
points
14
comments
45
min read
LW
link
(thezvi.wordpress.com)
[Question]
Measure of complexity allowed by the laws of the universe and relative theory?
dr_s
Sep 7, 2023, 12:21 PM
8
points
22
comments
1
min read
LW
link
Recreating the caring drive
Catnee
Sep 7, 2023, 10:41 AM
43
points
15
comments
10
min read
LW
link
1
review
Sharing Information About Nonlinear
Ben Pace
Sep 7, 2023, 6:51 AM
323
points
323
comments
34
min read
LW
link
Weekly Incidence vs Cumulative Infections
jefftk
Sep 7, 2023, 2:30 AM
13
points
6
comments
1
min read
LW
link
(www.jefftk.com)
Improving Mathematical Accuracy in LLMs—History − 1
Abhay Chowdhry
Sep 7, 2023, 1:58 AM
5
points
1
comment
9
min read
LW
link
Breaking RLHF “Safety” (And how to fix it?)
MPotter
Sep 7, 2023, 1:58 AM
3
points
0
comments
4
min read
LW
link
Feedback-loops, Deliberate Practice, and Transfer Learning
Bird Concept
and
Raemon
Sep 7, 2023, 1:57 AM
46
points
5
comments
1
min read
LW
link
Video essay: How Will We Know When AI is Conscious?
JanPro
Sep 6, 2023, 6:10 PM
11
points
7
comments
1
min read
LW
link
(www.youtube.com)
My First Post
Jaivardhan Nawani
Sep 6, 2023, 5:42 PM
35
points
9
comments
1
min read
LW
link
ActAdd: Steering Language Models without Optimization
technicalities
,
TurnTrout
,
lisathiergart
,
David Udell
,
Ulisse Mini
and
Monte M
Sep 6, 2023, 5:21 PM
105
points
3
comments
2
min read
LW
link
(arxiv.org)
Monthly Roundup #10: September 2023
Zvi
Sep 6, 2023, 1:20 PM
35
points
4
comments
56
min read
LW
link
(thezvi.wordpress.com)
Find Hot French Food Near Me: A Follow-up
aphyer
Sep 6, 2023, 12:32 PM
75
points
19
comments
2
min read
LW
link
Manifest 2023
Saul Munn
and
Austin Chen
Sep 6, 2023, 11:24 AM
3
points
0
comments
1
min read
LW
link
Last Chance: Get tickets to Manifest 2023! (Sep 22-24 in Berkeley)
Saul Munn
and
Austin Chen
Sep 6, 2023, 10:35 AM
5
points
0
comments
1
min read
LW
link
What I’ve been reading, September 2023
jasoncrawford
Sep 6, 2023, 9:32 AM
17
points
0
comments
5
min read
LW
link
(rootsofprogress.org)
Decision Theory: A (Normative) Introduction
Pareto Optimal
Sep 6, 2023, 8:22 AM
−1
points
1
comment
3
min read
LW
link
(paretooptimal.substack.com)
[Question]
What’s the easiest way to make a luminator?
kuira
Sep 6, 2023, 12:07 AM
7
points
13
comments
1
min read
LW
link
Ordinary claims require ordinary evidence
blake8086
Sep 5, 2023, 10:09 PM
1
point
3
comments
2
min read
LW
link
Conversation about paradigms, intellectual progress, social consensus, and AI
Ruby
and
RobertM
Sep 5, 2023, 9:30 PM
14
points
6
comments
1
min read
LW
link
What I would do if I wasn’t at ARC Evals
LawrenceC
Sep 5, 2023, 7:19 PM
220
points
10
comments
13
min read
LW
link
1
review
The Evolutionary Pathway from Biological to Digital Intelligence: A Cosmic Perspective
George360
Sep 5, 2023, 5:47 PM
−17
points
0
comments
4
min read
LW
link
The Illusion of Universal Morality: A Dynamic Perspective on Genetic Fitness and Ethical Complexity
George360
Sep 5, 2023, 5:47 PM
−9
points
7
comments
2
min read
LW
link
Benchmarks for Detecting Measurement Tampering [Redwood Research]
ryan_greenblatt
and
Fabien Roger
Sep 5, 2023, 4:44 PM
87
points
22
comments
20
min read
LW
link
1
review
(arxiv.org)