What You Can Give Instead of Advice
Karl Faulks
Oct 24, 2024, 11:10 PM
13
points
2
comments
1
min read
LW
link
[Question]
is it possible to comment anonymously on a post?
KvmanThinking
Oct 24, 2024, 10:24 PM
2
points
2
comments
1
min read
LW
link
Logical Proof for the Emergence and Substrate Independence of Sentience
rife
Oct 24, 2024, 9:08 PM
4
points
31
comments
1
min read
LW
link
(awakenmoon.ai)
Against Job Boards: Human Capital and the Legibility Trap
vaishnav92
Oct 24, 2024, 8:50 PM
6
points
1
comment
5
min read
LW
link
IAPS: Mapping Technical Safety Research at AI Companies
Zach Stein-Perlman
Oct 24, 2024, 8:30 PM
42
points
13
comments
LW
link
(www.iaps.ai)
Our Digital and Biological Children
Eneasz
Oct 24, 2024, 6:36 PM
28
points
0
comments
3
min read
LW
link
(deathisbad.substack.com)
Reflections on the Metastrategies Workshop
gw
Oct 24, 2024, 6:30 PM
41
points
5
comments
11
min read
LW
link
How Should We Measure Intelligence Models: Why Use Frequency of Elemental Information Operations
hwj20
Oct 24, 2024, 4:54 PM
1
point
0
comments
5
min read
LW
link
Meta AI (FAIR) latest paper integrates system-1 and system-2 thinking into reasoning models.
happy friday
Oct 24, 2024, 4:54 PM
8
points
0
comments
1
min read
LW
link
Balancing Label Quantity and Quality for Scalable Elicitation
Alex Mallen
Oct 24, 2024, 4:49 PM
31
points
1
comment
2
min read
LW
link
Claude Sonnet 3.5.1 and Haiku 3.5
Zvi
Oct 24, 2024, 2:50 PM
51
points
9
comments
16
min read
LW
link
(thezvi.wordpress.com)
Big tech transitions are slow (with implications for AI)
jasoncrawford
Oct 24, 2024, 2:25 PM
36
points
16
comments
4
min read
LW
link
(blog.rootsofprogress.org)
Derivative AT a discontinuity
Alok Singh
Oct 24, 2024, 2:48 AM
9
points
5
comments
10
min read
LW
link
how to rapidly assimilate new information
dhruvmethi
Oct 24, 2024, 2:18 AM
9
points
3
comments
8
min read
LW
link
Ex-OpenAI researcher says OpenAI mass-violated copyright law
Remmelt
Oct 24, 2024, 1:00 AM
0
points
0
comments
LW
link
(suchir.net)
Miles Brundage resigned from OpenAI, and his AGI readiness team was disbanded
garrison
Oct 23, 2024, 11:40 PM
118
points
1
comment
7
min read
LW
link
(garrisonlovely.substack.com)
A metaphor: what “green lights” for AGI would look like
Lorec
Oct 23, 2024, 11:24 PM
−1
points
6
comments
2
min read
LW
link
Motte-and-Bailey: a Short Explanation
Lorec
Oct 23, 2024, 10:29 PM
12
points
0
comments
1
min read
LW
link
Self-prediction acts as an emergent regularizer
Cameron Berg, Judd Rosenblatt, Mike Vaiana, Diogo de Lucena, florin_pop and AE Studio
Oct 23, 2024, 10:27 PM
91
points
9
comments
4
min read
LW
link
Technical Risks of (Lethal) Autonomous Weapons Systems
Heramb
Oct 23, 2024, 8:41 PM
2
points
0
comments
1
min read
LW
link
(encodejustice.org)
Appealing to the Public
jefftk
Oct 23, 2024, 7:00 PM
16
points
0
comments
5
min read
LW
link
(www.jefftk.com)
Introducing Transluce — A Letter from the Founders
jsteinhardt
Oct 23, 2024, 6:10 PM
74
points
3
comments
3
min read
LW
link
(bounded-regret.ghost.io)
Are we dropping the ball on Recommendation AIs?
Charbel-Raphaël
Oct 23, 2024, 5:48 PM
41
points
17
comments
6
min read
LW
link
A bird’s eye view of ARC’s research
Jacob_Hilton
Oct 23, 2024, 3:50 PM
121
points
12
comments
7
min read
LW
link
(www.alignment.org)
[Question]
Artificial V/S Organoid Intelligence
10xyz
Oct 23, 2024, 2:31 PM
9
points
0
comments
1
min read
LW
link
AI safety tax dynamics
owencb
Oct 23, 2024, 12:18 PM
22
points
0
comments
6
min read
LW
link
(strangecities.substack.com)
What is malevolence? On the nature, measurement, and distribution of dark traits
David Althaus, Chi Nguyen and Clare
Oct 23, 2024, 8:41 AM
93
points
23
comments
LW
link
Join a LessWrong Team for the Unaging System Challenge
Crissman
Oct 23, 2024, 6:01 AM
15
points
5
comments
1
min read
LW
link
Word Spaghetti
Gordon Seidoh Worley
Oct 23, 2024, 5:39 AM
19
points
9
comments
3
min read
LW
link
Monosemanticity & Quantization
Rahul Chand
Oct 22, 2024, 10:57 PM
1
point
0
comments
9
min read
LW
link
[Question]
What is the alpha in one bit of evidence?
J Bostock
Oct 22, 2024, 9:57 PM
20
points
13
comments
1
min read
LW
link
Catastrophic sabotage as a major threat model for human-level AI systems
evhub
Oct 22, 2024, 8:57 PM
92
points
13
comments
15
min read
LW
link
Why I quit effective altruism, and why Timothy Telleen-Lawton is staying (for now)
Elizabeth
Oct 22, 2024, 6:20 PM
76
points
82
comments
1
min read
LW
link
(acesounderglass.com)
Decision-Making Under Uncertainty: Lessons From AI
Jonasb
Oct 22, 2024, 5:54 PM
−1
points
0
comments
5
min read
LW
link
(www.denominations.io)
Testing Genetic Engineering Detection with Spike-Ins
jefftk
Oct 22, 2024, 5:20 PM
9
points
0
comments
LW
link
(naobservatory.org)
Predictions as Public Works Project — What Metaculus Is Building Next
ChristianWilliams
Oct 22, 2024, 4:35 PM
5
points
0
comments
LW
link
(www.metaculus.com)
Gorges of gender on a terrain of traits
dkl9
Oct 22, 2024, 4:18 PM
−7
points
1
comment
3
min read
LW
link
(dkl9.net)
A Defense of Peer Review
Niko_McCarty and delton137
Oct 22, 2024, 4:16 PM
23
points
1
comment
22
min read
LW
link
(www.asimov.press)
BIG-Bench Canary Contamination in GPT-4
Jozdien
Oct 22, 2024, 3:40 PM
125
points
14
comments
4
min read
LW
link
[Paper Blogpost] When Your AIs Deceive You: Challenges with Partial Observability in RLHF
Leon Lang
Oct 22, 2024, 1:57 PM
51
points
2
comments
18
min read
LW
link
(arxiv.org)
[Intuitive self-models] 6. Awakening / Enlightenment / PNSE
Steven Byrnes
Oct 22, 2024, 1:23 PM
64
points
8
comments
21
min read
LW
link
Resolving von Neumann-Morgenstern Inconsistent Preferences
niplav
Oct 22, 2024, 11:45 AM
38
points
5
comments
58
min read
LW
link
Lenses of Control
WillPetillo
Oct 22, 2024, 7:51 AM
14
points
0
comments
9
min read
LW
link
A Brief Explanation of AI Control
Aaron_Scher
Oct 22, 2024, 7:00 AM
8
points
1
comment
6
min read
LW
link
Longevity, AI, and Cognitive Research Hackathon @ MIT
ekkolápto
Oct 22, 2024, 6:19 AM
1
point
0
comments
1
min read
LW
link
Conversational Signposts—How to stop having boring social interactions
Declan Molony
Oct 22, 2024, 5:37 AM
11
points
6
comments
2
min read
LW
link
I got dysentery so you don’t have to
eukaryote
Oct 22, 2024, 4:55 AM
321
points
6
comments
17
min read
LW
link
(eukaryotewritesblog.com)
Transformers Explained (Again)
RohanS
Oct 22, 2024, 4:06 AM
4
points
0
comments
18
min read
LW
link
Sleeping on Stage
jefftk
Oct 22, 2024, 12:50 AM
26
points
3
comments
1
min read
LW
link
(www.jefftk.com)
The Mask Comes Off: At What Price?
Zvi
Oct 21, 2024, 11:50 PM
72
points
16
comments
8
min read
LW
link
(thezvi.wordpress.com)