Toki pona FAQ
dkl9
Mar 17, 2024, 9:44 PM
37
points
9
comments
1
min read
LW
link
(dkl9.net)
EA ErFiN Project work
Max_He-Ho
Mar 17, 2024, 8:42 PM
2
points
0
comments
1
min read
LW
link
EA ErFiN Project work
Max_He-Ho
Mar 17, 2024, 8:37 PM
2
points
0
comments
1
min read
LW
link
[Question]
Alice and Bob is debating on a technique. Alice says Bob should try it before denying it. Is it a fallacy or something similar?
Ooker
Mar 17, 2024, 8:01 PM
0
points
19
comments
2
min read
LW
link
Is there a way to calculate the P(we are in a 2nd cold war)?
cloak
Mar 17, 2024, 8:01 PM
−9
points
2
comments
1
min read
LW
link
The Worst Form Of Government (Except For Everything Else We’ve Tried)
johnswentworth
Mar 17, 2024, 6:11 PM
135
points
47
comments
4
min read
LW
link
Applying simulacrum levels to hobbies, interests and goals
DMMF
Mar 17, 2024, 4:18 PM
15
points
2
comments
4
min read
LW
link
(danfrank.ca)
What is the best argument that LLMs are shoggoths?
JoshuaFox
Mar 17, 2024, 11:36 AM
26
points
22
comments
1
min read
LW
link
Invitation to the Princeton AI Alignment and Safety Seminar
Sadhika Malladi
Mar 17, 2024, 1:10 AM
6
points
1
comment
1
min read
LW
link
Anxiety vs. Depression
Sable
Mar 17, 2024, 12:15 AM
86
points
35
comments
3
min read
LW
link
(affablyevil.substack.com)
Celiefs
TheLemmaLlama
Mar 16, 2024, 11:56 PM
3
points
8
comments
1
min read
LW
link
My PhD thesis: Algorithmic Bayesian Epistemology
Eric Neyman
Mar 16, 2024, 10:56 PM
262
points
14
comments
7
min read
LW
link
(arxiv.org)
How people stopped dying from diarrhea so much (& other life-saving decisions)
Writer
Mar 16, 2024, 4:00 PM
45
points
0
comments
LW
link
(youtu.be)
Transformative trustbuilding via advancements in decentralized lie detection
trevor
Mar 16, 2024, 5:56 AM
20
points
10
comments
38
min read
LW
link
(www.ncbi.nlm.nih.gov)
Enter the WorldsEnd
Akram Choudhary
Mar 16, 2024, 1:34 AM
−25
points
8
comments
1
min read
LW
link
Strong-Misalignment: Does Yudkowsky (or Christiano, or TurnTrout, or Wolfram, or…etc.) Have an Elevator Speech I’m Missing?
Benjamin Bourlier
Mar 15, 2024, 11:17 PM
−4
points
3
comments
16
min read
LW
link
Introducing METR’s Autonomy Evaluation Resources
Megan Kinniment
and
Beth Barnes
Mar 15, 2024, 11:16 PM
90
points
0
comments
1
min read
LW
link
(metr.github.io)
Are AIs conscious? It might depend
Logan Zoellner
Mar 15, 2024, 11:09 PM
6
points
6
comments
3
min read
LW
link
Beyond Maxipok — good reflective governance as a target for action
owencb
Mar 15, 2024, 10:22 PM
20
points
0
comments
LW
link
Middle Child Phenomenon
PhilosophicalSoul
Mar 15, 2024, 8:47 PM
3
points
3
comments
2
min read
LW
link
Capability or Alignment? Respect the LLM Base Model’s Capability During Alignment
Jingfeng Yang
Mar 15, 2024, 5:56 PM
7
points
0
comments
24
min read
LW
link
Rational Animations offers animation production and writing services!
Writer
Mar 15, 2024, 5:26 PM
33
points
0
comments
1
min read
LW
link
Improving SAE’s by Sqrt()-ing L1 & Removing Lowest Activating Features
Logan Riggs
and
Jannik Brinkmann
Mar 15, 2024, 4:30 PM
26
points
5
comments
4
min read
LW
link
Stuttgart, Germany—ACX Spring Meetups Everywhere 2024
Benjamin R
Mar 15, 2024, 2:59 PM
2
points
1
comment
1
min read
LW
link
Controlling AGI Risk
TeaSea
Mar 15, 2024, 4:56 AM
6
points
8
comments
4
min read
LW
link
Ulm, Germany—ACX Spring Meetups Everywhere 2024
Benjamin R
Mar 15, 2024, 1:32 AM
2
points
1
comment
1
min read
LW
link
Newport News/ Virginia ACX Meetup
Daniel
Mar 14, 2024, 11:46 PM
1
point
0
comments
1
min read
LW
link
Constructive Cauchy sequences vs. Dedekind cuts
jessicata
Mar 14, 2024, 11:04 PM
47
points
23
comments
4
min read
LW
link
(unstableontology.com)
A Nail in the Coffin of Exceptionalism
Yeshua God
Mar 14, 2024, 10:41 PM
−17
points
0
comments
3
min read
LW
link
Toward a Broader Conception of Adverse Selection
Ricki Heicklen
Mar 14, 2024, 10:40 PM
177
points
61
comments
13
min read
LW
link
(bayesshammai.substack.com)
More people getting into AI safety should do a PhD
AdamGleave
Mar 14, 2024, 10:14 PM
61
points
24
comments
12
min read
LW
link
(gleave.me)
Collection (Part 6 of “The Sense Of Physical Necessity”)
LoganStrohl
Mar 14, 2024, 9:37 PM
28
points
0
comments
8
min read
LW
link
Fixed point or oscillate or noise
lemonhope
Mar 14, 2024, 6:37 PM
3
points
10
comments
1
min read
LW
link
How useful is “AI Control” as a framing on AI X-Risk?
habryka
and
ryan_greenblatt
Mar 14, 2024, 6:06 PM
70
points
4
comments
34
min read
LW
link
Sparse autoencoders find composed features in small toy models
Evan Anders
,
Clement Neo
,
Jason Hoelscher-Obermaier
and
Jessica N. Howard
Mar 14, 2024, 6:00 PM
33
points
12
comments
15
min read
LW
link
AI #55: Keep Clauding Along
Zvi
Mar 14, 2024, 3:40 PM
62
points
16
comments
70
min read
LW
link
(thezvi.wordpress.com)
To the average human, controlled AI is just as lethal as ‘misaligned’ AI
YonatanK
Mar 14, 2024, 2:52 PM
6
points
20
comments
5
min read
LW
link
Claude vs GPT
Maxwell Tabarrok
Mar 14, 2024, 12:41 PM
12
points
2
comments
2
min read
LW
link
(www.maximum-progress.com)
A brief review of China’s AI industry and regulations
Elliot Mckernon
Mar 14, 2024, 12:19 PM
24
points
0
comments
16
min read
LW
link
[Question]
Can any LLM be represented as an Equation?
Valentin Baltadzhiev
Mar 14, 2024, 9:51 AM
1
point
2
comments
1
min read
LW
link
‘Empiricism!’ as Anti-Epistemology
Eliezer Yudkowsky
Mar 14, 2024, 2:02 AM
171
points
92
comments
25
min read
LW
link
How I turned doing therapy into object-level AI safety research
Chipmonk
Mar 14, 2024, 1:54 AM
15
points
5
comments
4
min read
LW
link
Opportunistic Time-Management
Richard Henage
Mar 13, 2024, 9:38 PM
13
points
2
comments
1
min read
LW
link
AI governance and strategy: a list of research agendas and work that could be done.
NathanBarnard
and
Erin Robertson
13 Mar 2024 21:23 UTC
7
points
1
comment
17
min read
LW
link
Highlights from Lex Fridman’s interview of Yann LeCun
Joel Burget
13 Mar 2024 20:58 UTC
48
points
15
comments
41
min read
LW
link
On the Latest TikTok Bill
Zvi
13 Mar 2024 18:50 UTC
58
points
7
comments
29
min read
LW
link
(thezvi.wordpress.com)
[Question]
Recommended book for a balanced take and lessons learned from covid pandemic response
Martin Hare Robertson
13 Mar 2024 18:14 UTC
4
points
0
comments
1
min read
LW
link
ACX/LW Seattle spring meetup 2024
nsokolsky
13 Mar 2024 17:24 UTC
12
points
3
comments
1
min read
LW
link
Laying the Foundations for Vision and Multimodal Mechanistic Interpretability & Open Problems
Sonia Joseph
and
Neel Nanda
13 Mar 2024 17:09 UTC
44
points
13
comments
14
min read
LW
link
I was raised by devout Mormons, AMA [&|] Soliciting Advice
ErioirE
13 Mar 2024 16:52 UTC
31
points
41
comments
2
min read
LW
link