Balsa Update and General Thank You
Zvi
Dec 12, 2023, 8:30 PM
61
points
8
comments
8
min read
LW
link
(thezvi.wordpress.com)
Towards an Ethics Calculator for Use by an AGI
sweenesm
Dec 12, 2023, 6:37 PM
3
points
2
comments
11
min read
LW
link
Why Psychologists Are Wrong About The Illusion Of Explanatory Depth
moses onyedikachukwu
Dec 12, 2023, 6:32 PM
1
point
0
comments
4
min read
LW
link
A design concept for superintelligent machines (and Popper’s critique of induction)
tiplur-bilrex
Dec 12, 2023, 6:31 PM
−7
points
6
comments
1
min read
LW
link
(tiplur-bilrex.tlon.network)
Significantly Enhancing Adult Intelligence With Gene Editing May Be Possible
GeneSmith
and
kman
Dec 12, 2023, 6:14 PM
459
points
206
comments
33
min read
LW
link
2
reviews
[Question]
Why No Automated Plagerism Detection For Past Papers?
Lao Mein
Dec 12, 2023, 5:24 PM
7
points
10
comments
1
min read
LW
link
OpenAI: Leaks Confirm the Story
Zvi
Dec 12, 2023, 2:00 PM
77
points
9
comments
16
min read
LW
link
(thezvi.wordpress.com)
Navigating the Attackspace
Jonas Kgomo
Dec 12, 2023, 1:59 PM
1
point
0
comments
2
min read
LW
link
Nonlinear’s Evidence: Debunking False and Misleading Claims
KatWoods
Dec 12, 2023, 1:16 PM
104
points
171
comments
LW
link
AI Institution Design Hackathon (EAG Bay Area Satellite Event)
beatrice@foresight.org
and
Allison Duettmann
Dec 12, 2023, 1:10 PM
1
point
0
comments
1
min read
LW
link
Funding case: AI Safety Camp 10
Remmelt
and
Linda Linsefors
Dec 12, 2023, 9:08 AM
66
points
5
comments
6
min read
LW
link
(manifund.org)
What is the next level of rationality?
lsusr
and
Yoav Ravid
Dec 12, 2023, 8:14 AM
48
points
24
comments
7
min read
LW
link
Embedded Agents are Quines
lsusr
and
DaemonicSigil
Dec 12, 2023, 4:57 AM
11
points
7
comments
8
min read
LW
link
Predict the future! Earn fake internet points! Get a (free) gambling addiction!
Robert Cousineau
Dec 12, 2023, 4:39 AM
3
points
0
comments
1
min read
LW
link
The likely first longevity drug is based on sketchy science. This is bad for science and bad for longevity.
BobBurgers
Dec 12, 2023, 2:42 AM
161
points
34
comments
5
min read
LW
link
When will GPT-5 come out? Prediction markets vs. Extrapolation
Malte
Dec 12, 2023, 2:41 AM
12
points
9
comments
3
min read
LW
link
On plans for a functional society
kave
and
Vaniver
Dec 12, 2023, 12:07 AM
41
points
8
comments
13
min read
LW
link
Secondary Risk Markets
Vaniver
Dec 11, 2023, 9:52 PM
35
points
4
comments
4
min read
LW
link
Has anyone experimented with Dodrio, a tool for exploring transformer models through interactive visualization?
Bill Benzon
Dec 11, 2023, 8:34 PM
4
points
0
comments
1
min read
LW
link
[Valence series] 3. Valence & Beliefs
Steven Byrnes
Dec 11, 2023, 8:21 PM
77
points
12
comments
21
min read
LW
link
1
review
[Question]
Am I ethically obligated to extend the life of my dog with life-extension treatments about to hit the market?
TrudosKudos
Dec 11, 2023, 7:41 PM
−3
points
2
comments
1
min read
LW
link
Adversarial Robustness Could Help Prevent Catastrophic Misuse
aog
Dec 11, 2023, 7:12 PM
30
points
18
comments
9
min read
LW
link
The Consciousness Box
GradualImprovement
Dec 11, 2023, 4:45 PM
33
points
24
comments
4
min read
LW
link
Empirical work that might shed light on scheming (Section 6 of “Scheming AIs”)
Joe Carlsmith
Dec 11, 2023, 4:30 PM
8
points
0
comments
21
min read
LW
link
Into AI Safety: Episode 3
jacobhaimes
Dec 11, 2023, 4:30 PM
6
points
0
comments
1
min read
LW
link
(into-ai-safety.github.io)
Implicitly Typed C
jefftk
Dec 11, 2023, 4:10 PM
16
points
0
comments
1
min read
LW
link
(www.jefftk.com)
37C3 Hacker x Rationalist Meetup
Kiboneu
and
ctrltab
Dec 11, 2023, 4:02 PM
5
points
5
comments
1
min read
LW
link
re: Yudkowsky on biological materials
bhauth
Dec 11, 2023, 1:28 PM
182
points
30
comments
5
min read
LW
link
Ideoculture
elv
Dec 11, 2023, 10:29 AM
8
points
2
comments
6
min read
LW
link
Quick thoughts on the implications of multi-agent views of mind on AI takeover
Kaj_Sotala
Dec 11, 2023, 6:34 AM
47
points
14
comments
4
min read
LW
link
Auditing failures vs concentrated failures
ryan_greenblatt
and
Fabien Roger
Dec 11, 2023, 2:47 AM
47
points
1
comment
7
min read
LW
link
1
review
Deeply Cover Car Crashes?
jefftk
Dec 10, 2023, 10:20 PM
36
points
32
comments
1
min read
LW
link
(www.jefftk.com)
Principles For Product Liability (With Application To AI)
johnswentworth
Dec 10, 2023, 9:27 PM
37
points
55
comments
10
min read
LW
link
[Question]
What do you do to remember and reference the LessWrong posts that were most personally significant to you, in terms of intellectual development or general usefulness?
lillybaeum
Dec 10, 2023, 5:52 PM
5
points
7
comments
1
min read
LW
link
[Question]
Do websites and apps actually generally get worse after updates, or is it just an effect of the fear of change?
lillybaeum
Dec 10, 2023, 5:26 PM
36
points
35
comments
2
min read
LW
link
1
review
How LDT helps reduce the AI arms race
Tamsin Leake
Dec 10, 2023, 4:21 PM
65
points
13
comments
4
min read
LW
link
(carado.moe)
Understanding Subjective Probabilities
Isaac King
Dec 10, 2023, 6:03 AM
31
points
16
comments
10
min read
LW
link
Send us example gnarly bugs
Beth Barnes
,
Megan Kinniment
and
Tao Lin
Dec 10, 2023, 5:23 AM
77
points
10
comments
2
min read
LW
link
Conceptual coherence for concrete categories in humans and LLMs
Bill Benzon
Dec 9, 2023, 11:49 PM
13
points
1
comment
2
min read
LW
link
2d ai-partners as a comprehensive motivation tool
AiresJL
Dec 9, 2023, 9:59 PM
3
points
0
comments
1
min read
LW
link
Without—MicroFiction 250 words
Carissa Cassiel
Dec 9, 2023, 9:49 PM
20
points
1
comment
1
min read
LW
link
Some negative steganography results
Fabien Roger
Dec 9, 2023, 8:22 PM
60
points
5
comments
2
min read
LW
link
Summing up “Scheming AIs” (Section 5)
Joe Carlsmith
9 Dec 2023 15:48 UTC
2
points
1
comment
11
min read
LW
link
The Offense-Defense Balance Rarely Changes
Maxwell Tabarrok
9 Dec 2023 15:21 UTC
77
points
23
comments
3
min read
LW
link
(maximumprogress.substack.com)
A Philosophical Tautology
Nox ML
9 Dec 2023 14:06 UTC
−2
points
45
comments
2
min read
LW
link
Unpicking Extinction
ukc10014
9 Dec 2023 9:15 UTC
35
points
10
comments
10
min read
LW
link
Finding Sparse Linear Connections between Features in LLMs
Logan Riggs
,
Sam Mitchell
and
Adam Kaufman
9 Dec 2023 2:27 UTC
70
points
5
comments
10
min read
LW
link
[Question]
Option Space Nomenclature
SilverFlame
8 Dec 2023 23:14 UTC
1
point
0
comments
1
min read
LW
link
“Model UN Solutions”
Arjun Panickssery
8 Dec 2023 23:06 UTC
36
points
5
comments
1
min read
LW
link
(open.substack.com)
Speed arguments against scheming (Section 4.4-4.7 of “Scheming AIs”)
Joe Carlsmith
8 Dec 2023 21:09 UTC
9
points
0
comments
15
min read
LW
link