Archive: July 2023
Taboo Truth · Tomás B. · Jul 8, 2023, 11:23 PM · 36 points · 16 comments · 2 min read · LW link
“View” · herschel · Jul 8, 2023, 11:19 PM · 6 points · 0 comments · 2 min read · LW link
[Question] H5N1. Just how bad is the situation? · Q Home · Jul 8, 2023, 10:09 PM · 16 points · 8 comments · 1 min read · LW link
A Two-Part System for Practical Self-Care · Jonathan Moregård · Jul 8, 2023, 9:23 PM · 11 points · 0 comments · 3 min read · LW link · (honestliving.substack.com)
Really Strong Features Found in Residual Stream · Logan Riggs · Jul 8, 2023, 7:40 PM · 69 points · 6 comments · 2 min read · LW link
Eight Strategies for Tackling the Hard Part of the Alignment Problem · scasper · Jul 8, 2023, 6:55 PM · 42 points · 11 comments · 7 min read · LW link
“Concepts of Agency in Biology” (Okasha, 2023) - Brief Paper Summary · Nora_Ammann · Jul 8, 2023, 6:22 PM · 40 points · 3 comments · 7 min read · LW link
Blanchard’s Dangerous Idea and the Plight of the Lucid Crossdreamer · Zack_M_Davis · Jul 8, 2023, 6:03 PM · 38 points · 135 comments · 72 min read · LW link · (unremediatedgender.space)
Continuous Adversarial Quality Assurance: Extending RLHF and Constitutional AI · Benaya Koren · Jul 8, 2023, 5:32 PM · 6 points · 0 comments · 9 min read · LW link
Commentless downvoting is not a good way to fight infohazards · DirectedEvolution · Jul 8, 2023, 5:29 PM · 6 points · 9 comments · 3 min read · LW link
[Question] Why does anxiety (?) make me dumb? · TeaTieAndHat · Jul 8, 2023, 4:13 PM · 18 points · 14 comments · 3 min read · LW link
Economic Time Bomb: An Overlooked Employment Bubble Threatening the US Economy · Glenn Clayton · Jul 8, 2023, 3:19 PM · 4 points · 10 comments · 6 min read · LW link
What is everyone doing in AI governance · Igor Ivanov · Jul 8, 2023, 3:16 PM · 12 points · 0 comments · 5 min read · LW link
LLM misalignment can probably be found without manual prompt engineering · ProgramCrafter · Jul 8, 2023, 2:35 PM · 1 point · 0 comments · 1 min read · LW link
You must not fool yourself, and you are the easiest person to fool · Richard_Ngo · Jul 8, 2023, 2:05 PM · 35 points · 5 comments · 4 min read · LW link
Fixed Point: a love story · Richard_Ngo · Jul 8, 2023, 1:56 PM · 99 points · 2 comments · 7 min read · LW link
Announcing AI Alignment workshop at the ALIFE 2023 conference · rorygreig · Jul 8, 2023, 1:52 PM · 16 points · 0 comments · 1 min read · LW link · (humanvaluesandartificialagency.com)
3D Printed Talkbox Cap · jefftk · Jul 8, 2023, 1:00 PM · 9 points · 0 comments · 1 min read · LW link · (www.jefftk.com)
Writing this post as rationality case study · Ben Amitay · Jul 8, 2023, 12:24 PM · 10 points · 8 comments · 2 min read · LW link
[Question] What Does LessWrong/EA Think of Human Intelligence Augmentation as of mid-2023? · lukemarks · Jul 8, 2023, 11:42 AM · 84 points · 28 comments · 2 min read · LW link
[Question] Request for feedback—infohazards in testing LLMs for causal reasoning? · DirectedEvolution · Jul 8, 2023, 9:01 AM · 16 points · 0 comments · 2 min read · LW link
Views on when AGI comes and on strategy to reduce existential risk · TsviBT · Jul 8, 2023, 9:00 AM · 133 points · 61 comments · 14 min read · LW link · 1 review
Weekday Evening Beach Picnics · jefftk · Jul 8, 2023, 2:20 AM · 2 points · 4 comments · 1 min read · LW link · (www.jefftk.com)
ACI#4: Seed AI is the new Perpetual Motion Machine · Akira Pyinya · Jul 8, 2023, 1:17 AM · −1 points · 0 comments · 6 min read · LW link
[Question] Links to discussions on social equilibrium and human value after (aligned) super-AI? · Michael Tontchev · Jul 8, 2023, 1:01 AM · 7 points · 3 comments · 1 min read · LW link
Notes from the Qatar Center for Global Banking and Finance 3rd Annual Conference · PixelatedPenguin · Jul 7, 2023, 11:48 PM · 2 points · 0 comments · 1 min read · LW link
Introducing bayescalc.io · Adele Lopez · Jul 7, 2023, 4:11 PM · 115 points · 29 comments · 1 min read · LW link · (bayescalc.io)
Meetup Tip: Ask Attendees To Explain It · Screwtape · Jul 7, 2023, 4:08 PM · 10 points · 0 comments · 4 min read · LW link
Interpreting Modular Addition in MLPs · Bart Bussmann · Jul 7, 2023, 9:22 AM · 20 points · 0 comments · 6 min read · LW link
Internal independent review for language model agent alignment · Seth Herd · Jul 7, 2023, 6:54 AM · 55 points · 30 comments · 11 min read · LW link
[Question] Can LessWrong provide me with something I find obviously highly useful to my own practical life? · agrippa · Jul 7, 2023, 3:08 AM · 32 points · 4 comments · 1 min read · LW link
ask me about technology · bhauth · Jul 7, 2023, 2:03 AM · 23 points · 42 comments · 1 min read · LW link
Apparently, of the 195 Million the DoD allocated in University Research Funding Awards in 2022, more than half of them concerned AI or compute hardware research · mako yass · Jul 7, 2023, 1:20 AM · 41 points · 5 comments · 2 min read · LW link · (www.defense.gov)
What are the best non-LW places to read on alignment progress? · Raemon · Jul 7, 2023, 12:57 AM · 50 points · 14 comments · 1 min read · LW link
Two paths to win the AGI transition · Nathan Helm-Burger · Jul 6, 2023, 9:59 PM · 11 points · 8 comments · 4 min read · LW link
Empirical Evidence Against “The Longest Training Run” · NickGabs · Jul 6, 2023, 6:32 PM · 31 points · 0 comments · 14 min read · LW link
Progress Studies Fellowship looking for members · jay ram · Jul 6, 2023, 5:41 PM · 3 points · 0 comments · 1 min read · LW link
BOUNTY AVAILABLE: AI ethicists, what are your object-level arguments against AI notkilleveryoneism? · Peter Berggren · Jul 6, 2023, 5:32 PM · 18 points · 6 comments · 2 min read · LW link
Layering and Technical Debt in the Global Wayfinding Model · herschel · Jul 6, 2023, 5:30 PM · 14 points · 0 comments · 3 min read · LW link
Localizing goal misgeneralization in a maze-solving policy network · Jan Betley · Jul 6, 2023, 4:21 PM · 37 points · 2 comments · 7 min read · LW link
Jesse Hoogland on Developmental Interpretability and Singular Learning Theory · Michaël Trazzi · Jul 6, 2023, 3:46 PM · 42 points · 2 comments · 4 min read · LW link · (theinsideview.ai)
Progress links and tweets, 2023-07-06: Terraformer Mark One, Israeli water management, & more · jasoncrawford · Jul 6, 2023, 3:35 PM · 18 points · 4 comments · 2 min read · LW link · (rootsofprogress.org)
Towards Non-Panopticon AI Alignment · Logan Zoellner · Jul 6, 2023, 3:29 PM · 7 points · 0 comments · 3 min read · LW link
A Defense of Work on Mathematical AI Safety · Davidmanheim · Jul 6, 2023, 2:15 PM · 28 points · 13 comments · 3 min read · LW link · (forum.effectivealtruism.org)
Understanding the two most common mental health problems in the world · spencerg · Jul 6, 2023, 2:06 PM · 19 points · 0 comments · LW link
Announcing the EA Archive · Aaron Bergman · Jul 6, 2023, 1:49 PM · 13 points · 2 comments · LW link
Agency begets agency · Richard_Ngo · Jul 6, 2023, 1:08 PM · 60 points · 1 comment · 4 min read · LW link
AI #19: Hofstadter, Sutskever, Leike · Zvi · Jul 6, 2023, 12:50 PM · 60 points · 16 comments · 40 min read · LW link · (thezvi.wordpress.com)
Do you feel that AGI Alignment could be achieved in a Type 0 civilization? · Super AGI · Jul 6, 2023, 4:52 AM · −2 points · 1 comment · 1 min read · LW link
Open Thread—July 2023 · Ruby · Jul 6, 2023, 4:50 AM · 11 points · 35 comments · 1 min read · LW link