Some thoughts on George Hotz vs Eliezer Yudkowsky
TristanTrim
Aug 15, 2023, 11:33 PM
10
points
3
comments
2
min read
LW
link
Understanding the Information Flow inside Large Language Models
Felix Hofstätter
and
cozyfractal
Aug 15, 2023, 9:13 PM
19
points
0
comments
17
min read
LW
link
[Question]
Any research in “probe-tuning” of LLMs?
Roman Leventov
Aug 15, 2023, 9:01 PM
20
points
3
comments
1
min read
LW
link
Can AI Transform the Electorate into a Citizen’s Assembly
RoscoHunter
Aug 15, 2023, 5:52 PM
−3
points
5
comments
3
min read
LW
link
Ten Thousand Years of Solitude
agp
Aug 15, 2023, 5:45 PM
137
points
19
comments
4
min read
LW
link
(www.discovermagazine.com)
AISN #19: US-China Competition on AI Chips, Measuring Language Agent Developments, Economic Analysis of Language Model Propaganda, and White House AI Cyber Challenge
Dan H
Aug 15, 2023, 4:10 PM
21
points
0
comments
5
min read
LW
link
(newsletter.safe.ai)
[Question]
What is the most effective anti-tyranny charity?
lc
Aug 15, 2023, 3:26 PM
20
points
10
comments
1
min read
LW
link
My checklist for publishing a blog post
Steven Byrnes
Aug 15, 2023, 3:04 PM
87
points
6
comments
3
min read
LW
link
The Dunbar Playbook: A CRM system for your friends
Severin T. Seehrich
Aug 15, 2023, 8:44 AM
32
points
16
comments
5
min read
LW
link
(amoretlicentia.substack.com)
Optical Illusions are Out of Distribution Errors
vitaliya
Aug 15, 2023, 2:23 AM
30
points
8
comments
2
min read
LW
link
A short calculation about a Twitter poll
Ege Erdil
Aug 14, 2023, 7:48 PM
64
points
64
comments
11
min read
LW
link
Decomposing independent generalizations in neural networks via Hessian analysis
Dmitry Vaintrob
and
Nina Panickssery
Aug 14, 2023, 5:04 PM
84
points
4
comments
1
min read
LW
link
Memetic Judo #2: Incorporal Switches and Levers Compendium
Max TK
Aug 14, 2023, 4:53 PM
19
points
6
comments
17
min read
LW
link
Existentially relevant thought experiment: To kill or not to kill, a sniper, a man and a button.
AlexFromSafeTransition
Aug 14, 2023, 10:53 AM
−18
points
6
comments
4
min read
LW
link
Stepping down as moderator on LW
Kaj_Sotala
Aug 14, 2023, 10:46 AM
82
points
1
comment
1
min read
LW
link
Announcing Manifest 2023 (Sep 22-24 in Berkeley)
Saul Munn
and
Austin Chen
Aug 14, 2023, 5:13 AM
31
points
0
comments
2
min read
LW
link
Coherence Therapy with LLMs—quick demo
Chris Lakin
Aug 14, 2023, 3:34 AM
19
points
11
comments
1
min read
LW
link
Listen For What You Don’t Hear: The Case for Contrarianism
Yashvardhan Sharma
Aug 14, 2023, 2:53 AM
1
point
1
comment
5
min read
LW
link
Recipe: Hessian eigenvector computation for PyTorch models
Nina Panickssery
Aug 14, 2023, 2:48 AM
32
points
5
comments
5
min read
LW
link
[Question]
Assuming LK99 or similar: how to accelerate commercialization?
ryan_b
Aug 13, 2023, 9:34 PM
7
points
5
comments
1
min read
LW
link
Twin Cities ACX Meetup September 2023
Timothy M.
Aug 13, 2023, 8:10 PM
1
point
4
comments
1
min read
LW
link
Fundamental Uncertainty: Chapter 1 - How can we know what’s true?
Gordon Seidoh Worley
Aug 13, 2023, 6:55 PM
17
points
4
comments
12
min read
LW
link
We Should Prepare for a Larger Representation of Academia in AI Safety
Leon Lang
Aug 13, 2023, 6:03 PM
90
points
14
comments
5
min read
LW
link
AGI is easier than robotaxis
Daniel Kokotajlo
Aug 13, 2023, 5:00 PM
41
points
30
comments
4
min read
LW
link
[Question]
If we’re alive in 5 years, do you think the funding situation will be much better by then? (With large amounts of government funding, for example)
kuira
Aug 13, 2023, 4:32 PM
−2
points
6
comments
1
min read
LW
link
Abstract Theories of Everything
Philosophistry
Aug 13, 2023, 6:06 AM
−17
points
0
comments
1
min read
LW
link
[Linkpost] Personal and Psychological Dimensions of AI Researchers Confronting AI Catastrophic Risks
Bogdan Ionut Cirstea
Aug 12, 2023, 10:02 PM
42
points
0
comments
1
min read
LW
link
The Empathy Engine: A Deconstruction of the Societal Metamorphosis through Technological Empathy Augmentation
bigdickproblems
Aug 12, 2023, 6:23 PM
−30
points
3
comments
2
min read
LW
link
The Benevolent Ruler’s Handbook (Part 2): Morality Rules
FCCC
Aug 12, 2023, 2:25 PM
5
points
0
comments
4
min read
LW
link
Learning as you play: anthropic shadow in deadly games
dr_s
Aug 12, 2023, 7:34 AM
37
points
28
comments
35
min read
LW
link
Biological Anchors: The Trick that Might or Might Not Work
Scott Alexander
Aug 12, 2023, 12:53 AM
91
points
3
comments
33
min read
LW
link
(astralcodexten.substack.com)
Simulate the CEO
robotelvis
Aug 12, 2023, 12:09 AM
23
points
5
comments
5
min read
LW
link
(messyprogress.substack.com)
How to decide under low-stakes uncertainty
dkl9
Aug 11, 2023, 6:07 PM
11
points
4
comments
1
min read
LW
link
(dkl9.net)
The Pandemic is Only Beginning: The Long COVID Disaster
salvatore mattera
Aug 11, 2023, 5:36 PM
−6
points
15
comments
8
min read
LW
link
When discussing AI risks, talk about capabilities, not intelligence
Vika
Aug 11, 2023, 1:38 PM
124
points
7
comments
3
min read
LW
link
(vkrakovna.wordpress.com)
What are the flaws in this AGI argument?
William the Kiwi
Aug 11, 2023, 11:31 AM
5
points
14
comments
1
min read
LW
link
Google DeepMind’s RT-2
SandXbox
Aug 11, 2023, 11:26 AM
9
points
1
comment
1
min read
LW
link
(robotics-transformer2.github.io)
Linkpost: We need another Expert Survey on Progress in AI, urgently
David Mears
Aug 11, 2023, 8:22 AM
25
points
2
comments
2
min read
LW
link
(open.substack.com)
What Does a Marginal Grant at LTFF Look Like? Funding Priorities and Grantmaking Thresholds at the Long-Term Future Fund
Linch
,
calebp99
and
Daniel_Eth
Aug 11, 2023, 3:59 AM
64
points
0
comments
1
min read
LW
link
(forum.effectivealtruism.org)
[Question]
Will posting any thread on LW guarantee that a LLM will index all my content, and if questions people ask to the LLM after my name will surface up all my LW content?
Alex K. Chen (parrot)
Aug 11, 2023, 1:40 AM
0
points
0
comments
1
min read
LW
link
AI Safety Concepts Writeup: WebGPT
JustisMills
Aug 11, 2023, 1:35 AM
9
points
1
comment
7
min read
LW
link
[Question]
What is science?
Adam Zerner
Aug 11, 2023, 12:00 AM
6
points
4
comments
1
min read
LW
link
Three configurable prettyprinters
philh
Aug 10, 2023, 11:10 PM
9
points
0
comments
22
min read
LW
link
(reasonableapproximation.net)
Ilya Sutskever’s thoughts on AI safety (July 2023): a transcript with my comments
mishka
Aug 10, 2023, 7:07 PM
21
points
3
comments
5
min read
LW
link
Seeking Input to AI Safety Book for non-technical audience
Darren McKee
Aug 10, 2023, 5:58 PM
10
points
4
comments
1
min read
LW
link
Evaluating GPT-4 Theory of Mind Capabilities
gcmac
and
Nathan
Aug 10, 2023, 5:57 PM
15
points
2
comments
14
min read
LW
link
Some alignment ideas
SelonNerias
Aug 10, 2023, 5:51 PM
1
point
0
comments
11
min read
LW
link
Self Supervised Learning (SSL)
Varshul Gupta
Aug 10, 2023, 5:43 PM
5
points
1
comment
2
min read
LW
link
(dubverseblack.substack.com)
Predicting Virus Relative Abundance in Wastewater
jefftk
Aug 10, 2023, 3:46 PM
33
points
2
comments
LW
link
(naobservatory.org)
AI #24: Week of the Podcast
Zvi
Aug 10, 2023, 3:00 PM
49
points
5
comments
44
min read
LW
link
(thezvi.wordpress.com)