What’s going on with Per-Component Weight Updates? | 4gate | Aug 22, 2024, 9:22 PM | 1 point | 0 comments | 6 min read
Interoperable High Level Structures: Early Thoughts on Adjectives | johnswentworth and David Lorell | Aug 22, 2024, 9:12 PM | 49 points | 1 comment | 7 min read
Interest poll: A time-waster blocker for desktop Linux programs | nahoj | Aug 22, 2024, 8:44 PM | 4 points | 5 comments | 1 min read
Turning 22 in the Pre-Apocalypse | testingthewaters | Aug 22, 2024, 8:28 PM | 38 points | 14 comments | 24 min read | (utilityhotbar.github.io)
A Robust Natural Latent Over A Mixed Distribution Is Natural Over The Distributions Which Were Mixed | johnswentworth and David Lorell | Aug 22, 2024, 7:19 PM | 42 points | 4 comments | 4 min read
what becoming more secure did for me | Chipmonk | Aug 22, 2024, 5:44 PM | 26 points | 5 comments | 2 min read | (chrislakin.blog)
A primer on the current state of longevity research | Abhishaike Mahajan | Aug 22, 2024, 5:14 PM | 109 points | 6 comments | 14 min read | (www.owlposting.com)
Some reasons to start a project to stop harmful AI | Remmelt | Aug 22, 2024, 4:23 PM | 5 points | 0 comments | 2 min read
The economics of space tethers | harsimony | Aug 22, 2024, 4:15 PM | 67 points | 22 comments | 7 min read | (splittinginfinity.substack.com)
Dima’s Shortform | Dmitrii Krasheninnikov | Aug 22, 2024, 2:49 PM | 3 points | 0 comments | 1 min read
AI #78: Some Welcome Calm | Zvi | Aug 22, 2024, 2:20 PM | 61 points | 15 comments | 33 min read | (thezvi.wordpress.com)
[Question] How do we know dreams aren’t real? | Logan Zoellner | Aug 22, 2024, 12:41 PM | 5 points | 31 comments | 1 min read
Measuring Structure Development in Algorithmic Transformers | Micurie and Einar Urdshals | Aug 22, 2024, 8:38 AM | 56 points | 4 comments | 11 min read
Deception and Jailbreak Sequence: 1. Iterative Refinement Stages of Deception in LLMs | Winnie Yang and Jojo Yang | Aug 22, 2024, 7:32 AM | 23 points | 1 comment | 21 min read
Just because an LLM said it doesn’t mean it’s true: an illustrative example | dirk | Aug 21, 2024, 9:05 PM | 26 points | 12 comments | 3 min read
[Question] How do you finish your tasks faster? | Cipolla | Aug 21, 2024, 8:01 PM | 4 points | 2 comments | 1 min read
AI Safety Newsletter #40: California AI Legislation Plus, NVIDIA Delays Chip Production, and Do AI Safety Benchmarks Actually Measure Safety? | Corin Katzke, Julius, Alexa Pan and Dan H | Aug 21, 2024, 6:09 PM | 11 points | 0 comments | 6 min read | (newsletter.safe.ai)
[Question] Should LW suggest standard metaprompts? | Dagon | Aug 21, 2024, 4:41 PM | 3 points | 6 comments | 1 min read
Eternal Existence and Eternal Boredom: The Case for AI and Immortal Humans | Tuan Tu Nguyen | Aug 21, 2024, 9:58 AM | −12 points | 2 comments | 5 min read
Please do not use AI to write for you | Richard_Kennaway | Aug 21, 2024, 9:53 AM | 69 points | 34 comments | 4 min read
Apply to Aether—Independent LLM Agent Safety Research Group | RohanS | Aug 21, 2024, 9:47 AM | 10 points | 0 comments | 7 min read | (forum.effectivealtruism.org)
the Giga Press was a mistake | bhauth | Aug 21, 2024, 4:51 AM | 99 points | 26 comments | 5 min read | (bhauth.com)
Exploring the Boundaries of Cognitohazards and the Nature of Reality | Victor Novikov | Aug 21, 2024, 3:42 AM | −2 points | 2 comments | 1 min read
[Question] What is the point of 2v2 debates? | Axel Ahlqvist | Aug 20, 2024, 9:59 PM | 2 points | 1 comment | 1 min read
[Question] Where should I look for information on gut health? | FinalFormal2 | Aug 20, 2024, 7:44 PM | 10 points | 10 comments | 1 min read
Would you benefit from, or object to, a page with LW users’ reacts? | Raemon | Aug 20, 2024, 4:35 PM | 23 points | 6 comments | 1 min read
Freedom of Speech | Zero Contradictions | Aug 20, 2024, 4:34 PM | −13 points | 2 comments | 2 min read | (thewaywardaxolotl.blogspot.com)
AGI Safety and Alignment at Google DeepMind: A Summary of Recent Work | Rohin Shah, Seb Farquhar and Anca Dragan | Aug 20, 2024, 4:22 PM | 222 points | 33 comments | 9 min read
Trying to be rational for the wrong reasons | Viliam | Aug 20, 2024, 4:18 PM | 26 points | 9 comments | 3 min read
[Question] How great is the utility of “saving” endangered languages? | SpectrumDT | Aug 20, 2024, 1:14 PM | 18 points | 29 comments | 1 min read
Guide to SB 1047 | Zvi | Aug 20, 2024, 1:10 PM | 71 points | 18 comments | 53 min read | (thezvi.wordpress.com)
Finding Deception in Language Models | Esben Kran and Archana Vaidheeswaran | Aug 20, 2024, 9:42 AM | 20 points | 4 comments | 4 min read
Next automated reasoning grand challenge: CompCert | sanxiyn | Aug 20, 2024, 5:27 AM | −5 points | 0 comments | 1 min read
Thiel on AI & Racing with China | Ben Pace | Aug 20, 2024, 3:19 AM | 55 points | 10 comments | 12 min read
Reflecting on the transhumanist rebuttal to AI existential risk and critique of our debate methodologies and misuse of statistics | catgirlsruletheworld | Aug 20, 2024, 1:59 AM | −5 points | 0 comments | 4 min read
Artificial Intelligence and Eternal Torture and Suffering | Tuan Tu Nguyen | Aug 20, 2024, 1:53 AM | −1 points | 0 comments | 4 min read
AI #77: A Few Upgrades | Zvi | Aug 20, 2024, 12:20 AM | 23 points | 3 comments | 52 min read | (thezvi.wordpress.com)
Monthly Roundup #21: August 2024 | Zvi | Aug 20, 2024, 12:20 AM | 22 points | 6 comments | 40 min read | (thezvi.wordpress.com)
[Linkpost] Automated Design of Agentic Systems | Bogdan Ionut Cirstea | Aug 19, 2024, 11:06 PM | 8 points | 1 comment | 1 min read | (arxiv.org)
Limitations on Formal Verification for AI Safety | Andrew Dickson | Aug 19, 2024, 11:03 PM | 134 points | 60 comments | 23 min read
The Conscious River: Conscious Turing machines negate materialism | blallo | Aug 19, 2024, 9:54 PM | 0 points | 4 comments | 7 min read
LLM Applications I Want To See | sarahconstantin | Aug 19, 2024, 9:10 PM | 102 points | 6 comments | 8 min read | (sarahconstantin.substack.com)
Defining alignment research | Richard_Ngo | Aug 19, 2024, 8:42 PM | 92 points | 23 comments | 7 min read
Vilnius – ACX Meetups Everywhere Fall 2024 | NoUsernameSelected and Mnephisto | Aug 19, 2024, 5:38 PM | 3 points | 1 comment | 1 min read
Can Current LLMs be Trusted To Produce Paperclips Safely? | Rohit Chatterjee | Aug 19, 2024, 5:17 PM | 4 points | 0 comments | 9 min read
A primer on why computational predictive toxicology is hard | Abhishaike Mahajan | Aug 19, 2024, 5:16 PM | 63 points | 2 comments | 12 min read | (www.owlposting.com)
Introduction and Exploration of AI Ethics Through a Global Lens | ThePathYouWillChoose | Aug 19, 2024, 5:11 PM | 1 point | 0 comments | 1 min read
Trustworthy and untrustworthy models | Olli Järviniemi | Aug 19, 2024, 4:27 PM | 47 points | 3 comments | 8 min read
Apartment Price Map Discontinuity | jefftk | Aug 19, 2024, 3:30 PM | 12 points | 0 comments | 1 min read | (www.jefftk.com)
Will we ever run out of new jobs? | Kevin Kohler | Aug 19, 2024, 3:04 PM | 17 points | 7 comments | 7 min read | (machinocene.substack.com)