Towards Multimodal Interpretability: Learning Sparse Interpretable Features in Vision Transformers
hugofry
Apr 29, 2024, 8:57 PM
94
points
8
comments
11
min read
LW
link
Towards a formalization of the agent structure problem
Alex_Altair
Apr 29, 2024, 8:28 PM
55
points
6
comments
14
min read
LW
link
Ironing Out the Squiggles
Zack_M_Davis
Apr 29, 2024, 4:13 PM
157
points
36
comments
11
min read
LW
link
Super additivity of consciousness
Arturo Macias
Apr 29, 2024, 3:41 PM
−2
points
13
comments
2
min read
LW
link
AISC9 has ended and there will be an AISC10
Linda Linsefors
Apr 29, 2024, 10:53 AM
75
points
4
comments
2
min read
LW
link
Open-Source AI: A Regulatory Review
Elliot Mckernon
and
Deric Cheng
Apr 29, 2024, 10:10 AM
18
points
0
comments
8
min read
LW
link
Big-endian is better than little-endian
Menotim
Apr 29, 2024, 2:30 AM
29
points
17
comments
3
min read
LW
link
The Prop-room and Stage Cognitive Architecture
Robert Kralisch
Apr 29, 2024, 12:48 AM
14
points
4
comments
14
min read
LW
link
How are Simulators and Agents related?
Robert Kralisch
Apr 29, 2024, 12:22 AM
6
points
0
comments
7
min read
LW
link
Extended Embodiment
Robert Kralisch
Apr 29, 2024, 12:18 AM
8
points
1
comment
3
min read
LW
link
Referential Containment
Robert Kralisch
Apr 29, 2024, 12:16 AM
2
points
4
comments
3
min read
LW
link
Disentangling Competence and Intelligence
Robert Kralisch
Apr 29, 2024, 12:12 AM
23
points
7
comments
6
min read
LW
link
List your AI X-Risk cruxes!
Aryeh Englander
Apr 28, 2024, 6:26 PM
42
points
7
comments
2
min read
LW
link
Things I tell myself to be more agentic
DMMF
Apr 28, 2024, 5:44 PM
9
points
0
comments
3
min read
LW
link
(danfrank.ca)
Estimating the Number of Players from Game Result Percentages
Daniel L
Apr 28, 2024, 5:42 PM
1
point
2
comments
1
min read
LW
link
The Science Algorithm—AISC 2024 Final Presentation
Johannes C. Mayer
Apr 28, 2024, 2:55 PM
4
points
0
comments
1
min read
LW
link
(www.youtube.com)
[Aspiration-based designs] Outlook: dealing with complexity
Jobst Heitzig
,
jossoliver
,
thomasfinn
and
Simon Dima
Apr 28, 2024, 1:06 PM
13
points
3
comments
2
min read
LW
link
[Aspiration-based designs] 3. Performance and safety criteria, and aspiration intervals
Jobst Heitzig
Apr 28, 2024, 1:04 PM
10
points
0
comments
12
min read
LW
link
[Aspiration-based designs] 2. Formal framework, basic algorithm
Jobst Heitzig
,
Simon Dima
and
Simon Fischer
Apr 28, 2024, 1:02 PM
18
points
2
comments
16
min read
LW
link
[Aspiration-based designs] 1. Informal introduction
B Jacobs
,
Jobst Heitzig
,
Simon Fischer
and
Simon Dima
Apr 28, 2024, 1:00 PM
44
points
4
comments
8
min read
LW
link
Playing Northboro with Lily and Rick
jefftk
Apr 28, 2024, 2:40 AM
10
points
1
comment
2
min read
LW
link
(www.jefftk.com)
Release of UN’s draft related to the governance of AI (a summary of the Simon Institute’s response)
Sebastian Schmidt
Apr 27, 2024, 6:34 PM
7
points
0
comments
1
min read
LW
link
(forum.effectivealtruism.org)
Mercy to the Machine: Thoughts & Rights
False Name
Apr 27, 2024, 4:36 PM
7
points
6
comments
17
min read
LW
link
Constructability: Plainly-coded AGIs may be feasible in the near future
Épiphanie Gédéon
and
Charbel-Raphaël
Apr 27, 2024, 4:04 PM
91
points
15
comments
13
min read
LW
link
So What’s Up With PUFAs Chemically?
J Bostock
Apr 27, 2024, 1:32 PM
57
points
23
comments
6
min read
LW
link
Link: Let’s Think Dot by Dot: Hidden Computation in Transformer Language Models by Jacob Pfau, William Merrill & Samuel R. Bowman
Chris_Leong
Apr 27, 2024, 1:22 PM
12
points
0
comments
1
min read
LW
link
(twitter.com)
Two Vernor Vinge Book Reviews
Maxwell Tabarrok
Apr 27, 2024, 12:14 PM
17
points
0
comments
2
min read
LW
link
(www.maximum-progress.com)
Refusal in LLMs is mediated by a single direction
Andy Arditi
,
Oscar Obeso
,
Aaquib111
,
wesg
and
Neel Nanda
Apr 27, 2024, 11:13 AM
247
points
95
comments
10
min read
LW
link
[Question]
Plausibility of Getting Early Warning Shots because AIs can’t coordinate?
hmys
Apr 27, 2024, 8:02 AM
12
points
0
comments
1
min read
LW
link
AI Safety Sphere
Myles H
Apr 27, 2024, 1:49 AM
6
points
2
comments
2
min read
LW
link
Exploring the Esoteric Pathways to AI Sentience (Part One)
jeffreycaruso
Apr 27, 2024, 1:02 AM
−11
points
6
comments
2
min read
LW
link
Superposition is not “just” neuron polysemanticity
LawrenceC
Apr 26, 2024, 11:22 PM
66
points
4
comments
13
min read
LW
link
D&D.Sci Long War: Defender of Data-mocracy
aphyer
Apr 26, 2024, 10:30 PM
44
points
20
comments
4
min read
LW
link
On Not Pulling The Ladder Up Behind You
Screwtape
Apr 26, 2024, 9:58 PM
189
points
21
comments
9
min read
LW
link
We are headed into an extreme compute overhang
devrandom
Apr 26, 2024, 9:38 PM
54
points
34
comments
2
min read
LW
link
[Concept Dependency] Edge Regular Lattice Graph
Johannes C. Mayer
Apr 26, 2024, 9:14 PM
9
points
1
comment
1
min read
LW
link
[Concept Dependency] Concept Dependency Posts
Johannes C. Mayer
Apr 26, 2024, 8:57 PM
10
points
3
comments
2
min read
LW
link
[Question]
Wouldn’t weak AI agents provide warning?
Mandatory Topic
Apr 26, 2024, 7:34 PM
5
points
0
comments
1
min read
LW
link
World models
A*
Apr 26, 2024, 7:11 PM
1
point
0
comments
1
min read
LW
link
Duct Tape security
Isaac King
Apr 26, 2024, 6:57 PM
69
points
11
comments
5
min read
LW
link
Fundamental Uncertainty: Chapter 8 - When does fundamental uncertainty matter?
Gordon Seidoh Worley
Apr 26, 2024, 6:10 PM
11
points
2
comments
32
min read
LW
link
Scaling of AI training runs will slow down after GPT-5
Maxime Riché
Apr 26, 2024, 4:05 PM
40
points
5
comments
3
min read
LW
link
Spatial attention as a “tell” for empathetic simulation?
Steven Byrnes
Apr 26, 2024, 3:10 PM
55
points
12
comments
8
min read
LW
link
Arch-anarchy
Peter lawless
Apr 26, 2024, 3:05 PM
−1
points
1
comment
25
min read
LW
link
Breadboarding a Whistle Synth
jefftk
Apr 26, 2024, 3:00 PM
9
points
2
comments
2
min read
LW
link
(www.jefftk.com)
An Introduction to AI Sandbagging
Teun van der Weij
,
Felix Hofstätter
and
Francis Rhys Ward
Apr 26, 2024, 1:40 PM
46
points
13
comments
8
min read
LW
link
LLMs seem (relatively) safe
JustisMills
Apr 25, 2024, 10:13 PM
53
points
24
comments
7
min read
LW
link
(justismills.substack.com)
Losing Faith In Contrarianism
Bentham's Bulldog
Apr 25, 2024, 8:53 PM
39
points
44
comments
5
min read
LW
link
Why I stopped being into basin broadness
tailcalled
Apr 25, 2024, 8:47 PM
16
points
3
comments
2
min read
LW
link
AXRP Episode 29 - Science of Deep Learning with Vikrant Varma
DanielFilan
Apr 25, 2024, 7:10 PM
20
points
1
comment
63
min read
LW
link