Arch-anarchy:Theory and practice
Peter lawless
Apr 30, 2024, 11:20 PM
−6
points
0
comments
2
min read
LW
link
“Open Source AI” is a lie, but it doesn’t have to be
jacobhaimes
Apr 30, 2024, 11:10 PM
19
points
5
comments
6
min read
LW
link
(jacob-haimes.github.io)
Questions for labs
Zach Stein-Perlman
Apr 30, 2024, 10:15 PM
77
points
11
comments
8
min read
LW
link
Reality comprehensibility: are there illogical things in reality?
DDthinker
Apr 30, 2024, 9:30 PM
−3
points
0
comments
10
min read
LW
link
Mechanistically Eliciting Latent Behaviors in Language Models
Andrew Mack
and
TurnTrout
Apr 30, 2024, 6:51 PM
210
points
43
comments
45
min read
LW
link
[Question]
What is the easiest/funnest way to build up a comprehensive understanding of AI and AI Safety?
Jordan Arel
Apr 30, 2024, 6:41 PM
4
points
2
comments
1
min read
LW
link
Transcoders enable fine-grained interpretable circuit analysis for language models
Jacob Dunefsky
,
Philippe Chlenski
and
Neel Nanda
Apr 30, 2024, 5:58 PM
74
points
14
comments
17
min read
LW
link
Announcing the 2024 Roots of Progress Blog-Building Intensive
jasoncrawford
Apr 30, 2024, 5:37 PM
14
points
0
comments
2
min read
LW
link
(rootsofprogress.org)
The Intentional Stance, LLMs Edition
Eleni Angelou
Apr 30, 2024, 5:12 PM
30
points
3
comments
8
min read
LW
link
Introducing AI Lab Watch
Zach Stein-Perlman
Apr 30, 2024, 5:00 PM
225
points
30
comments
1
min read
LW
link
(ailabwatch.org)
Why I’m doing PauseAI
Joseph Miller
Apr 30, 2024, 4:21 PM
108
points
16
comments
4
min read
LW
link
LLMs could be as conscious as human emulations, potentially
Canaletto
Apr 30, 2024, 11:36 AM
15
points
15
comments
3
min read
LW
link
An interesting mathematical model of how LLMs work
Bill Benzon
Apr 30, 2024, 11:01 AM
5
points
0
comments
1
min read
LW
link
Towards Multimodal Interpretability: Learning Sparse Interpretable Features in Vision Transformers
hugofry
Apr 29, 2024, 8:57 PM
94
points
8
comments
11
min read
LW
link
Towards a formalization of the agent structure problem
Alex_Altair
Apr 29, 2024, 8:28 PM
55
points
6
comments
14
min read
LW
link
Ironing Out the Squiggles
Zack_M_Davis
Apr 29, 2024, 4:13 PM
157
points
36
comments
11
min read
LW
link
Super additivity of consciousness
Arturo Macias
Apr 29, 2024, 3:41 PM
−2
points
13
comments
2
min read
LW
link
AISC9 has ended and there will be an AISC10
Linda Linsefors
Apr 29, 2024, 10:53 AM
75
points
4
comments
2
min read
LW
link
Open-Source AI: A Regulatory Review
Elliot Mckernon
and
Deric Cheng
Apr 29, 2024, 10:10 AM
18
points
0
comments
8
min read
LW
link
Big-endian is better than little-endian
Menotim
Apr 29, 2024, 2:30 AM
29
points
17
comments
3
min read
LW
link
The Prop-room and Stage Cognitive Architecture
Robert Kralisch
Apr 29, 2024, 12:48 AM
14
points
4
comments
14
min read
LW
link
How are Simulators and Agents related?
Robert Kralisch
Apr 29, 2024, 12:22 AM
6
points
0
comments
7
min read
LW
link
Extended Embodiment
Robert Kralisch
Apr 29, 2024, 12:18 AM
8
points
1
comment
3
min read
LW
link
Referential Containment
Robert Kralisch
Apr 29, 2024, 12:16 AM
2
points
4
comments
3
min read
LW
link
Disentangling Competence and Intelligence
Robert Kralisch
Apr 29, 2024, 12:12 AM
23
points
7
comments
6
min read
LW
link
List your AI X-Risk cruxes!
Aryeh Englander
Apr 28, 2024, 6:26 PM
42
points
7
comments
2
min read
LW
link
Things I tell myself to be more agentic
DMMF
Apr 28, 2024, 5:44 PM
9
points
0
comments
3
min read
LW
link
(danfrank.ca)
Estimating the Number of Players from Game Result Percentages
Daniel L
Apr 28, 2024, 5:42 PM
1
point
2
comments
1
min read
LW
link
The Science Algorithm—AISC 2024 Final Presentation
Johannes C. Mayer
Apr 28, 2024, 2:55 PM
4
points
0
comments
1
min read
LW
link
(www.youtube.com)
[Aspiration-based designs] Outlook: dealing with complexity
Jobst Heitzig
,
jossoliver
,
thomasfinn
and
Simon Dima
Apr 28, 2024, 1:06 PM
13
points
3
comments
2
min read
LW
link
[Aspiration-based designs] 3. Performance and safety criteria, and aspiration intervals
Jobst Heitzig
Apr 28, 2024, 1:04 PM
10
points
0
comments
12
min read
LW
link
[Aspiration-based designs] 2. Formal framework, basic algorithm
Jobst Heitzig
,
Simon Dima
and
Simon Fischer
Apr 28, 2024, 1:02 PM
18
points
2
comments
16
min read
LW
link
[Aspiration-based designs] 1. Informal introduction
B Jacobs
,
Jobst Heitzig
,
Simon Fischer
and
Simon Dima
Apr 28, 2024, 1:00 PM
44
points
4
comments
8
min read
LW
link
Playing Northboro with Lily and Rick
jefftk
Apr 28, 2024, 2:40 AM
10
points
1
comment
2
min read
LW
link
(www.jefftk.com)
Release of UN’s draft related to the governance of AI (a summary of the Simon Institute’s response)
Sebastian Schmidt
Apr 27, 2024, 6:34 PM
7
points
0
comments
1
min read
LW
link
(forum.effectivealtruism.org)
Mercy to the Machine: Thoughts & Rights
False Name
Apr 27, 2024, 4:36 PM
7
points
6
comments
17
min read
LW
link
Constructability: Plainly-coded AGIs may be feasible in the near future
Épiphanie Gédéon
and
Charbel-Raphaël
Apr 27, 2024, 4:04 PM
91
points
13
comments
13
min read
LW
link
So What’s Up With PUFAs Chemically?
J Bostock
Apr 27, 2024, 1:32 PM
57
points
23
comments
6
min read
LW
link
Link: Let’s Think Dot by Dot: Hidden Computation in Transformer Language Models by Jacob Pfau, William Merrill & Samuel R. Bowman
Chris_Leong
Apr 27, 2024, 1:22 PM
12
points
0
comments
1
min read
LW
link
(twitter.com)
Two Vernor Vinge Book Reviews
Maxwell Tabarrok
Apr 27, 2024, 12:14 PM
17
points
0
comments
2
min read
LW
link
(www.maximum-progress.com)
Refusal in LLMs is mediated by a single direction
Andy Arditi
,
Oscar Obeso
,
Aaquib111
,
wesg
and
Neel Nanda
Apr 27, 2024, 11:13 AM
247
points
95
comments
10
min read
LW
link
[Question]
Plausibility of Getting Early Warning Shots because AIs can’t coordinate?
hmys
Apr 27, 2024, 8:02 AM
12
points
0
comments
1
min read
LW
link
AI Safety Sphere
Myles H
Apr 27, 2024, 1:49 AM
6
points
2
comments
2
min read
LW
link
Exploring the Esoteric Pathways to AI Sentience (Part One)
jeffreycaruso
Apr 27, 2024, 1:02 AM
−11
points
6
comments
2
min read
LW
link
Superposition is not “just” neuron polysemanticity
LawrenceC
Apr 26, 2024, 11:22 PM
66
points
4
comments
13
min read
LW
link
D&D.Sci Long War: Defender of Data-mocracy
aphyer
Apr 26, 2024, 10:30 PM
44
points
20
comments
4
min read
LW
link
On Not Pulling The Ladder Up Behind You
Screwtape
Apr 26, 2024, 9:58 PM
189
points
21
comments
9
min read
LW
link
We are headed into an extreme compute overhang
devrandom
Apr 26, 2024, 9:38 PM
54
points
34
comments
2
min read
LW
link
[Concept Dependency] Edge Regular Lattice Graph
Johannes C. Mayer
Apr 26, 2024, 9:14 PM
9
points
1
comment
1
min read
LW
link
[Concept Dependency] Concept Dependency Posts
Johannes C. Mayer
Apr 26, 2024, 8:57 PM
10
points
3
comments
2
min read
LW
link