Superposition is not “just” neuron polysemanticity
LawrenceC
26 Apr 2024 23:22 UTC
32
points
0
comments
13
min read
LW
link
D&D.Sci Long War: Defender of Data-mocracy
aphyer
26 Apr 2024 22:30 UTC
33
points
0
comments
3
min read
LW
link
On Not Pulling The Ladder Up Behind You
Screwtape
26 Apr 2024 21:58 UTC
44
points
2
comments
9
min read
LW
link
We are headed into an extreme compute overhang
devrandom
26 Apr 2024 21:38 UTC
22
points
5
comments
2
min read
LW
link
[Concept Dependency] Edge Regular Lattice Graph
Johannes C. Mayer
26 Apr 2024 21:14 UTC
5
points
0
comments
1
min read
LW
link
[Concept Dependency] Concept Dependency Posts
Johannes C. Mayer
26 Apr 2024 20:57 UTC
8
points
2
comments
2
min read
LW
link
[Question] Wouldn’t weak AI agents provide warning?
Mandatory Topic
26 Apr 2024 19:34 UTC
5
points
0
comments
1
min read
LW
link
Duct Tape security
Isaac King
26 Apr 2024 18:57 UTC
56
points
7
comments
5
min read
LW
link
Fundamental Uncertainty: Chapter 8 - When does fundamental uncertainty matter?
Gordon Seidoh Worley
26 Apr 2024 18:10 UTC
8
points
2
comments
32
min read
LW
link
Scaling of AI training runs will slow down after GPT-5
Maxime Riché
26 Apr 2024 16:05 UTC
32
points
5
comments
3
min read
LW
link
Spatial attention as a “tell” for empathetic simulation?
Steven Byrnes
26 Apr 2024 15:10 UTC
50
points
7
comments
8
min read
LW
link
Arch-anarchy
Peter lawless
26 Apr 2024 15:05 UTC
−1
points
1
comment
25
min read
LW
link
An Introduction to AI Sandbagging
Teun van der Weij, Felix Hofstätter and Francis Rhys Ward
26 Apr 2024 13:40 UTC
26
points
0
comments
8
min read
LW
link
LLMs seem (relatively) safe
JustisMills
25 Apr 2024 22:13 UTC
41
points
12
comments
7
min read
LW
link
(justismills.substack.com)
Losing Faith In Contrarianism
omnizoid
25 Apr 2024 20:53 UTC
38
points
26
comments
5
min read
LW
link
Why I stopped being into basin broadness
tailcalled
25 Apr 2024 20:47 UTC
14
points
1
comment
2
min read
LW
link
AXRP Episode 29 - Science of Deep Learning with Vikrant Varma
DanielFilan
25 Apr 2024 19:10 UTC
18
points
1
comment
63
min read
LW
link
Improving Dictionary Learning with Gated Sparse Autoencoders
Neel Nanda, Senthooran Rajamanoharan, Arthur Conmy, lsgos, Tom Lieberum, Vikrant Varma, János Kramár and Rohin Shah
25 Apr 2024 18:43 UTC
60
points
23
comments
1
min read
LW
link
(arxiv.org)
“Why I Write” by George Orwell (1946)
Arjun Panickssery
25 Apr 2024 16:02 UTC
52
points
3
comments
9
min read
LW
link
(www.orwellfoundation.com)
Knowledge Base 8: The truth as an attractor in the information space
iwis
25 Apr 2024 15:28 UTC
−10
points
0
comments
2
min read
LW
link