The origins of the steam engine: An essay with interactive animated diagrams (rootsofprogress.org)
jasoncrawford, 29 Nov 2023 18:30 UTC, 11 points, 0 comments, 1 min read

ChatGPT 4 solved all the gotcha problems I posed that tripped ChatGPT 3.5
VipulNaik, 29 Nov 2023 18:11 UTC, 19 points, 1 comment, 14 min read

“Clean” vs. “messy” goal-directedness (Section 2.2.3 of “Scheming AIs”)
Joe Carlsmith, 29 Nov 2023 16:32 UTC, 10 points, 0 comments, 11 min read

[Question] Thoughts on teletransportation with copies?
titotal, 29 Nov 2023 12:56 UTC, 11 points, 10 comments, 1 min read

Intro to Superposition & Sparse Autoencoders (Colab exercises)
CallumMcDougall, 29 Nov 2023 12:56 UTC, 27 points, 0 comments, 2 min read

The 101 Space You Will Always Have With You
Screwtape, 29 Nov 2023 4:56 UTC, 85 points, 7 comments, 6 min read

Trust your intuition—Kahneman’s book misses the forest for the trees
mnvr, 29 Nov 2023 4:37 UTC, −1 points, 2 comments, 2 min read

Deception Chess: Game #2
Zane, 29 Nov 2023 2:43 UTC, 23 points, 12 comments, 2 min read

Black Box Biology
GeneSmith, 29 Nov 2023 2:27 UTC, 43 points, 10 comments, 2 min read

[Question] What would be the shelf life of nuclear weapon-secrecy if nuclear weapons had not immediately been used in combat?
Gram Stone, 29 Nov 2023 0:53 UTC, 7 points, 1 comment, 1 min read

Scaling laws for dominant assurance contracts (unstableontology.com)
jessicata, 28 Nov 2023 23:11 UTC, 23 points, 0 comments, 6 min read

I’m confused about innate smell neuroanatomy
Steven Byrnes, 28 Nov 2023 20:49 UTC, 33 points, 0 comments, 7 min read

How to Control an LLM’s Behavior (why my P(DOOM) went down)
RogerDearnaley, 28 Nov 2023 19:56 UTC, 46 points, 21 comments, 10 min read

Update #2 to “Dominant Assurance Contract Platform”: EnsureDone
moyamo, 28 Nov 2023 18:02 UTC, 33 points, 2 comments, 1 min read

Ethicophysics II: Politics is the Mind-Savior (bittertruths.substack.com)
MadHatter, 28 Nov 2023 16:27 UTC, −23 points, 4 comments, 4 min read

Agentic Growth (logankieller.substack.com)
Logan Kieller, 28 Nov 2023 15:45 UTC, 8 points, 0 comments, 3 min read

AISC project: How promising is automating alignment research? (literature review) (docs.google.com)
Bogdan Ionut Cirstea, 28 Nov 2023 14:47 UTC, 4 points, 1 comment, 1 min read

A day in the life of a mechanistic interpretability researcher
Bill Benzon, 28 Nov 2023 14:45 UTC, 3 points, 3 comments, 1 min read

Two sources of beyond-episode goals (Section 2.2.2 of “Scheming AIs”)
Joe Carlsmith, 28 Nov 2023 13:49 UTC, 10 points, 0 comments, 15 min read

Self-Referential Probabilistic Logic Admits the Payor’s Lemma
Yudhister Kumar, 28 Nov 2023 10:27 UTC, 62 points, 7 comments, 4 min read