loops
Karma: 337
I’m Smitty; I also go by loops here. Most of my posts are on my website:
https://iter.ca
NLA explanations can be shortened without harming reconstruction
loops, 22 Jun 2026 0:57 UTC, 46 points, 4 comments, 3 min read
Some observations about NLA explanations
loops, 15 May 2026 2:15 UTC, 21 points, 0 comments, 3 min read
Latent reasoning models might be a good thing?
loops, 28 Apr 2026 6:46 UTC, 17 points, 2 comments, 3 min read
Why I’m excited about meta-models for interpretability
loops, 12 Apr 2026 4:30 UTC, 12 points, 0 comments, 4 min read
Why was cybersecurity automated before AI R&D?
loops, 8 Apr 2026 1:08 UTC, 23 points, 1 comment, 3 min read
Positive sum doesn’t mean “win-win”
loops, 5 Apr 2026 2:33 UTC, 50 points, 5 comments, 2 min read
What secret goals does Claude think it has?
loops, 25 Feb 2026 19:22 UTC, 94 points, 11 comments, 4 min read
Jailbreaking language models with user roleplay
loops, 28 Sep 2024 23:43 UTC, 9 points, 0 comments, 3 min read
(iter.ca)