aog · Karma: 1,613
AI grantmaker at Longview Philanthropy and AI DPhil student at Oxford
Digital sentience funding opportunities: Support for applied work and research
aog and zdgroff · 29 May 2025 15:22 UTC · 21 points · 0 comments · 4 min read
Research Priorities for Hardware-Enabled Mechanisms (HEMs)
aog · 30 Apr 2025 17:43 UTC · 17 points · 2 comments · 15 min read · (www.longview.org)
aog’s Shortform
aog · 19 Apr 2025 22:07 UTC · 6 points · 21 comments · 1 min read
Benchmarking LLM Agents on Kaggle Competitions
aog · 22 Mar 2024 13:09 UTC · 15 points · 4 comments · 5 min read
Adversarial Robustness Could Help Prevent Catastrophic Misuse
aog · 11 Dec 2023 19:12 UTC · 30 points · 18 comments · 9 min read
Unsupervised Methods for Concept Discovery in AlphaZero
aog · 26 Oct 2023 19:05 UTC · 9 points · 0 comments · 1 min read · (arxiv.org)
MLSN: #10 Adversarial Attacks Against Language and Vision Models, Improving LLM Honesty, and Tracing the Influence of LLM Training Data
aog and Dan H · 13 Sep 2023 18:03 UTC · 15 points · 1 comment · 5 min read · (newsletter.mlsafety.org)
Hoodwinked: Evaluating Deception Capabilities in Large Language Models
aog · 25 Aug 2023 19:39 UTC · 25 points · 3 comments · 3 min read
Learning Transformer Programs [Linkpost]
aog · 8 Jun 2023 0:16 UTC · 7 points · 0 comments · 1 min read · (arxiv.org)
Full Automation is Unlikely and Unnecessary for Explosive Growth
aog · 31 May 2023 21:55 UTC · 28 points · 3 comments · 5 min read
Model-driven feedback could amplify alignment failures
aog · 30 Jan 2023 0:00 UTC · 21 points · 1 comment · 2 min read
Analysis: US restricts GPU sales to China
aog · 7 Oct 2022 18:38 UTC · 102 points · 58 comments · 5 min read
Git Re-Basin: Merging Models modulo Permutation Symmetries [Linkpost]
aog · 14 Sep 2022 8:55 UTC · 21 points · 0 comments · 2 min read · (arxiv.org)
Argument against 20% GDP growth from AI within 10 years [Linkpost]
aog · 12 Sep 2022 4:08 UTC · 59 points · 20 comments · 5 min read · (twitter.com)
ML Model Attribution Challenge [Linkpost]
aog · 30 Aug 2022 19:34 UTC · 11 points · 0 comments · 1 min read · (mlmac.io)
Emergent Abilities of Large Language Models [Linkpost]
aog · 10 Aug 2022 18:02 UTC · 25 points · 2 comments · 1 min read · (arxiv.org)
Key Papers in Language Model Safety
aog · 20 Jun 2022 15:00 UTC · 40 points · 1 comment · 22 min read
Yudkowsky Contra Christiano on AI Takeoff Speeds [Linkpost]
aog · 5 Apr 2022 2:09 UTC · 18 points · 0 comments · 11 min read
[Link] Did AlphaStar just click faster?
aog · 28 Jan 2019 20:23 UTC · 4 points · 14 comments · 1 min read