
aogara

Karma: 1,421

Research Engineering Intern at the Center for AI Safety. Helping to write the AI Safety Newsletter. Studying CS and Economics at the University of Southern California, and running an AI safety club there. Previously worked at AI Impacts and with Lionel Levine and Collin Burns on calibration for Detecting Latent Knowledge Without Supervision.

Analysis: US restricts GPU sales to China

aogara · 7 Oct 2022 18:38 UTC
102 points
58 comments · 5 min read · LW link

Argument against 20% GDP growth from AI within 10 years [Linkpost]

aogara · 12 Sep 2022 4:08 UTC
59 points
21 comments · 5 min read · LW link
(twitter.com)

Key Papers in Language Model Safety

aogara · 20 Jun 2022 15:00 UTC
39 points
1 comment · 22 min read · LW link

AISN #25: White House Executive Order on AI, UK AI Safety Summit, and Progress on Voluntary Evaluations of AI Risks

31 Oct 2023 19:34 UTC
35 points
1 comment · 6 min read · LW link
(newsletter.safe.ai)

AISN #28: Center for AI Safety 2023 Year in Review

23 Dec 2023 21:31 UTC
30 points
1 comment · 5 min read · LW link
(newsletter.safe.ai)

Adversarial Robustness Could Help Prevent Catastrophic Misuse

aogara · 11 Dec 2023 19:12 UTC
30 points
18 comments · 9 min read · LW link

Full Automation is Unlikely and Unnecessary for Explosive Growth

aogara · 31 May 2023 21:55 UTC
28 points
3 comments · 5 min read · LW link

AISN #30: Investments in Compute and Military AI; Plus, Japan and Singapore's National AI Safety Institutes

24 Jan 2024 19:38 UTC
27 points
1 comment · 6 min read · LW link
(newsletter.safe.ai)

Emergent Abilities of Large Language Models [Linkpost]

aogara · 10 Aug 2022 18:02 UTC
25 points
2 comments · 1 min read · LW link
(arxiv.org)

AISN #19: US-China Competition on AI Chips, Measuring Language Agent Developments, Economic Analysis of Language Model Propaganda, and White House AI Cyber Challenge

15 Aug 2023 16:10 UTC
21 points
0 comments · 5 min read · LW link
(newsletter.safe.ai)

Model-driven feedback could amplify alignment failures

aogara · 30 Jan 2023 0:00 UTC
21 points
1 comment · 2 min read · LW link

Git Re-Basin: Merging Models modulo Permutation Symmetries [Linkpost]

aogara · 14 Sep 2022 8:55 UTC
21 points
0 comments · 2 min read · LW link
(arxiv.org)

AISN #22: The Landscape of US AI Legislation - Hearings, Frameworks, Bills, and Laws

19 Sep 2023 14:44 UTC
20 points
0 comments · 5 min read · LW link
(newsletter.safe.ai)

Yudkowsky Contra Christiano on AI Takeoff Speeds [Linkpost]

aogara · 5 Apr 2022 2:09 UTC
18 points
0 comments · 11 min read · LW link

MLSN #10: Adversarial Attacks Against Language and Vision Models, Improving LLM Honesty, and Tracing the Influence of LLM Training Data

13 Sep 2023 18:03 UTC
15 points
1 comment · 5 min read · LW link
(newsletter.mlsafety.org)

AISN #21: Google DeepMind’s GPT-4 Competitor, Military Investments in Autonomous Drones, The UK AI Safety Summit, and Case Studies in AI Policy

5 Sep 2023 15:03 UTC
15 points
0 comments · 5 min read · LW link
(newsletter.safe.ai)

AISN #23: New OpenAI Models, News from Anthropic, and Representation Engineering

4 Oct 2023 17:37 UTC
15 points
2 comments · 5 min read · LW link
(newsletter.safe.ai)

Benchmarking LLM Agents on Kaggle Competitions

aogara · 22 Mar 2024 13:09 UTC
15 points
1 comment · 5 min read · LW link

AISN #24: Kissinger Urges US-China Cooperation on AI, China’s New AI Law, US Export Controls, International Institutions, and Open Source AI

18 Oct 2023 17:06 UTC
14 points
0 comments · 6 min read · LW link
(newsletter.safe.ai)

Hoodwinked: Evaluating Deception Capabilities in Large Language Models

aogara · 25 Aug 2023 19:39 UTC
14 points
3 comments · 3 min read · LW link