aogara

Karma: 1,421

Research Engineering Intern at the Center for AI Safety. Helping to write the AI Safety Newsletter. Studying CS and Economics at the University of Southern California, and running an AI safety club there. Previously worked at AI Impacts and with Lionel Levine and Collin Burns on calibration for Detecting Latent Knowledge Without Supervision.

[Link] Did AlphaStar just click faster?

aogara · 28 Jan 2019 20:23 UTC
4 points
14 comments · 1 min read · LW link

Yudkowsky Contra Christiano on AI Takeoff Speeds [Linkpost]

aogara · 5 Apr 2022 2:09 UTC
18 points
0 comments · 11 min read · LW link

Key Papers in Language Model Safety

aogara · 20 Jun 2022 15:00 UTC
39 points
1 comment · 22 min read · LW link

Emergent Abilities of Large Language Models [Linkpost]

aogara · 10 Aug 2022 18:02 UTC
25 points
2 comments · 1 min read · LW link
(arxiv.org)

ML Model Attribution Challenge [Linkpost]

aogara · 30 Aug 2022 19:34 UTC
11 points
0 comments · 1 min read · LW link
(mlmac.io)

Argument against 20% GDP growth from AI within 10 years [Linkpost]

aogara · 12 Sep 2022 4:08 UTC
59 points
21 comments · 5 min read · LW link
(twitter.com)

Git Re-Basin: Merging Models modulo Permutation Symmetries [Linkpost]

aogara · 14 Sep 2022 8:55 UTC
21 points
0 comments · 2 min read · LW link
(arxiv.org)

Analysis: US restricts GPU sales to China

aogara · 7 Oct 2022 18:38 UTC
102 points
58 comments · 5 min read · LW link

Model-driven feedback could amplify alignment failures

aogara · 30 Jan 2023 0:00 UTC
21 points
1 comment · 2 min read · LW link

Full Automation is Unlikely and Unnecessary for Explosive Growth

aogara · 31 May 2023 21:55 UTC
28 points
3 comments · 5 min read · LW link

Learning Transformer Programs [Linkpost]

aogara · 8 Jun 2023 0:16 UTC
7 points
0 comments · 1 min read · LW link
(arxiv.org)

AISN #16: White House Secures Voluntary Commitments from Leading AI Labs and Lessons from Oppenheimer

1 Aug 2023 15:39 UTC
3 points
0 comments · 6 min read · LW link
(newsletter.safe.ai)

AISN #17: Automatically Circumventing LLM Guardrails, the Frontier Model Forum, and Senate Hearing on AI Oversight

1 Aug 2023 15:40 UTC
8 points
0 comments · 8 min read · LW link
(newsletter.safe.ai)

AISN #18: Challenges of Reinforcement Learning from Human Feedback, Microsoft’s Security Breach, and Conceptual Research on AI Safety

aogara · 8 Aug 2023 15:52 UTC
13 points
0 comments · 1 min read · LW link
(newsletter.safe.ai)

AISN #19: US-China Competition on AI Chips, Measuring Language Agent Developments, Economic Analysis of Language Model Propaganda, and White House AI Cyber Challenge

15 Aug 2023 16:10 UTC
21 points
0 comments · 5 min read · LW link
(newsletter.safe.ai)

Hoodwinked: Evaluating Deception Capabilities in Large Language Models

aogara · 25 Aug 2023 19:39 UTC
14 points
3 comments · 3 min read · LW link

AISN #20: LLM Proliferation, AI Deception, and Continuing Drivers of AI Capabilities

29 Aug 2023 15:07 UTC
12 points
0 comments · 8 min read · LW link
(newsletter.safe.ai)

AISN #21: Google DeepMind’s GPT-4 Competitor, Military Investments in Autonomous Drones, The UK AI Safety Summit, and Case Studies in AI Policy

5 Sep 2023 15:03 UTC
15 points
0 comments · 5 min read · LW link
(newsletter.safe.ai)

MLSN #10: Adversarial Attacks Against Language and Vision Models, Improving LLM Honesty, and Tracing the Influence of LLM Training Data

13 Sep 2023 18:03 UTC
15 points
1 comment · 5 min read · LW link
(newsletter.mlsafety.org)

AISN #22: The Landscape of US AI Legislation - Hearings, Frameworks, Bills, and Laws

19 Sep 2023 14:44 UTC
20 points
0 comments · 5 min read · LW link
(newsletter.safe.ai)