aogara

Karma: 1,421

Research Engineering Intern at the Center for AI Safety. Helping to write the AI Safety Newsletter. Studying CS and Economics at the University of Southern California, and running an AI safety club there. Previously worked at AI Impacts and with Lionel Levine and Collin Burns on calibration for Detecting Latent Knowledge Without Supervision.

[Link] Did AlphaStar just click faster?

aogara · 28 Jan 2019 20:23 UTC
4 points
14 comments · 1 min read · LW link

Yudkowsky Contra Christiano on AI Takeoff Speeds [Linkpost]

aogara · 5 Apr 2022 2:09 UTC
18 points
0 comments · 11 min read · LW link

Key Papers in Language Model Safety

aogara · 20 Jun 2022 15:00 UTC
39 points
1 comment · 22 min read · LW link

Emergent Abilities of Large Language Models [Linkpost]

aogara · 10 Aug 2022 18:02 UTC
25 points
2 comments · 1 min read · LW link
(arxiv.org)

ML Model Attribution Challenge [Linkpost]

aogara · 30 Aug 2022 19:34 UTC
11 points
0 comments · 1 min read · LW link
(mlmac.io)

Argument against 20% GDP growth from AI within 10 years [Linkpost]

aogara · 12 Sep 2022 4:08 UTC
59 points
21 comments · 5 min read · LW link
(twitter.com)

Git Re-Basin: Merging Models modulo Permutation Symmetries [Linkpost]

aogara · 14 Sep 2022 8:55 UTC
21 points
0 comments · 2 min read · LW link
(arxiv.org)

Analysis: US restricts GPU sales to China

aogara · 7 Oct 2022 18:38 UTC
102 points
58 comments · 5 min read · LW link

Model-driven feedback could amplify alignment failures

aogara · 30 Jan 2023 0:00 UTC
21 points
1 comment · 2 min read · LW link

Full Automation is Unlikely and Unnecessary for Explosive Growth

aogara · 31 May 2023 21:55 UTC
28 points
3 comments · 5 min read · LW link

Learning Transformer Programs [Linkpost]

aogara · 8 Jun 2023 0:16 UTC
7 points
0 comments · 1 min read · LW link
(arxiv.org)

AISN #16: White House Secures Voluntary Commitments from Leading AI Labs and Lessons from Oppenheimer

1 Aug 2023 15:39 UTC
3 points
0 comments · 6 min read · LW link
(newsletter.safe.ai)

AISN #17: Automatically Circumventing LLM Guardrails, the Frontier Model Forum, and Senate Hearing on AI Oversight

1 Aug 2023 15:40 UTC
8 points
0 comments · 8 min read · LW link
(newsletter.safe.ai)

AISN #18: Challenges of Reinforcement Learning from Human Feedback, Microsoft’s Security Breach, and Conceptual Research on AI Safety

aogara · 8 Aug 2023 15:52 UTC
13 points
0 comments · 1 min read · LW link
(newsletter.safe.ai)

AISN #19: US-China Competition on AI Chips, Measuring Language Agent Developments, Economic Analysis of Language Model Propaganda, and White House AI Cyber Challenge

15 Aug 2023 16:10 UTC
21 points
0 comments · 5 min read · LW link
(newsletter.safe.ai)

Hoodwinked: Evaluating Deception Capabilities in Large Language Models

aogara · 25 Aug 2023 19:39 UTC
14 points
3 comments · 3 min read · LW link

AISN #20: LLM Proliferation, AI Deception, and Continuing Drivers of AI Capabilities

29 Aug 2023 15:07 UTC
12 points
0 comments · 8 min read · LW link
(newsletter.safe.ai)

AISN #21: Google DeepMind’s GPT-4 Competitor, Military Investments in Autonomous Drones, The UK AI Safety Summit, and Case Studies in AI Policy

5 Sep 2023 15:03 UTC
15 points
0 comments · 5 min read · LW link
(newsletter.safe.ai)

MLSN #10: Adversarial Attacks Against Language and Vision Models, Improving LLM Honesty, and Tracing the Influence of LLM Training Data

13 Sep 2023 18:03 UTC
15 points
1 comment · 5 min read · LW link
(newsletter.mlsafety.org)

AISN #22: The Landscape of US AI Legislation - Hearings, Frameworks, Bills, and Laws

19 Sep 2023 14:44 UTC
20 points
0 comments · 5 min read · LW link
(newsletter.safe.ai)