Ansh Radhakrishnan

Karma: 561

An Inside View of AI Alignment

Ansh Radhakrishnan

11 May 2022 2:16 UTC

32 points

2 comments, 2 min read

RLHF

Ansh Radhakrishnan

12 May 2022 21:18 UTC

18 points

5 comments, 5 min read

The Bio Anchors Forecast

Ansh Radhakrishnan

2 Jun 2022 1:32 UTC

12 points

0 comments, 3 min read

Measuring and Improving the Faithfulness of Model-Generated Reasoning

Ansh Radhakrishnan, tamera, karinanguyen, Sam Bowman and Ethan Perez

18 Jul 2023 16:36 UTC

109 points

13 comments, 6 min read

Anthropic Fall 2023 Debate Progress Update

Ansh Radhakrishnan

28 Nov 2023 5:37 UTC

74 points

9 comments, 12 min read

Scalable Oversight and Weak-to-Strong Generalization: Compatible approaches to the same problem

Ansh Radhakrishnan, Buck, ryan_greenblatt and Fabien Roger

16 Dec 2023 5:49 UTC

72 points

3 comments, 6 min read