
LawrenceC (Lawrence Chan)

Karma: 4,799

I do AI Alignment research. Currently independent, but previously at: METR, Redwood, UC Berkeley, Good Judgment Project.

I’m also a part-time fund manager for the LTFF.

Obligatory research billboard website: https://chanlawrence.me/

What I would do if I wasn’t at ARC Evals

LawrenceC · 5 Sep 2023 19:19 UTC
212 points
8 comments · 13 min read · LW link

Natural Abstractions: Key claims, Theorems, and Critiques

16 Mar 2023 16:37 UTC
206 points
20 comments · 45 min read · LW link

Causal Scrubbing: a method for rigorously testing interpretability hypotheses [Redwood Research]

3 Dec 2022 0:58 UTC
195 points
35 comments · 20 min read · LW link · 1 review

Sam Altman fired from OpenAI

LawrenceC · 17 Nov 2023 20:42 UTC
192 points
75 comments · 1 min read · LW link
(openai.com)

Shard Theory in Nine Theses: a Distillation and Critical Appraisal

LawrenceC · 19 Dec 2022 22:52 UTC
138 points
30 comments · 18 min read · LW link

Evaluations (of new AI Safety researchers) can be noisy

LawrenceC · 5 Feb 2023 4:15 UTC
130 points
10 comments · 16 min read · LW link

Anthropic release Claude 3, claims >GPT-4 Performance

LawrenceC · 4 Mar 2024 18:23 UTC
114 points
40 comments · 2 min read · LW link
(www.anthropic.com)

Touch reality as soon as possible (when doing machine learning research)

LawrenceC · 3 Jan 2023 19:11 UTC
107 points
7 comments · 8 min read · LW link

Sam Altman: “Planning for AGI and beyond”

LawrenceC · 24 Feb 2023 20:28 UTC
104 points
54 comments · 6 min read · LW link
(openai.com)

Paper: Discovering novel algorithms with AlphaTensor [Deepmind]

LawrenceC · 5 Oct 2022 16:20 UTC
82 points
18 comments · 1 min read · LW link
(www.deepmind.com)

OpenAI/Microsoft announce “next generation language model” integrated into Bing/Edge

LawrenceC · 7 Feb 2023 20:38 UTC
79 points
4 comments · 1 min read · LW link
(blogs.microsoft.com)

Paper: Constitutional AI: Harmlessness from AI Feedback (Anthropic)

LawrenceC · 16 Dec 2022 22:12 UTC
68 points
11 comments · 1 min read · LW link
(www.anthropic.com)

Paper: The Capacity for Moral Self-Correction in Large Language Models (Anthropic)

LawrenceC · 16 Feb 2023 19:47 UTC
65 points
9 comments · 1 min read · LW link
(arxiv.org)

Open Phil releases RFPs on LLM Benchmarks and Forecasting

LawrenceC · 11 Nov 2023 3:01 UTC
53 points
0 comments · 2 min read · LW link
(www.openphilanthropy.org)

Paper: Superposition, Memorization, and Double Descent (Anthropic)

LawrenceC · 5 Jan 2023 17:54 UTC
53 points
11 comments · 1 min read · LW link
(transformer-circuits.pub)

Superposition is not “just” neuron polysemanticity

LawrenceC · 26 Apr 2024 23:22 UTC
50 points
4 comments · 13 min read · LW link

Meta announces Llama 2; “open sources” it for commercial use

LawrenceC · 18 Jul 2023 19:28 UTC
46 points
12 comments · 1 min read · LW link
(about.fb.com)

[Appendix] Natural Abstractions: Key Claims, Theorems, and Critiques

16 Mar 2023 16:38 UTC
46 points
0 comments · 13 min read · LW link

Meta “open sources” LMs competitive with Chinchilla, PaLM, and code-davinci-002 (Paper)

LawrenceC · 24 Feb 2023 19:57 UTC
38 points
19 comments · 1 min read · LW link
(research.facebook.com)

Causal scrubbing: results on induction heads

3 Dec 2022 0:59 UTC
34 points
1 comment · 17 min read · LW link