NickGabs

Karma: 387

Grantmaker at Coefficient Giving

Steering Llama-2 with contrastive activation additions

Nina Panickssery, Wuschel Schulz, NickGabs, Meg, evhub and TurnTrout

2 Jan 2024 0:47 UTC

125 points

29 comments · 8 min read · LW link

(arxiv.org)

Science of Deep Learning more tractably addresses the Sharp Left Turn than Agent Foundations

NickGabs · 19 Sep 2023 22:06 UTC

22 points

2 comments · 6 min read · LW link

An upcoming US Supreme Court case may impede AI governance efforts

NickGabs · 16 Jul 2023 23:51 UTC

57 points

17 comments · 2 min read · LW link

Empirical Evidence Against “The Longest Training Run”

NickGabs · 6 Jul 2023 18:32 UTC

31 points

1 comment · 14 min read · LW link

Proposal: labs should precommit to pausing if an AI argues for itself to be improved

NickGabs · 2 Jun 2023 22:31 UTC

3 points

3 comments · 4 min read · LW link

AI Doom Is Not (Only) Disjunctive

NickGabs · 30 Mar 2023 1:42 UTC

12 points

0 comments · 5 min read · LW link

We Need Holistic AI Macrostrategy

NickGabs · 15 Jan 2023 2:13 UTC

39 points

4 comments · 8 min read · LW link

Takeoff speeds, the chimps analogy, and the Cultural Intelligence Hypothesis

NickGabs · 2 Dec 2022 19:14 UTC

16 points

2 comments · 4 min read · LW link

Miscellaneous First-Pass Alignment Thoughts

NickGabs · 21 Nov 2022 21:23 UTC

12 points

4 comments · 10 min read · LW link

Distillation of “How Likely Is Deceptive Alignment?”

NickGabs · 18 Nov 2022 16:31 UTC

24 points

4 comments · 10 min read · LW link