
NickGabs

Karma: 373

Steering Llama-2 with contrastive activation additions

2 Jan 2024 0:47 UTC
118 points
29 comments, 8 min read, LW link
(arxiv.org)

Science of Deep Learning more tractably addresses the Sharp Left Turn than Agent Foundations

NickGabs, 19 Sep 2023 22:06 UTC
22 points
2 comments, 6 min read, LW link

An upcoming US Supreme Court case may impede AI governance efforts

NickGabs, 16 Jul 2023 23:51 UTC
57 points
17 comments, 2 min read, LW link

Empirical Evidence Against “The Longest Training Run”

NickGabs, 6 Jul 2023 18:32 UTC
24 points
0 comments, 14 min read, LW link

Proposal: labs should precommit to pausing if an AI argues for itself to be improved

NickGabs, 2 Jun 2023 22:31 UTC
3 points
3 comments, 4 min read, LW link

AI Doom Is Not (Only) Disjunctive

NickGabs, 30 Mar 2023 1:42 UTC
12 points
0 comments, 5 min read, LW link

We Need Holistic AI Macrostrategy

NickGabs, 15 Jan 2023 2:13 UTC
39 points
4 comments, 8 min read, LW link

Takeoff speeds, the chimps analogy, and the Cultural Intelligence Hypothesis

NickGabs, 2 Dec 2022 19:14 UTC
16 points
2 comments, 4 min read, LW link

Miscellaneous First-Pass Alignment Thoughts

NickGabs, 21 Nov 2022 21:23 UTC
12 points
4 comments, 10 min read, LW link

Distillation of “How Likely Is Deceptive Alignment?”

NickGabs, 18 Nov 2022 16:31 UTC
24 points
4 comments, 10 min read, LW link