Zac Hatfield-Dodds

Karma: 2,066

Technical staff at Anthropic, previously #3ainstitute; interdisciplinary, interested in everything; ongoing PhD in CS (learning / testing / verification), open sourcerer, more at zhd.dev

Third-party testing as a key ingredient of AI policy

Zac Hatfield-Dodds · 25 Mar 2024 22:40 UTC
11 points
1 comment · 12 min read · LW link
(www.anthropic.com)

Dario Amodei’s prepared remarks from the UK AI Safety Summit, on Anthropic’s Responsible Scaling Policy

Zac Hatfield-Dodds · 1 Nov 2023 18:10 UTC
85 points
1 comment · 4 min read · LW link
(www.anthropic.com)

Towards Monosemanticity: Decomposing Language Models With Dictionary Learning

Zac Hatfield-Dodds · 5 Oct 2023 21:01 UTC
286 points
19 comments · 2 min read · LW link
(transformer-circuits.pub)

Anthropic’s Responsible Scaling Policy & Long-Term Benefit Trust

Zac Hatfield-Dodds · 19 Sep 2023 15:09 UTC
90 points
23 comments · 3 min read · LW link
(www.anthropic.com)

Anthropic’s Core Views on AI Safety

Zac Hatfield-Dodds · 9 Mar 2023 16:55 UTC
181 points
39 comments · 2 min read · LW link
(www.anthropic.com)

Concrete Reasons for Hope about AI

Zac Hatfield-Dodds · 14 Jan 2023 1:22 UTC
101 points
13 comments · 1 min read · LW link

In Defence of Spock

Zac Hatfield-Dodds · 21 Apr 2021 21:34 UTC
35 points
5 comments · 1 min read · LW link

Zac Hatfield Dodds’s Shortform

Zac Hatfield-Dodds · 9 Mar 2021 2:39 UTC
2 points
3 comments · 1 min read · LW link