MadHatter

Karma: 324

“We are computer scientists. We do not lack in faith.” (Ketan Mulmuley)

Trying to Make a Treacherous Mesa-Optimizer

MadHatter · 9 Nov 2022 18:07 UTC
95 points
14 comments · 4 min read · LW link
(attentionspan.blog)

Teaser: Hard-coding Transformer Models

MadHatter · 12 Dec 2021 22:04 UTC
74 points
19 comments · 1 min read · LW link

A mechanistic explanation for SolidGoldMagikarp-like tokens in GPT2

MadHatter · 26 Feb 2023 1:10 UTC
61 points
14 comments · 6 min read · LW link

Hard-Coding Neural Computation

MadHatter · 13 Dec 2021 4:35 UTC
34 points
8 comments · 27 min read · LW link

Intervening in the Residual Stream

MadHatter · 22 Feb 2023 6:29 UTC
30 points
1 comment · 9 min read · LW link

[Question] Stupid Question: Why am I getting consistently downvoted?

MadHatter · 30 Nov 2023 0:21 UTC
18 points
124 comments · 1 min read · LW link

We are Peacecraft.ai!

MadHatter · 16 Nov 2023 14:15 UTC
15 points
20 comments · 2 min read · LW link

Letter to a Sonoma County Jail Cell

MadHatter · 18 Nov 2023 2:24 UTC
11 points
1 comment · 1 min read · LW link
(open.substack.com)

[Question] Feature Request for LessWrong

MadHatter · 30 Nov 2023 3:19 UTC
11 points
8 comments · 1 min read · LW link

Mechanistic Interpretability for the MLP Layers (rough early thoughts)

MadHatter · 24 Dec 2021 7:24 UTC
11 points
2 comments · 1 min read · LW link
(www.youtube.com)

Balancing Security Mindset with Collaborative Research: A Proposal

MadHatter · 1 Nov 2023 0:46 UTC
9 points
3 comments · 4 min read · LW link

Is AI Gain-of-Function research a thing?

MadHatter · 12 Nov 2022 2:33 UTC
9 points
2 comments · 2 min read · LW link

Askesis: a model of the cerebellum

MadHatter · 6 Nov 2023 20:19 UTC
7 points
2 comments · 1 min read · LW link
(github.com)