RSS

Jessica Rumbelow

Karma: 1,060

AI interpretability researcher

In­tro­duc­ing Leap Labs, an AI in­ter­pretabil­ity startup

Jessica Rumbelow6 Mar 2023 16:16 UTC
99 points
11 comments1 min readLW link

SolidGoldMag­ikarp III: Glitch to­ken archaeology

14 Feb 2023 10:17 UTC
90 points
30 comments16 min readLW link

SolidGoldMag­ikarp II: tech­ni­cal de­tails and more re­cent findings

6 Feb 2023 19:09 UTC
109 points
45 comments13 min readLW link

SolidGoldMag­ikarp (plus, prompt gen­er­a­tion)

5 Feb 2023 22:02 UTC
663 points
204 comments12 min readLW link

Guardian AI (Misal­igned sys­tems are all around us.)

Jessica Rumbelow25 Nov 2022 15:55 UTC
15 points
6 comments2 min readLW link

The Ground Truth Prob­lem (Or, Why Eval­u­at­ing In­ter­pretabil­ity Meth­ods Is Hard)

Jessica Rumbelow17 Nov 2022 11:06 UTC
27 points
2 comments2 min readLW link

Why I’m Work­ing On Model Ag­nos­tic Interpretability

Jessica Rumbelow11 Nov 2022 9:24 UTC
26 points
9 comments2 min readLW link