josh :)

Karma: 36

MS in AI at UT Austin. Interested in interpretability and model self-knowledge.

I am open to opportunities :)

Twitter: @joshycodes
Blog: joshfonseca.com/blog

Steering Awareness: Models Can Be Trained to Detect Activation Steering

12 Mar 2026 23:34 UTC
15 points
0 comments
6 min read
LW link

A Simple Method for Accelerating Grokking

josh :)
24 Jan 2026 3:19 UTC
13 points
1 comment
3 min read
LW link

Training Models to Detect Activation Steering: Results and Implications

josh :)
26 Nov 2025 14:51 UTC
11 points
0 comments
4 min read
LW link