RSS

jorio

Karma: 114

MATS 8.0 Scholar

Con­cept Poi­son­ing: Prob­ing LLMs with­out probes

5 Aug 2025 17:00 UTC
60 points
5 comments13 min readLW link

Selec­tive Gen­er­al­iza­tion: Im­prov­ing Ca­pa­bil­ities While Main­tain­ing Alignment

16 Jul 2025 21:25 UTC
67 points
4 comments7 min readLW link