carissacullen

Karma: 13

Detecting collusion through multi-agent interpretability

3 Apr 2026 9:17 UTC
15 points
1 comment · 6 min read · LW link

Detecting collusion through multi-agent interpretability

2 Apr 2026 22:20 UTC
2 points
0 comments · 5 min read · LW link