
Filip Sondej

Karma: 411

Currently working on LLM unlearning. Also interested in CoT faithfulness, AI welfare/rights, and mitigating AI conflict.

github.com/filyp

scholar.google.pl/citations?hl=pl&user=oNsrQNcAAAAJ

Unlearning Needs to be More Selective [Progress Report]

27 Jun 2025 16:38 UTC
24 points
6 comments · 3 min read · LW link

Infinite money hack

Filip Sondej · 24 Jun 2025 9:39 UTC
3 points
6 comments · 1 min read · LW link

How LLM Beliefs Change During Chain-of-Thought Reasoning

16 Jun 2025 16:18 UTC
30 points
2 comments · 5 min read · LW link

Simple Steganographic Computation Eval - gpt-4o and gemini-exp-1206 can’t solve it yet

Filip Sondej · 19 Dec 2024 15:47 UTC
13 points
2 comments · 3 min read · LW link