
Igor Ivanov

Karma: 648

I replicated the Anthropic alignment faking experiment on other models, and they didn't fake alignment

May 30, 2025, 6:57 PM
20 points
0 comments · 2 min read · LW link

It's hard to make scheming evals look realistic for LLMs

May 24, 2025, 7:17 PM
141 points
27 comments · 5 min read · LW link

LLMs can strategically deceive while doing gain-of-function research

Igor Ivanov · Jan 24, 2024, 3:45 PM
36 points
4 comments · 11 min read · LW link

Psychology of AI doomers and AI optimists

Igor Ivanov · Dec 28, 2023, 5:55 PM
3 points
0 comments · 22 min read · LW link

5 psychological reasons for dismissing x-risks from AGI

Igor Ivanov · Oct 26, 2023, 5:21 PM
24 points
6 comments · 4 min read · LW link