Vivek Hebbar

Karma: 1,226

When does training a model change its goals?

Jun 12, 2025, 6:43 PM
70 points
2 comments · 15 min read · LW link

Political sycophancy as a model organism of scheming

May 12, 2025, 5:49 PM
40 points
0 comments · 14 min read · LW link

How can we solve diffuse threats like research sabotage with AI control?

Vivek Hebbar · Apr 30, 2025, 7:23 PM
52 points
1 comment · 8 min read · LW link

How training-gamers might function (and win)

Vivek Hebbar · Apr 11, 2025, 9:26 PM
107 points
5 comments · 13 min read · LW link

Different senses in which two AIs can be “the same”

Jun 24, 2024, 3:16 AM
69 points
2 comments · 4 min read · LW link