James Sullivan

Karma: 58

I’m a software engineer who is interested in AI, futurism, space, and the big questions of life.

https://www.linkedin.com/in/jamessullivan092/

What Reasoning Steps Cause Alignment Faking?

James Sullivan, 28 Apr 2026 4:37 UTC

3 points

0 comments, 9 min read, LW link

(open.substack.com)

Are we aligning the model or just its mask?

James Sullivan, 27 Mar 2026 2:10 UTC

11 points

0 comments, 10 min read, LW link

(substack.com)

Playing Dumb: Detecting Sandbagging in Frontier LLMs via Consistency Checks

James Sullivan, 13 Jan 2026 19:28 UTC

11 points

0 comments, 5 min read, LW link

Jailbreaking Claude 4 and Other Frontier Language Models

James Sullivan, 15 Jun 2025 0:31 UTC

1 point

0 comments, 3 min read, LW link

(open.substack.com)

How do AI agents work together when they can’t trust each other?

James Sullivan, 6 Jun 2025 3:10 UTC

17 points

0 comments, 8 min read, LW link

(jamessullivan092.substack.com)

Developmental Stages in Multi-Problem Grokking

James Sullivan, 29 Sep 2024 18:58 UTC

4 points

0 comments, 6 min read, LW link