simeon_c

Karma: 1,442

@SaferAI

[Question] Are There Effective Interventions to Increase Distress Tolerance?

simeon_c

25 Jan 2026 18:50 UTC

9 points

1 comment, 1 min read, LW link

A Frontier AI Risk Management Framework: Bridging the Gap Between Current AI Practices and Established Risk Management

simeon_c and Henry Papadatos

13 Mar 2025 18:29 UTC

10 points

0 comments, 1 min read, LW link

(arxiv.org)

Towards Quantitative AI Risk Management

Henry Papadatos and simeon_c

16 Oct 2024 19:26 UTC

28 points

1 comment, 6 min read, LW link

simeon_c’s Shortform

simeon_c

4 Apr 2024 9:01 UTC

5 points

79 comments, 1 min read, LW link

Forecasting future gains due to post-training enhancements

elifland, Joel Becker and simeon_c

8 Mar 2024 2:11 UTC

31 points

2 comments, 1 min read, LW link

(docs.google.com)

Davidad’s Provably Safe AI Architecture—ARIA’s Programme Thesis

simeon_c

1 Feb 2024 21:30 UTC

69 points

17 comments, 1 min read, LW link

(www.aria.org.uk)

A Brief Assessment of OpenAI’s Preparedness Framework & Some Suggestions for Improvement

simeon_c

22 Jan 2024 20:08 UTC

14 points

0 comments, 6 min read, LW link

(uploads-ssl.webflow.com)

Responsible Scaling Policies Are Risk Management Done Wrong

simeon_c

25 Oct 2023 23:46 UTC

123 points

35 comments, 22 min read, LW link, 1 review

(www.navigatingrisks.ai)

[Question] Do LLMs Implement NLP Algorithms for Better Next Token Predictions?

simeon_c

19 Sep 2023 12:28 UTC

5 points

1 comment, 1 min read, LW link

[Question] In the Short-Term, Why Couldn’t You Just RLHF-out Instrumental Convergence?

simeon_c

16 Sep 2023 10:44 UTC

21 points

6 comments, 1 min read, LW link

AGI x Animal Welfare: A High-EV Outreach Opportunity?

simeon_c

28 Jun 2023 20:44 UTC

29 points

0 comments, 1 min read, LW link

The Cruel Trade-Off Between AI Misuse and AI X-risk Concerns

simeon_c

22 Apr 2023 13:49 UTC

24 points

1 comment, 2 min read, LW link

AI Takeover Scenario with Scaled LLMs

simeon_c

16 Apr 2023 23:28 UTC

42 points

15 comments, 8 min read, LW link

Navigating AI Risks (NAIR) #1: Slowing Down AI

simeon_c

14 Apr 2023 14:35 UTC

11 points

3 comments, 1 min read, LW link

(navigatingairisks.substack.com)

Request to AGI organizations: Share your views on pausing AI progress

Orpheus16 and simeon_c

11 Apr 2023 17:30 UTC

141 points

11 comments, 1 min read, LW link

[Question] Could Simulating an AGI Taking Over the World Actually Lead to a LLM Taking Over the World?

simeon_c

13 Jan 2023 6:33 UTC

15 points

1 comment, 1 min read, LW link

[Linkpost] DreamerV3: A General RL Architecture

simeon_c

12 Jan 2023 3:55 UTC

23 points

3 comments, 1 min read, LW link

(arxiv.org)

[Question] Are Mixture-of-Experts Transformers More Interpretable Than Dense Transformers?

simeon_c

31 Dec 2022 11:34 UTC

8 points

6 comments, 1 min read, LW link

AGI Timelines in Governance: Different Strategies for Different Timeframes

simeon_c and AmberDawn

19 Dec 2022 21:31 UTC

65 points

28 comments, 10 min read, LW link

Extracting and Evaluating Causal Direction in LLMs’ Activations

Fabien Roger and simeon_c

14 Dec 2022 14:33 UTC

29 points

5 comments, 11 min read, LW link