simeon_c

Karma: 1,064

@SaferAI

Responsible Scaling Policies Are Risk Management Done Wrong

simeon_c · 25 Oct 2023 23:46 UTC
114 points
33 comments · 22 min read · LW link
(www.navigatingrisks.ai)

New GPT3 Impressive Capabilities—InstructGPT3 [1/2]

simeon_c · 13 Mar 2022 10:58 UTC
72 points
10 comments · 7 min read · LW link

Davidad’s Provably Safe AI Architecture—ARIA’s Programme Thesis

simeon_c · 1 Feb 2024 21:30 UTC
69 points
17 comments · 1 min read · LW link
(www.aria.org.uk)

AGI Timelines in Governance: Different Strategies for Different Timeframes

19 Dec 2022 21:31 UTC
63 points
28 comments · 10 min read · LW link

AI Takeover Scenario with Scaled LLMs

simeon_c · 16 Apr 2023 23:28 UTC
42 points
15 comments · 8 min read · LW link

AGI x Animal Welfare: A High-EV Outreach Opportunity?

simeon_c · 28 Jun 2023 20:44 UTC
29 points
0 comments · 1 min read · LW link

The Cruel Trade-Off Between AI Misuse and AI X-risk Concerns

simeon_c · 22 Apr 2023 13:49 UTC
24 points
1 comment · 2 min read · LW link

[Linkpost] DreamerV3: A General RL Architecture

simeon_c · 12 Jan 2023 3:55 UTC
23 points
3 comments · 1 min read · LW link
(arxiv.org)

[Question] In the Short-Term, Why Couldn’t You Just RLHF-out Instrumental Convergence?

simeon_c · 16 Sep 2023 10:44 UTC
21 points
6 comments · 1 min read · LW link

[Question] Could Simulating an AGI Taking Over the World Actually Lead to a LLM Taking Over the World?

simeon_c · 13 Jan 2023 6:33 UTC
15 points
1 comment · 1 min read · LW link

A Brief Assessment of OpenAI’s Preparedness Framework & Some Suggestions for Improvement

simeon_c · 22 Jan 2024 20:08 UTC
14 points
0 comments · 6 min read · LW link
(uploads-ssl.webflow.com)

Navigating AI Risks (NAIR) #1: Slowing Down AI

simeon_c · 14 Apr 2023 14:35 UTC
11 points
3 comments · 1 min read · LW link
(navigatingairisks.substack.com)

Is GPT3 a Good Rationalist? - InstructGPT3 [2/2]

simeon_c · 7 Apr 2022 13:46 UTC
11 points
0 comments · 7 min read · LW link

[Question] Are Mixture-of-Experts Transformers More Interpretable Than Dense Transformers?

simeon_c · 31 Dec 2022 11:34 UTC
7 points
5 comments · 1 min read · LW link

[Question] Do LLMs Implement NLP Algorithms for Better Next Token Predictions?

simeon_c · 19 Sep 2023 12:28 UTC
5 points
1 comment · 1 min read · LW link