Stuart_Armstrong (Stuart Armstrong)

Karma: 17,677

Using GPT-Eliezer against ChatGPT Jailbreaking

6 Dec 2022 19:54 UTC
170 points
85 comments · 9 min read · LW link

The AI in a box boxes you

Stuart_Armstrong · 2 Feb 2010 10:10 UTC
168 points
389 comments · 1 min read · LW link

Assessing Kurzweil predictions about 2019: the results

Stuart_Armstrong · 6 May 2020 13:36 UTC
145 points
21 comments · 4 min read · LW link

Just another day in utopia

Stuart_Armstrong · 25 Dec 2011 9:37 UTC
142 points
118 comments · 13 min read · LW link

Assessing Kurzweil: the results

Stuart_Armstrong · 16 Jan 2013 16:51 UTC
97 points
64 comments · 2 min read · LW link

The Adventure: a new Utopia story

Stuart_Armstrong · 5 Feb 2020 16:50 UTC
97 points
37 comments · 51 min read · LW link

mAIry’s room: AI reasoning to solve philosophical problems

Stuart_Armstrong · 5 Mar 2019 20:24 UTC
87 points
41 comments · 6 min read · LW link · 2 reviews

Anthropic signature: strange anti-correlations

Stuart_Armstrong · 21 Oct 2014 16:59 UTC
83 points
25 comments · 1 min read · LW link

Benchmark for successful concept extrapolation/avoiding goal misgeneralization

Stuart_Armstrong · 4 Jul 2022 20:48 UTC
82 points
12 comments · 4 min read · LW link

The Goldbach conjecture is probably correct; so was Fermat’s last theorem

Stuart_Armstrong · 14 Jul 2020 19:30 UTC
80 points
27 comments · 4 min read · LW link

Model splintering: moving from one imperfect model to another

Stuart_Armstrong · 27 Aug 2020 11:53 UTC
79 points
10 comments · 33 min read · LW link

Completeness, incompleteness, and what it all means: first versus second order logic

Stuart_Armstrong · 16 Jan 2012 17:38 UTC
79 points
39 comments · 11 min read · LW link

AI timeline predictions: are we getting better?

Stuart_Armstrong · 17 Aug 2012 7:07 UTC
79 points
81 comments · 4 min read · LW link

Siren worlds and the perils of over-optimised search

Stuart_Armstrong · 7 Apr 2014 11:00 UTC
77 points
418 comments · 7 min read · LW link

The Octopus, the Dolphin and Us: a Great Filter tale

Stuart_Armstrong · 3 Sep 2014 21:37 UTC
76 points
236 comments · 3 min read · LW link

“But that’s your job”: why organisations can work

Stuart_Armstrong · 5 Feb 2020 12:25 UTC
76 points
12 comments · 4 min read · LW link

And the AI would have got away with it too, if...

Stuart_Armstrong · 22 May 2019 21:35 UTC
75 points
7 comments · 1 min read · LW link

Let’s split the cake, lengthwise, upwise and slantwise

Stuart_Armstrong · 25 Oct 2010 13:15 UTC
74 points
29 comments · 4 min read · LW link

To reduce astronomical waste: take your time, then go very fast

Stuart_Armstrong · 13 Jul 2013 16:41 UTC
70 points
50 comments · 3 min read · LW link

Research Agenda v0.9: Synthesising a human’s preferences into a utility function

Stuart_Armstrong · 17 Jun 2019 17:46 UTC
70 points
26 comments · 33 min read · LW link