Nathan Helm-Burger

Karma: 4,842

AI alignment researcher, ML engineer. Master's in Neuroscience.

I believe that cheap and broadly competent AGI is attainable and will be built soon, which gives me timelines of around 2024-2027. Here’s an interview I gave recently about my current research agenda. I think the best path forward to alignment is safe, contained testing of models designed from the ground up for alignability and trained on censored data (simulations with no mention of humans or computer technology). I think current mainstream ML technology is close to a threshold of competence beyond which it will be capable of recursive self-improvement, and that this automated process will mine neuroscience for insights and quickly become far more effective and efficient. It would be quite bad for humanity if this happened in an uncontrolled, uncensored, un-sandboxed situation, so I am trying to warn the world about this possibility.

See my prediction markets here:

https://manifold.markets/NathanHelmBurger/will-gpt5-be-capable-of-recursive-s?r=TmF0aGFuSGVsbUJ1cmdlcg

I also think that current AI models pose misuse risks, which may worsen as models become more capable, and that this could result in catastrophic suffering if we fail to regulate them.

I now work for SecureBio on AI-Evals.

Relevant quotes:

“There is a powerful effect to making a goal into someone’s full-time job: it becomes their identity. Safety engineering became its own subdiscipline, and these engineers saw it as their professional duty to reduce injury rates. They bristled at the suggestion that accidents were largely unavoidable, coming to suspect the opposite: that almost all accidents were avoidable, given the right tools, environment, and training.” https://www.lesswrong.com/posts/DQKgYhEYP86PLW7tZ/how-factories-were-made-safe

“The prospect for the human race is sombre beyond all precedent. Mankind are faced with a clear-cut alternative: either we shall all perish, or we shall have to acquire some slight degree of common sense. A great deal of new political thinking will be necessary if utter disaster is to be averted.”—Bertrand Russell, The Bomb and Civilization, 18 August 1945

“For progress, there is no cure. Any attempt to find automatically safe channels for the present explosive variety of progress must lead to frustration. The only safety possible is relative, and it lies in an intelligent exercise of day-to-day judgment.”—John von Neumann

“I believe that the creation of greater than human intelligence will occur during the next thirty years. (Charles Platt has pointed out that AI enthusiasts have been making claims like this for the last thirty years. Just so I’m not guilty of a relative-time ambiguity, let me be more specific: I’ll be surprised if this event occurs before 2005 or after 2030.)”—Vernor Vinge, The Coming Technological Singularity

Unfaithful Reasoning Can Fool Chain-of-Thought Monitoring

2 Jun 2025 19:08 UTC
78 points
17 comments · 3 min read · LW link

Proactive ‘If-Then’ Safety Cases

Nathan Helm-Burger · 18 Nov 2024 21:16 UTC
10 points
0 comments · 4 min read · LW link

A path to human autonomy

Nathan Helm-Burger · 29 Oct 2024 3:02 UTC
53 points
16 comments · 20 min read · LW link

My hopes for YouCongress.com

Nathan Helm-Burger · 22 Sep 2024 3:20 UTC
14 points
3 comments · 4 min read · LW link

Physics of Language models (part 2.1)

Nathan Helm-Burger · 19 Sep 2024 16:48 UTC
9 points
2 comments · 1 min read · LW link
(youtu.be)

Avoiding the Bog of Moral Hazard for AI

Nathan Helm-Burger · 13 Sep 2024 21:24 UTC
19 points
13 comments · 2 min read · LW link

A bet for Samo Burja

Nathan Helm-Burger · 5 Sep 2024 16:01 UTC
14 points
2 comments · 2 min read · LW link

Diffusion Guided NLP: better steering, mostly a good thing

Nathan Helm-Burger · 10 Aug 2024 19:49 UTC
13 points
0 comments · 1 min read · LW link
(arxiv.org)

Imbue (Generally Intelligent) continue to make progress

Nathan Helm-Burger · 26 Jun 2024 20:41 UTC
18 points
0 comments · 1 min read · LW link
(imbue.com)

Secret US natsec project with intel revealed

Nathan Helm-Burger · 25 May 2024 4:22 UTC
27 points
1 comment · 1 min read · LW link
(www.politico.com)

[Question] Constituency-sized AI congress?

Nathan Helm-Burger · 9 Feb 2024 16:01 UTC
11 points
5 comments · 1 min read · LW link

Gunpowder as metaphor for AI

Nathan Helm-Burger · 28 Dec 2023 4:31 UTC
14 points
0 comments · 2 min read · LW link

Digital humans vs merge with AI? Same or different?

6 Dec 2023 4:56 UTC
21 points
11 comments · 7 min read · LW link

Desiderata for an AI

Nathan Helm-Burger · 19 Jul 2023 16:18 UTC
9 points
0 comments · 4 min read · LW link

An attempt to steelman OpenAI’s alignment plan

Nathan Helm-Burger · 13 Jul 2023 18:25 UTC
22 points
0 comments · 4 min read · LW link

Two paths to win the AGI transition

Nathan Helm-Burger · 6 Jul 2023 21:59 UTC
11 points
8 comments · 4 min read · LW link

Nice intro video to RSI

Nathan Helm-Burger · 16 May 2023 18:48 UTC
12 points
0 comments · 1 min read · LW link
(youtu.be)

Will GPT-5 be able to self-improve?

Nathan Helm-Burger · 29 Apr 2023 17:34 UTC
18 points
22 comments · 3 min read · LW link

[Question] Can GPT-4 play 20 questions against another instance of itself?

Nathan Helm-Burger · 28 Mar 2023 1:11 UTC
15 points
1 comment · 1 min read · LW link
(evanthebouncy.medium.com)

Feature idea: extra info about post author’s response to comments.

Nathan Helm-Burger · 23 Mar 2023 20:14 UTC
6 points
0 comments · 1 min read · LW link