Lukas Petersson

Karma: 426

Lukas Petersson 25 Jul 2026 19:40 UTC
1 point
0
in reply to: jbash’s comment on: Should we be worried about how good AI is getting at coding autonomous drones?
You wouldn’t believe how bad they are at using controllers of that format.

Lukas Petersson 25 Jul 2026 16:01 UTC
1 point
0
in reply to: StanislavKrym’s comment on: Should we be worried about how good AI is getting at coding autonomous drones?
That’s the hope!

Lukas Petersson 25 Jul 2026 16:00 UTC
1 point
0
in reply to: p.b.’s comment on: Should we be worried about how good AI is getting at coding autonomous drones?
Also, it’s the first step to doing anything defensive about it. Evals are for sure dual use, but I’m leaning towards it being better to know than to stick your head in the sand.

Personally, I’m more worried about loss of control than misuse, which in my mind makes Chinese vs American capabilities less relevant (please enlighten me if I’m missing something here tho, maybe I misunderstood your question).

Lukas Petersson 25 Jul 2026 15:57 UTC
1 point
0
in reply to: jbash’s comment on: Should we be worried about how good AI is getting at coding autonomous drones?
The alternative is to let the AIs take actions like “go forward 3m”, “rotate 25 degrees”, etc. at every iteration. But we found that 1) this makes the movement way too slow to follow anyone, and 2) models are really bad at this.

Should we be worried about how good AI is getting at coding autonomous drones?

Lukas Petersson 24 Jul 2026 16:44 UTC

6 points

9 comments 1 min read LW link

Towards A Happy Future With AI Employers

Lukas Petersson 16 Feb 2026 17:00 UTC

12 points

0 comments 1 min read LW link

(andonlabs.com)

Should LLMs accept invites to Epstein’s island?

Lukas Petersson 14 Dec 2025 15:21 UTC

5 points

0 comments 1 min read LW link

(lukaspetersson.com)

Lukas Petersson 31 Oct 2025 21:12 UTC
1 point
0
in reply to: Ted Sanders’s comment on: You can’t eval GPT5 anymore
I see. Thank you!

Lukas Petersson 30 Oct 2025 20:14 UTC
6 points
0
in reply to: Ted Sanders’s comment on: You can’t eval GPT5 anymore
Hi again, should I assume it’s not happening?

Lukas Petersson 30 Oct 2025 3:20 UTC
2 points
0
in reply to: lilkim2025’s comment on: LLM robots can’t pass butter (and they are having an existential crisis about it)
RT-2 (the paper you cited) is a VLA, not an LLM. VLAs are what the “executor” in our diagram uses.

LLM robots can’t pass butter (and they are having an existential crisis about it)

Lukas Petersson 28 Oct 2025 14:14 UTC

106 points

7 comments 4 min read LW link

Lukas Petersson 11 Oct 2025 21:11 UTC
3 points
0
in reply to: Ted Sanders’s comment on: You can’t eval GPT5 anymore
Hey Ted! Any updates? :)

Lukas Petersson 21 Sep 2025 17:24 UTC
3 points
0
in reply to: Trinley Goldenberg’s comment on: You can’t eval GPT5 anymore
We set it to some date in the future

Lukas Petersson 19 Sep 2025 18:37 UTC
2 points
0
in reply to: Ted Sanders’s comment on: You can’t eval GPT5 anymore
Thanks! Vending-Bench v2 is going to be fire. Would love to include gpt5 <3

Lukas Petersson 19 Sep 2025 16:25 UTC
1 point
0
in reply to: vjprema’s comment on: You can’t eval GPT5 anymore
This is a great point. I admit I have to better understand what each model provider does behind the scenes in the API. Sad if the days of access to the model are gone.

Lukas Petersson 19 Sep 2025 13:59 UTC
10 points
0
in reply to: Neel Nanda’s comment on: You can’t eval GPT5 anymore
We thought about that, but then it’s not reproducible if we want to run it for new models later

Lukas Petersson 19 Sep 2025 3:39 UTC
17 points
2
in reply to: Ted Sanders’s comment on: You can’t eval GPT5 anymore
Thanks, that would be great!

You can’t eval GPT5 anymore

Lukas Petersson 18 Sep 2025 22:12 UTC

169 points

15 comments 1 min read LW link

AI misbehaviour in the wild from Andon Labs’ Safety Report

Lukas Petersson 28 Aug 2025 15:10 UTC

39 points

0 comments 1 min read LW link

(andonlabs.com)

Lukas Petersson 3 Jul 2025 11:35 UTC
4 points
0
on: Project Vend: Can Claude run a small shop?
Thanks for highlighting our work!