Andrii Vasylenko

Karma: −5

Andrii Vasylenko 22 Apr 2026 6:41 UTC
0 points
0
in reply to: zroe1’s comment on: Quality Matters Most When Stakes are Highest
Has anyone thought of putting a prize on the alignment problem? I imagine the application form would be swamped by cranks/LLM slop but that feels solvable (e.g. requiring a small fee for each submission to pay the evaluator, requiring >1k LW karma).

Andrii Vasylenko 22 Apr 2026 6:34 UTC
3 points
0
on: Quality Matters Most When Stakes are Highest
Very true.
In my own experience, the feeling of urgency actively detracts from useful AI alignment work. Instead of forward-chaining towards a better understanding of intelligence/corrigibility/how-minds-work/etc., you end up back-chaining from whatever results you imagine might do something.
Which isn’t at all the right frame for approaching the problem, since the effectiveness of what you get is limited by your first thought about it.
On a related note, I think the overwhelming majority of alignment work isn’t helping to address the core problems. Even great researchers like John Wentworth can somewhat miss the point.

(edited: grammar fix)

Andrii Vasylenko 22 Apr 2026 6:06 UTC
−8 points
−3
on: Preventing extinction from ASI on a $50M yearly budget
I don’t think money is the biggest problem here. MIRI, ControlAI, etc are working on budgets of millions of dollars a year (a lot if used well; Lightcone is one of the most influential orgs around, at a budget of 2-3M), but it doesn’t feel like an effective pause treaty is on the horizon (25% chance that it happens in 1 year).
I’ve never thoroughly researched or worked in policy but it feels like something else is the main bottleneck. Maybe important people not viscerally understanding the situation? I don’t feel like ad campaigns would help with that much.
ControlAI seems to be doing good on the margin, but it doesn’t seem worth allocating on the order of 50M to it.

Andrii Vasylenko 20 Apr 2026 23:26 UTC
1 point
0
on: Instrumental Goals Are A Different And Friendlier Kind Of Thing Than Terminal Goals
What does it mean for an agent to have “only instrumental goals”? What is it optimizing for? It doesn’t seem to me like a proposal specific enough to consider whether an ASI optimizing for it would lead to good outcomes, let alone whether we could build an ASI with that goal (which seems to be the harder part of the problem). If you have a more specific idea for this, though, I’d like to see it.

Andrii Vasylenko 19 Dec 2025 2:39 UTC
3 points
0
on: Announcing RoastMyPost: LLMs Eval Blog Posts and More
This works quite well, and I will continue using it for the time being. However, I strongly suggest disabling “Fallacy Check”, since it very often fires on non-fallacious content.