I was very impressed with the ARC-AGI results, so I read the entire paper and browsed the code a fair amount.
Only after browsing the code did I realize that they likely train on all evaluation tasks in addition to the training tasks (correct me if I'm wrong). During inference they condition only on x* and on a task embedding to predict y*, instead of on the demonstration pairs (x, y)_{1..3}. The only way they could have obtained that task embedding is by training it. And the evaluation tasks are harder than those in the train set.
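For concreteness, here's a minimal sketch of the distinction as I understand it; `TaskConditionedModel`, `backbone`, and the demo-passing interface in the final comment are my own illustrative names, not the paper's actual code:

```python
import torch
import torch.nn as nn

class TaskConditionedModel(nn.Module):
    """One learned embedding per task ID, covering eval tasks too, so the
    eval-task embeddings can only exist if they were fit during training."""

    def __init__(self, num_tasks: int, d_model: int):
        super().__init__()
        self.task_embedding = nn.Embedding(num_tasks, d_model)
        self.backbone = nn.Linear(d_model, d_model)  # stand-in for the real network

    def forward(self, x_star: torch.Tensor, task_id: torch.Tensor) -> torch.Tensor:
        # Conditions only on x* plus the trained task embedding;
        # no demonstration pairs (x, y)_{1..3} are seen at inference time.
        return self.backbone(x_star + self.task_embedding(task_id))

# In-context few-shot learning would instead pass the demos at inference time,
# with no per-task trained parameters, e.g.:
#   y_star = model(demos=[(x1, y1), (x2, y2), (x3, y3)], query=x_star)

model = TaskConditionedModel(num_tasks=1000, d_model=64)
y_star = model(torch.randn(1, 64), torch.tensor([42]))  # task 42's embedding was trained
```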
They score a lot higher than the baseline transformer, so there's clearly merit in what they're doing. But once you allow training on evaluation tasks, you can train on only a single target ARC-AGI-1 task instead of 1000 tasks and still get 20% accuracy: https://iliao2345.github.io/blog_posts/arc_agi_without_pretraining/arc_agi_without_pretraining.html. Given that, the result doesn't look earth-shattering.
Yes, I agree, and I wasn't implying foul play. I just find it less impressive than I initially thought, because:
- It's finetuning on task examples, not in-context few-shot learning.
- They finetune on the harder evaluation set rather than only on the easier train set, so they don't demonstrate generalization across the easy-to-hard distribution shift.
- The result I linked reached 20% on ARC-AGI-1 by fitting examples for a single evaluation task with an MLP-style network, versus the paper's 40% using 1000 tasks. The numbers aren't directly comparable, since that work did a fair bit of custom architectural engineering to reach 20%, but it really puts the 40% in perspective for me.