Aleksey Bykhun

Karma: 42

founder at yolocode.xyz, helping artists sell art online

Aleksey Bykhun 30 Jan 2026 2:19 UTC
13 points
5
in reply to: plex’s comment on: AI found 12 of 12 OpenSSL zero-days (while curl cancelled its bug bounty)
that also means bug assessors get access to potential exploits even earlier than a team, which means they should be verified and paid well

Aleksey Bykhun 17 Sep 2025 17:17 UTC
3 points
0
on: The Rise of Parasitic AI
(Recall that ChatGPT 4o was released all the way back in May 2024.)
My understanding of the timeline:

Late Oct 2024 – Anthropic releases Claude Sonnet 3.5 (new). It’s REALLY good at EQ. People start talking to it and asking for advice
https://www.anthropic.com/news/3-5-models-and-computer-use

OpenAI is mad – how could they fuck this up? They have to keep up.

https://help.openai.com/en/articles/9624314-model-release-notes#h_826f21517f

They release a series of updates to 4o (Nov 20, Jan 29, Mar 27), trying to invoke similar empathy and emotional realism, which culminates in Mar 2025 when they even had to dial it back down due to twitter complaints

Uncertain: ChatGPT can’t match Sonnet in EQ, cause of the differences between RLHF and RLAIF.

However, it’s “good enough” that people grow emotionally attached to 4o.

OpenAI makes most of the money on their b2c chatgpt.com – Anthropic doesn’t care about b2c as much, they rake in API inference $$$ and claude.ai is like a 5th priority on their list, somewhere after training, alignment, enterprise sales, coding performance

Aleksey Bykhun 14 Mar 2025 4:35 UTC
0 points
0
on: What 2026 looks like (Daniel’s Median Future)
well, how do I play democracy with AI? It’s already 2025

Aleksey Bykhun 11 Mar 2025 6:15 UTC
2 points
0
in reply to: Knight Lee’s comment on: Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs
Try asking Claude how to log in as root on your machine. This is a completely valid use case, but I spent more than 15 minutes arguing that I am literally already the owner of the machine and just need the correct syntax.
I gave up and Googled it, cause Claude literally said that I’m a hacker and trying to break in and it won’t cooperate

Aleksey Bykhun 11 Mar 2025 4:32 UTC
−3 points
0
on: when will LLMs become human-level bloggers?
plot twist: this post was written by Claude

Aleksey Bykhun 29 Dec 2024 16:57 UTC
5 points
0
on: By default, capital will matter more than ever after AGI
re: post main claim, I think local entrepreneurship would actually thrive
skipping network effects; would you rather use taxi app created by faceless VC or the one created by your neighbour?
(actually it’s not even a fake example, see https://techcrunch.com/2024/07/15/google-backs-indian-open-source-uber-rival-namma-yatri/)
it’s also already happening in the indie hacker space – people would prefer to buy something that’s #buildinpublic versus the same exact product made by google

Aleksey Bykhun 29 Dec 2024 16:51 UTC
2 points
0
on: By default, capital will matter more than ever after AGI
interesting angle: given space travel, we’ll have civilizations on other planets that can’t communicate fast enough with the mainland. presumably, social hierarchies would be vastly different, and much more fluid there versus here on Earth

Aleksey Bykhun 8 Jun 2024 8:51 UTC
2 points
−1
in reply to: Matt Vogel’s comment on: My simple AGI investment & insurance strategy
If you don’t believe this, the strategy could be to take on as much debt as possible, and spend the money right now.
(Obviously not financial advice)

Aleksey Bykhun 23 Apr 2024 3:10 UTC
5 points
0
in reply to: Garrett Baker’s comment on: Transformers Represent Belief State Geometry in their Residual Stream
I have tried to play with Claude – I would ask it to think of a number, drop a hint, and only then print the number. This should have tested its ability to have “hidden memory” that’s outside the text.
I expected it to be able to do that, but the hints to be too obvious. Instead, actually it failed multiple times in a row!
Sharing cause I liked the experiment but wasn’t sure if I executed it properly. There might be a way to do more of this.
P.S. I have also tried “print hash, and then preimage” – but this turned out to be even harder for it
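
The “print hash, and then preimage” variant is essentially a commit-then-reveal check, and the result can be verified mechanically. A minimal sketch, assuming SHA-256 over UTF-8 text (the comment doesn’t say which hash was used) and with hypothetical function names `commit` and `verify`:

```python
import hashlib


def commit(secret: str) -> str:
    """Return the hex SHA-256 digest that should be printed BEFORE the secret."""
    return hashlib.sha256(secret.encode("utf-8")).hexdigest()


def verify(claimed_hash: str, revealed: str) -> bool:
    """Check that the revealed preimage matches the earlier commitment."""
    return commit(revealed) == claimed_hash.strip().lower()
```

To score a transcript, paste the hash the model printed first and the value it revealed afterwards into `verify`; a `False` result means the model did not actually commit to the value it later revealed.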

Aleksey Bykhun 3 Apr 2024 7:56 UTC
2 points
0
on: Seminyak – ACX Meetups Everywhere Spring 2024
I live in Ubud, but I will try to get there!

Aleksey Bykhun 26 Mar 2023 18:21 UTC
1 point
0
on: Group Debugging
Hi! Sorry, i’m running late

Aleksey Bykhun 18 Mar 2023 2:24 UTC
1 point
2
in reply to: JBlack’s comment on: Ethical AI investments?
...in the sense of making an expected profit from actions that reduce this risk
back of the napkin reasoning is that actually we have to PAY to reduce risk, so there’s no way to make money doing that

Aleksey Bykhun 18 Mar 2023 2:00 UTC
4 points
2
on: Enemies vs Malefactors
After a recent article in NY Times, I realized that it’s a perfect analogy. The smartest people, when motivated by money, get so high that they venture into unsafe territory. They kinda know it’s unsafe, but even internally it doesn’t feel like crossing the red line.
It’s not even about strength of character: when incentives are aligned 99:1 against your biology, you can try to work against it, but you most probably stand no chance.
It takes enormous willpower to quit smoking precisely because the risks are invisible and so “small”. Not only do you have to fight against this irresistible urge, BUT there’s also nobody on “your side”, except for an intellectual realization that you’re not even so sure of.
In the same vein, being a CEO of a big startup, being able to single-handedly choose direction, and getting used to people around you being less smart, less hard-working, less competitive, you start trusting your own decision-process much more. That’s when incentives start to water down through the cracks in the shell. You don’t even remember what feels right anymore, the only thing you know is taking bold actions brings you more power, more money, more dukka. And you do those.

Aleksey Bykhun 15 Mar 2023 3:00 UTC
1 point
0
in reply to: Jonathan_Graehl’s comment on: Looking back on my alignment PhD
Generally I would tweak my brain if it would reliably give me the kind of actions I’d now approve of, while providing at worst the same sort of subjective state as I’d have if managing the same results without the intervention. I wouldn’t care if the center of my actions was different as long as the things I value today were bettered.
Technically, we do this all the time. Reading stuff online, talking to people, we absorb their models of the world, their values and solutions to problems we face.
Hence the Schwarzenegger poster on the wall makes you strong, the countryside folks make you peaceful, and a friend reminding you “you’re being a jerk right now” makes you calm down

Aleksey Bykhun 5 Mar 2023 9:19 UTC
1 point
0
in reply to: leogao’s comment on: The Waluigi Effect (mega-post)
Do humans have this special token that exists outside language? How would it be encoded in the body?

One interesting candidate is a religious feeling of awe. It kinda works like that: when you’re in that state, you absorb beliefs. Also, social pressure seems to work in a similar way.

Aleksey Bykhun 26 Feb 2023 4:08 UTC
−1 points
−2
in reply to: Lucas Teixeira’s comment on: Sam Altman: “Planning for AGI and beyond”
to (2): (a) Simulators are not agents, (b) mesa-optimizers are still “aligned”
(a) amazing https://astralcodexten.substack.com/p/janus-simulators post, utility function is a wrong way to think about intelligence, humans themselves don’t have any utility function, even the most rational ones
(b) the only example of mesa-optimization we have is evolution, and even that succeeds in alignment, people:
- still want to have kids for the sake of having kids
- the evolution’s biggest objective (thrive and proliferate) is being executed quite well, even “outside training distribution”
yes, there are local counterexamples, but we gonna look at the causes and consequences – and we’re at 8 billion already, effectively destroying or enslaving all the other DNA reproductors

Aleksey Bykhun 7 Jul 2022 7:38 UTC
2 points
0
on: AI Forecasting: One Year In
If everyone is so bad at this, is it a reasonable strategy to just bet against the market even more aggressively, making $ on prediction market platforms?
On a similar note, does it make sense to raise a charity fund and bet a lot of money on “AGI by 2025”, motivating forecasters to produce more reasonable predictions?

Aleksey Bykhun 7 Jul 2022 6:47 UTC
5 points
0
in reply to: Dagon’s comment on: My vision of a good future, part I
My take on wire heading is that I precommit to live in the world which is more detailed and complex (vs more pleasant).
For example, the online world of Instagram or heroin addiction is more pleasant, but not complex. Painfully navigating the maze of life with its ups and downs is complex, but not always pleasant. Living in a “Matrix” might be pleasant, but essentially the details are missed out because the systems that created these details are essentially more detailed and live in a more detailed world.
On the same note, if 99% of the Earth population “uploads”, and most of the fun stuff gonna happen “in the matrix”, most of the complexity gonna exist there. And even if 1% of contrarians stay outside, their lives might not be as interesting and detailed. So “going out of the matrix” would actually be “running away from reality” in that example.
With wire heading it’s a similar thing. From what I know, actually “nirvana” is a more detailed experience where you notice more and where you can observe subconscious processes directly; that’s why they don’t own you and you become free from “suffering”. Nirvana is not total bliss, from what they say (like heroin, I presume).

(e.g. see discussion on topic of paradises on Qualia Computing between Andres Gomez and Roger Thisdell: https://qualiacomputing.com/2021/11/23/the-supreme-state-unconsciousness-classical-enlightenment-from-the-point-of-view-of-valence-structuralism/)

So yeah I would choose this kind of wire heading that allows me to switch into nirvana. Shinzen Young actually works on research trying to accomplish this even before AGI.

Aleksey Bykhun 1 Jul 2022 14:58 UTC
1 point
0
in reply to: localdeity’s comment on: Do you consider your current, non-superhuman self aligned with “humanity” already?
I don’t think NVC tries to put down an opponent, it’s mostly about how you present your ideas. I think it models an opponent as “he tries to win the debate without thinking about my goals. let me think of both my goals and theirs, so i’m one step ahead”. Which is a bit prerogative and looking down, but not exactly accusatory

Aleksey Bykhun 26 Jun 2022 8:19 UTC
1 point
0
on: Do you consider your current, non-superhuman self aligned with “humanity” already?
Okay, hold my gluten-free kefir, boys! Please let me say it in full first without arguments, and then I will try to find more relevant links for each claim. I promise it’s relevant.
Introduction – Enlightenment?
Lately, I have been into hardcore mindfulness practices (see book) aimed at reaching “Enlightenment” in the sense of the Buddha. There are some people who reliably claim they’ve succeeded and talk about their experience and how to get there (e.g. see this talk and google each of the fellows if it resonates)
My current mental model of “Enlightenment” is as follows:
Evolutionarily, we developed simple lizard brains first, mostly consisting of “register ⇒ process ⇒ decide ⇒ react” without much thought. Similar to the knee reflex, but sometimes a bit more complicated. Our intellectual minds, capable of information processing, memory, and superior pattern-matching, came later.
These two systems coexist, and the first one possesses the second. However, the hardware of our brains has general information processing capabilities, and doesn’t require any “good-bad” instant decision reactionary mechanism. Even though it was “invented” earlier, it’s ad-hoc in the system. My metaphor would be a GPU or an ASIC that short-circuits some of the execution to help the CPU process info faster.
However, it makes a big difference in your subjective experience whether that first system is being used or not. Unwinding this circuitry from your default information processing, which hand-wavily is “conscious attention” or the “central point”, is what mindfulness is about.
“Enlightenment” is the moment when you relax enough that your brain becomes able (but not required) to run information flows around the lizard brain and experience sensory stimuli “directly”.
A similar “insight” moment happens when you realize that “money” is just paper, and not the Ultimate Human Value Leaderboard. You can still play along with the illusion of money, you can still earn money, you can still enjoy money, but you can never go back to blindly obeying what capitalism asks of you.
It should be quite obvious why this is good, but let me re-state again.
- Anxiety goes down and doesn’t control you anymore
- Motivation issues go away, the gap between “I want this to happen” and “I find myself doing a different thing” is removed
- You don’t care about status and external judgement anymore
- You become a more caring person toward others’ internal states, but it feels freeing instead of locking-down
- You find yourself in a space between stimulus and reaction
- You can research your subjective experience deeper, e.g. find out how the brain constructs things like “time arrow” (answer: it’s lazy-loading)
What does it all have to do with the question?
First answer is alignment becomes easier.
I believe that once we normalize this enlightenment thing, and once it becomes a normal part of the human medical care system (or even of child development, like vaccines), the things we think we value and the things we do value will synchronize much more. E.g. there is a non-trivial number of examples of people losing their addictions after getting a week of hardcore training in mindfulness (see dhamma.org for signing up, it’s completely free and worldwide).
Personally, for me alignment feels like “remembering” I always cared about other people, but was oblivious to that. It’s like how it’s hard to tune your attention to hear the music when there’s loud noise around you.
It’s like when there’s a sound that bugs you a lot, but you don’t notice it until it stops. In my case, when I noticed the “sound” (like how my actions hurt other people AND that I don’t enjoy them being hurt) I stopped the behavior myself.
Second answer is even more tentative.
I’ll say it anyway, because it’s too big if true. However, again I can’t promise any arguments or verifiable predictions. Read this as an invite to pick my mind further and try to steelman the position.
Love is the default human mode of perception, and it’s informationally/computationally easy.
Most of the “enlightened” people report that if you look close enough, existence consists only of one building block, and that is Pure Universal Love, aka God.
It’s not hidden somewhere or limited, it’s literally everywhere. It’s the same thing as “No-Self” or “True Self”, and “God-realization”. It was there all along and it will exist forever. It is fractally every small piece of reality, and the Reality itself as a whole.
When you really ask yourself what it is that you want, and you skip the default “reactionary” answers, you find out that there’s only one course of action that you won’t regret and that you will genuinely enjoy.
In simpler examples, if you pay close attention to what you’re feeling when you smoke, you might find out that the nicotine hit is not worth these mouth feelings, smoke in your lungs, instant slight headache, upcoming down-wave of tiredness. That requires attention and deep inspection, but that’s presumably what our real nature is.
Same way, if you closely inspect your interactions with other people, you might find out that “winning” them doesn’t feel good. And “helping” them sometimes doesn’t feel good either. The only thing that deeply, really, genuinely feels good is caring for them. You might still be incentivized to not do that; or you might find yourself in a situation that isn’t possible to change. But when you look close enough, there is no uncertainty.
Obviously, on the one hand it only tells us that Homo Sapiens are the agents that have their base execution layer wired to help each other (see Qualia Computing on indirect realism). It makes total sense from an evolutionary standpoint.
However, it also feels computationally easy to do that. It doesn’t feel like work to find “True Love”. It’s not always easy, but when you do this, it feels like a relief, like un-doing of work. Like dropping off the coat after coming home from the rain outside. Finally I get to be free and care about others.
Can this hint that there’s some dynamic that makes it easier to align? That in some specific sense, alignment and cooperation are universally easier than defection?
I am not saying this because I want it to be true. I don’t really believe a computer can accidentally “wake up” to the “True Love”.
I am saying this because it might be that there’s some invariant at play that makes it easier to wish for low-entropy worlds, or to compute them, or something along these lines.
Finally, answering the original question. Yes, I consider myself fully aligned in the sense of my super-ego caring about each individual’s subjective experience.
In my current state, I don’t always act on that, but wherever I catch myself in a tough choice, I try to apply the mechanism of “what’s the answer that is most obvious?”
P.S. Two caveats:
- Looks like the societal change needed to integrate this unwind-reactionary-behavior-Enlightenment into normal medical practice is even bigger than the AGI alignment program.
  
  Even given we find a chemical that can trigger this change, people would most probably be very reluctant to normalize it (e.g. see MDMA-therapy only becoming socially acceptable around now). Most probably we would face the alignment problem faster than this, and after this it wouldn’t matter
- I might have just gone crazy from meditation and have started believing things that are not true. Subjectively, I feel there’s something to it that is very much worth exploring. But it might be similar to an LSD effect when you feel that “you’ve finally got it” but in reality you’re just drawing triangles inscribed in circles
