Born too late to explore Earth; born too early to explore the galaxy; born at just the right time to save humanity.
Ulisse Mini
The biggest failure of the Rat community right now is neglecting emotional work. The biggest upgrade to my rationality BY FAR (possibly more than reading the Sequences, even) has been feeling all my emotions & letting them move through me till I’m back to clarity. This is feminine-coded rationality imo (though for silly cultural reasons). AoA / Joe Hudson is the best resource on all this. He also works with Sama & OAI compute teams (lol).
A few concrete examples from my life.
When I fully feel my anger and let it move through me (apologies for the woo terms!) I get back to clarity. My natural thoughts are correct; I don’t need to do galaxy-brained metacognition & self-correction to maintain the semblance of clear thinking like I used to.
When I fully feel my shame & forgive/accept myself, it becomes much easier for me to execute long-term self-improvement plans where I tackle character flaws (e.g. lower conscientiousness than I’d like) with a bunch of 5% improvements. Previously I felt too much shame to “sit in the problem” for as long as a gradual-improvement approach requires. Self-acceptance has made self-improvement so much easier to think about clearly.
In general: Emotion biasing me → fully welcome the emotion → no longer biasing me, just integrated information/perspective! It also feels better and is a practice I can do. Highly recommend!
I doubt a more emotionally integrated rationalist community would fix the gender problem, but it would definitely help. I’ve heard girls I know call the Rat/EA cluster “inhumane”, and IMO this is getting at something that repulses a lot of people: there’s a serious focus on the head over emotions/heart/integrated bodymind. Not as bad as Spock, but still pretty bad. Some lip service is paid to Focusing and Internal Double Crux (which are kind of emotion-y), but empirically most rats aren’t very emotionally well-integrated; there’s still a “Logical part” hammering down the more “Irrational” parts, as opposed to working together. And this requires inner work! Not just reading Replacing Guilt once, for example.
All this relates to insecurity as well; it’s very hard to think rationally when you’re insecure. Preverbal thoughts and expectations will be warped at a deep level by emotional pieces trying to protect you. A lot can be done about that though; Chris is the main pioneer in the emotional-security space IMO, though the AoA/Joe Hudson stuff helps a ton too. All paths to the same goal.
I really don’t like how this post blends supernatural, fictional elements with the practical. The caveats about how wizard power in reality isn’t like wizard power in stories are good, but not sufficient: the actively misleading term continues to warp people’s cognition.
For example, it wasn’t mentioned how technology (I’m not going to call it wizard power) generally requires a lot of coordination and capital (“king power”) to get working and to produce at a reasonable price. Magic is sexy and cool because you’re “doing it yourself”, whereas technology is a large team effort.
John seems to be warped by this effect ^: notice how he talks about DIY ~entirely in terms of doing stuff alone instead of in large groups, because that’s sexier if you’re an individualist & distrust all large groups. You would not come up with “making your own toothbrush” as something that’s “wizard power” without these cognitive distortions (individualism + magical thinking).
But really my main problem with this isn’t that it lacks some caveats; it’s the general pattern of Rats actively distancing themselves from reality, often in a way with undercurrents of “our thoughts here are special and our community is the best”. I know this isn’t enough to convey the feeling I get to those who don’t share it. It’s hard to see when you’re in the community, but looking back after leaving, the distortions are extremely obvious. I might write more about this at some point, or maybe not.
I like this sentiment:
Forget RadVac. I wish for the sort of community which could produce its own COVID vaccine in March 2020, and have a 100-person challenge trial done by the end of April.
I wish there was more action and clear thinking in the rat community, without weird cognitive distortions that are hard to see except from outside.
You decide what is a win or not. If you’re spiraling, give yourself wins for getting out of bed, going outside, etc. Morale compounds and you’ll get out of it. This is the biggest thing to do imo. Lower your “standards” temporarily. What we reward ourselves for is a tool for being productive, not an objective measure of how much we did that needs to stay fixed.
I think asking people like Daniel Ingram, Frank Yang, Nick Cammarata, Shinzen Young, Roger Thisdell, etc. about how they experience pain post-awakening is much more productive than debating 2500-year-old teachings which have been (mis)translated many times.
[Question] What rationality failure modes are there?
Answering my own question, a list of theories I have yet to study that may yield significant insight:
Theory of Heavy-Tailed Self-Regularization (https://weightwatcher.ai/) (see the sketch after this list)
Singular learning theory
Neural tangent kernels et al. (deep learning theory book)
Information theory of deep learning
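For the first item, here’s a minimal sketch of what poking at Heavy-Tailed Self-Regularization looks like in practice, assuming the `weightwatcher` package and a torchvision model (usage follows the project’s README; exact output columns vary by version):

```python
# Minimal sketch: per-layer heavy-tailed spectral diagnostics with weightwatcher.
# Assumes `pip install weightwatcher torch torchvision`.
import torchvision.models as models
import weightwatcher as ww

model = models.resnet18(weights=None)   # any torch (or keras) model works
watcher = ww.WeightWatcher(model=model)
details = watcher.analyze()             # power-law fits of each layer's eigenvalue spectrum
print(watcher.get_summary(details))     # e.g. mean power-law exponent ("alpha") across layers
```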
[Question] What ML gears do you like?
Paper: Understanding and Controlling a Maze-Solving Policy Network
ActAdd: Steering Language Models without Optimization
Open problems in activation engineering
I wasn’t in a flaming asshole mood; it was a deliberate choice. I think being mean is necessary to accurately communicate vibes & feelings here. I could serialize stuff as “I’m feeling XYZ and think this makes people feel ABC”, but that level of serialization won’t activate people’s mirror neurons & have them actually internalize anything.
Unsure if this worked; it definitely increased controversy & engagement, but that wasn’t my goal. The goal was to shock one or two people out of bad patterns.
Sorry, I was more criticizing a pattern I see in the community rather than you specifically.
However, basically everyone I know who takes innate intelligence as “real and important” is dumber for it. It is very liable to mode-collapse into fixed mindsets, and I’ve seen this happen a lot (imo) in the rat community.
(When trying to criticize a vibe / communicate a feeling, it’s more easily done with extreme language; serializing loses information. Sorry.)
EDIT: I think this comment was overly harsh; leaving it below for reference. The harsh tone came partly from being slightly burnt out from feeling like many people in EA were viewing me as their potential Ender Wiggin, and internalizing it.[1]
The people who suggest schemes like the one I’m criticizing are all great people who are genuinely trying to help, and likely are helping.
Sometimes being a child in the machine can be hard though, and while I think I was ~mature and emotionally robust enough to take the world on my shoulders, many others (including adults) aren’t.
An entire school system (or at least an entire network of universities, with university-level funding) focused on Sequences-style rationality in general and AI alignment in particular.
[...]
Genetic engineering, focused-training-from-a-young-age, or other extreme “talent development” setups.
Please stop being a fucking coward speculating on the internet about how child soldiers could solve your problems for you. Ender’s Game is fiction; it would not work in reality, and that isn’t even considering the negative effects on the kids. You aren’t smart enough for galaxy-brained plans like this to cause anything other than disaster.
In general, rationalists need to get over their fetish for innate intelligence and actually do something instead of making excuses all day. I’ve mingled with good alignment researchers; they aren’t supergeniuses, but they did actually try.
(This whole comment applies to Rationalists generally, not just the OP.)
[1] I should clarify this mostly wasn’t stuff the Atlas program contributed to. Most of the damage was done by my personality + heroic responsibility in rat fiction + the dark arts of rationality + the Death with Dignity post. Nor did Atlas staff do much to mitigate this; seeing myself as one of the best they could find was most of it, cementing the deep “no one will save you or those you love” feeling.
Excited to see what comes out of this. I do want to raise attention to this failure mode covered in the Sequences, however. I’d love for those who do the program to try to bind their results to reality in some way, ideally having a concrete result showing how they’re substantively stronger afterwards, and how this replicates with other participants who did the training.
Really nice post. One thing I’m curious about is this line:
This provides some intuitions about what sort of predictor you’d need to get a non-delusional agent—for instance, it should be possible if you simulate the agent’s entire boundary.
I don’t see the connection here? Haven’t read the paper though.
Quick thoughts on creating an anti-human chess engine.
Use maiachess to get a probability distribution over opponent moves based on their ELO. For extra credit, fine-tune on that specific player’s past games.
Compute an expectiminimax search over maia predictions. Bottom out with the Stockfish value when going deeper becomes impractical. (For an MVP, bottom out with Stockfish after a couple of ply; no need to be fancy.) Also note: we want to maximize P(win), not centipawn advantage. (See the sketch after this list.)
For extra credit, tune hyperparameters via self-play against maia (simulated human). Use lichess players as a validation set.
???
Profit.
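A minimal sketch of the first two steps, assuming python-chess and a local Stockfish binary. `maia_move_probs` stands in for a Maia wrapper you’d still have to write (hypothetical, not a real API), and the centipawn → P(win) squash is an uncalibrated placeholder:

```python
# Sketch of the expectimax-over-Maia idea above (not a polished engine).
# Assumes: `pip install python-chess`, a Stockfish binary on PATH, and some
# implementation of `maia_move_probs` backed by Maia weights (hypothetical here).
import math
import chess
import chess.engine


def maia_move_probs(board, elo):
    """Hypothetical: return {move: probability} for the opponent's reply,
    from a Maia model matched to `elo` (fine-tuned on their games if available)."""
    raise NotImplementedError


def win_prob(board, engine, color):
    """Rough P(win) for `color` from Stockfish centipawns via a logistic squash."""
    info = engine.analyse(board, chess.engine.Limit(depth=12))
    cp = info["score"].pov(color).score(mate_score=100_000)
    return 1 / (1 + math.exp(-cp / 400))


def expectimax(board, engine, color, elo, depth):
    """Our moves maximize P(win); opponent moves are averaged under Maia's distribution."""
    if board.is_game_over():
        result = board.result()  # "1-0", "0-1", or "1/2-1/2"
        if result == "1/2-1/2":
            return 0.5
        white_won = result == "1-0"
        return 1.0 if white_won == (color == chess.WHITE) else 0.0
    if depth == 0:
        return win_prob(board, engine, color)  # bottom out with Stockfish's eval
    if board.turn == color:
        # Our turn: pick the branch with the highest expected win probability.
        values = []
        for move in board.legal_moves:
            board.push(move)
            values.append(expectimax(board, engine, color, elo, depth - 1))
            board.pop()
        return max(values)
    # Opponent's turn: expectation over Maia's predicted (human-like) replies.
    total = 0.0
    for move, p in maia_move_probs(board, elo).items():
        board.push(move)
        total += p * expectimax(board, engine, color, elo, depth - 1)
        board.pop()
    return total


def best_move(board, engine, elo, depth=2):
    """Pick our move by maximizing expected P(win) against the Maia-modeled human."""
    color = board.turn
    scored = []
    for move in board.legal_moves:
        board.push(move)
        scored.append((expectimax(board, engine, color, elo, depth), move))
        board.pop()
    return max(scored, key=lambda t: t[0])[1]


# Usage sketch:
# engine = chess.engine.SimpleEngine.popen_uci("stockfish")
# move = best_move(chess.Board(), engine, elo=1500)
# engine.quit()
```

The extra-credit tuning step would then go over the search depth, the analyse limit, and the centipawn → P(win) mapping, using self-play against maia and lichess games as validation.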
This might actually be a case where a chess GM would outperform an AI: they can think psychologically, so they can deliberately pick traps and positions that they know I would have difficulty with.
Emphasis needed. I expect a GM to beat you down a rook every time, and down a queen most times.
Stockfish assumes you will make optimal moves in its planning and so plays defensively when down pieces, but an AI optimized to trick humans (i.e. allowing suboptimal play when humans are likely to make a mistake) would do far better. You could probably build this with maiachess; I recall seeing someone build something like this, though I can’t find the link right now.
Put another way, all the experiments you do are making a significant type error: Stockfish down a rook against a worse opponent does not play remotely like a GM down a rook. I would lose to a GM every time; I would beat Stockfish most times.
[ASoT] GPT2 Steering & The Tuned Lens
Haha, nice work! I’m impressed you got TransformerLens working on Colab; I underestimated how much CPU RAM they had. I would have shared a link to my notebook & Colab but figured it might be good to keep it under the radar so people could preregister predictions.
Maybe the knowledge that you’re hot on my heels will make me finish the LLAMAs post faster now ;)
EDIT: it’s also possible John felt fine emotionally, was fully aware of his emotional state, and was actually so good at not latching on to emotions that it was highly nontrivial to spot, or some combination. Leaving this comment in case it’s useful for others. I don’t like the tone though; I might’ve been very dissociated as a rationalist (and many are), but it’s not obvious from this alone whether John is.
As a meditator I pay a lot of attention, in high resolution, to what emotion I’m feeling and the causality between it and my thoughts and actions. I highly recommend this practice. What John describes in “plan predictor predicts failure” is something I notice several times a month & address. It’s 101 stuff when you’re orienting at it from the emotional angle, and there’s a variety of practices I can deploy (feeling emotions, jhanas, many hard-to-describe mental motions...) to get back to equilibrium and clear thinking & action. This has overall been a bigger update to my effectiveness than the Sequences, plausibly my rationality too (I can finally be unbiased instead of trying to correct for or pretend I’m not biased!).
Like, when I hear you say “your instinctive plan-evaluator may end up with a global negative bias”, I’m like: hm, why not just say “if you notice everything feels subtly heavier and like the world has metaphorically lost color” (how I notice it in myself; tbc, fully nonverbally)? Noticing through patterns of verbal thought also works, but it’s just less data to do metacognition over. You’re noticing correlations and inferring the territory (how you feel) instead of paying attention to how you feel directly (something which can be learned over time by directing attention towards noticing, not instantly).
I may write on this. Till then I highly recommend Joe Hudson’s work; it may require a small amount of woo tolerance, but only a small amount. He coached Sam Altman & other top execs on emotional clarity & fluidity. Extremely good. Requires some practice & willingness to embrace emotional intensity (sometimes locally painful), though.