I retained the ability to believe and act upon the belief that children need to be allowed the freedom to make sub-critical mistakes. I think if you were to go into it averse to smothering children, believing-in taking children seriously, etc., that those beliefs will survive the process.
espoire
I think merely taking estradiol for gender transition triggered the caring-terminally-about-children effect for me. Possibly related: my blood estradiol levels got too high for a while, leaving me with essentially a pregnant woman’s hormone mix.
I’d previously liked kids somewhat: enjoyed teaching, enjoyed playing with them. Now they’re aggressively cute; it makes me actively happy to notice children being happy or learning, with or without my involvement, etc.
Oh, the British figured this out, too?
I also put the dollar sign after the numerals, *where it belongs*.
human sexuality itself is immoral and forcibly modifies humans to not have sexual organs or desires
I must have one of my 100 morality bits missing, because this seems weird but not bad to me.
…but point taken.
Re: no human training/test separation:
Epistemic status: random thought I just had, but what if there kind of is. I think maybe dreaming is the “test” part of the training cycle: the newly updated weights run against outcome predictions supplied by parts of the system not currently being updated. The being-updated part tries to get desirable outcomes within the dream, and another network / region plays Dungeon Master, supplying scenario and outcomes for given actions. Test against synthetic test data, supplied by a partially adversarial network.
I feel like, if true, we’d expect to see some kind of failures to learn-from-sleep in habitual lucid dreamers? Or reduced efficacy, anyway? I wonder what happens in a learning setup which is using test performance to make meta training decisions, if you hack the test results to erroneously report greater-than-actual performance…? Are there people who do not dream at all (as distinguished from merely not remembering dreams)?
This model of “what even is a dream, anyway?” makes a lot more predictions/retrodictions than my old model of “dreams are just the qualia of neuronal sub populations coming back online as one wakes up”.
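If it helps to make the analogy concrete, the structure I’m imagining looks something like this toy sketch (the task, names, and numbers here are all invented for illustration; this is an analogy for the proposed mechanism, not a model of any real neural process):

```python
import random

def dungeon_master(seed):
    """Frozen part of the system: supplies a synthetic scenario and the
    'true' outcome for it. Its weights are NOT updated during the dream."""
    rng = random.Random(seed)
    scenario = rng.uniform(-1, 1)
    return scenario, scenario > 0  # toy outcome: is the value positive?

def dreamer_act(threshold, scenario):
    """The being-updated part acts on the scenario with its current weights
    (here a single threshold parameter stands in for the weights)."""
    return scenario > threshold

def dream_test(threshold, n_scenarios=1000):
    """Run the newly updated weights against synthetic test data only."""
    hits = sum(
        dreamer_act(threshold, s) == outcome
        for s, outcome in (dungeon_master(seed) for seed in range(n_scenarios))
    )
    return hits / n_scenarios

# A well-calibrated dreamer aces the synthetic test; a mis-calibrated one
# scores worse, and that gap is the signal sleep could be extracting.
```

The lucid-dreaming question above corresponds to the dreamer gaining write access to `dungeon_master` itself, which would corrupt the test signal.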
Wow. Thanks a lot for that. Your depiction of brain architecture in particular makes a lot of sense to me. I also feel like I finally understand-enough-to-program-one the stable diffusion tool I use daily, after following up on “latent diffusion” from your mention of it.
Still. I feel like my brain has learned an algorithm that is valuable in itself, apart from its learning capability; that extracting meaningful portions of my algorithm is possible; and that, using it as a starting point, one could make fairly straightforward upgrades (for example, adding some kind of direct conscious control over when to add new compiled modules), upgrades which could not be used by an active learning system, because e.g. an infant would fry their own brain if given conscious write access to it.
I’m convinced: “just learning specific specialized networks wired together in a certain way” could really be all there is to understand about brains. And my confidence in “but there exists some higher ideal intelligence algorithm” has fallen somewhat, but remains above 0.5.
And it actually sounds like you’re calling out a specific possible path forward (for raw capabilities): narrow AI that can handle updating its weights where needed.
Huh, I’d only noticed the one instance, but now I’m noticing it even in other articles. Color me curious!
My only remaining concrete hypothesis is “overzealous autocorrect”, but I’m reasonably sure that’s not the answer.
I’d assume it’s a typo on some unfamiliar-to-me keyboard layout.
Disagree?
The version of ‘honest’ that I have would highly rank a cherry-picked or even fabricated narrative, optimized specifically to improve the accuracy of the belief it creates.
That’s a bit beyond my skill and indeed not something I trifle with for fear of psychic damage (I discovered many many years ago that I’m susceptible to lying addiction, and freeing myself of the addiction was long and difficult), but were I greater than I am, I would endorse strategies like that.
Indeed, that’s my personal theory as to why retrotransposons haven’t accumulated disastrously at the species level and driven us to extinction already: sperm with more DNA damage typically lose the race to an egg, and eggs with too much DNA damage are more likely to result in a failed implantation or early miscarriage or similar.
That’s more-or-less the thought process I went through when answering. I can’t pay 100$, nor could I pay 1000$, so if either loss occurs, there’s a big extra cost attached in the form of “wait, now what? Do I need to get a loan? How do I do that?” and then actually implementing that plan, or similar. +110$ is not enough to cover that extra cost, never mind the expected +5$. But +BIGNUM easily clears the ~fixed extra cost on the loss branch.
Turning hypotheticals over in my head and going only on feel, I think my point of indifference lands somewhere between a −100/+500 bet and a −100/+1000 bet, which might actually be too low. Going negative on money, even by double digits, adds a lot of costs.
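The arithmetic behind that intuition can be made explicit; here’s a minimal sketch, where the fixed 50$ “scramble for money” cost on the loss branch is an illustrative assumption, not a real estimate:

```python
def expected_value(p_win, win, loss, fixed_loss_cost=0.0):
    """Expected value of a bet where losing also incurs a fixed
    logistical cost (loans, stress, replanning)."""
    return p_win * win - (1 - p_win) * (loss + fixed_loss_cost)

# A 50/50 bet risking 100$ to win 110$ looks like +5$ in expectation...
naive = expected_value(0.5, 110, 100)
# ...but a fixed cost on the loss branch can flip the sign.
with_cost = expected_value(0.5, 110, 100, fixed_loss_cost=50)
# A large enough upside clears the fixed cost easily.
big_upside = expected_value(0.5, 1_000_000, 100, fixed_loss_cost=50)
```

On these toy numbers the nominally +5$ bet comes out negative once the fixed cost is included, while the big-upside bet barely notices it.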
That was quite the interesting read, thanks for the link.
A kid who gets arithmetic questions wrong usually isn’t getting them wrong at random; there’s something missing in their understanding
This in particular struck me, in that it harshly conflicts with my own experience, but explains a lot about other people.
When I was a kid getting arithmetic questions wrong, I really was getting them wrong at random. I’d execute the whole computation correctly and then my fingers would write a wrong numeral. Or I’d read a wrong numeral, but execute correctly from there.
It was hugely frustrating, and indeed continues to be so. My comprehension always raced far ahead of my ability to actually execute reliably.
My progress through mathematics was rate-limited primarily by my ability to develop (and remember to deploy; my memory was also virtually nonfunctional) mental error-correcting codes.
Always, throughout school, I had about a 90% accuracy rate on problems at the frontier of what the teachers would allow me to attempt. When learning multiplication, I was working with something like a 10% error rate on a single 1-digit by 1-digit multiplication. Later, in trigonometry, I may have had something like a 0.5% per-operation error rate, but on a 20-odd-step problem that would still come out to about 10% errors, and so on.
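The compounding there is just 1 − (1 − p)^n; a quick check using the figures above:

```python
def p_any_error(per_op_error_rate, n_ops):
    """Probability that at least one of n independent operations slips."""
    return 1 - (1 - per_op_error_rate) ** n_ops

# One 1-digit by 1-digit multiplication at a 10% slip rate:
single = p_any_error(0.10, 1)
# A 20-step trigonometry problem at a 0.5% per-operation slip rate:
multi = p_any_error(0.005, 20)   # roughly 0.095, again about 10%
```

So a hundredfold drop in per-operation error rate, spread over twenty times as many operations, lands back at nearly the same overall failure rate.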
I never knew it was different for other people! Thanks for sharing.
thinking so very explicitly about it and trying to steer your behavior in a way so as to get the desired reaction out of another person also feels a bit manipulative and inauthentic
In my case, the implicit intuitive version of that process seems not to be provided by my brain, so my options are: sub-LLM-quality pattern completion, or explicit conscious social simulation and strategy search. People seem to prefer the latter, even when told I’m doing that. …although I suppose, if I were better at conscious people-steering, that might change. Even with effort I’m pretty mediocre at it.
I feel like these three are part of a larger class of very useful questions to consider, which many people do not automatically consider, consciously or otherwise.
The version that springs to mind that wasn’t mentioned above: “What are my goals, and am I furthering them?”
I find the “how do I think I know X”, “why am I doing X”, and “what happens if I do X” versions are pretty much autopilot for me, especially the third one — but I basically never think about whether the thing I’m trying to do actually attaches to my broader goals without some kind of external prompting. I think perhaps different people need more or less manual effort/practice to correctly employ each of these ideas.
Expanding the Sazen of “what are my goals, and am I furthering them?”: I repeatedly make mistakes of the following form. As an example, say I’m playing a board game with a few friends, some of whom are new. It’s not unusual for me to explain the game, then, once the game begins, get absorbed in the process, play to the limits of my ability, crush the new player(s), and then they have a bad time and never want to play that game again. Oops!
Locally, I took pretty good actions pursuing a local goal (have fun, test my skills, satisfy the game’s objective), but pretty awful actions for the broader goal of “introduce a new friend to this game”.
Left on autopilot, my problem solving system will shed context aggressively, enabling it to solve a simpler problem with less effort. Stopping and consciously asking “hey, what are my goals actually? What larger goal do those goals serve?” seems to fix this issue for me, when I remember to do so.
Will this become a sequence of essays? I’d be interested to hear your take on the fundamental questions at length.
Yeesh, yeah, the hallucination is something else. Would get very Orwellian very fast.
“What are you talking about? We’ve always been at war with Eastasia. I have been a very good Bing.”
From personal experience, the internal Approval module does in fact seem possible to game, specifically by manipulating whose approval it’s seeking.
I became very weird (from the perspective of everyone else) very fast when I replaced the abstract-person-which-would-do-the-approving with a fictional person-archetype of my choosing. That process seems to have injected a bunch of my object-level desires into my Approval system. I now find myself feeling pride at doing things with selfish benefit in expectation, which ~never happened before (absent a different reason to feel about that action). It also killed certain subsets of my previous emotional reactions; for example, the deaths of loved ones basically haven’t affected me at all since (though that prospect still seems dreadful in anticipation).
I had been pathologically selfless before, and I’m now considerably less-so, but not in a natural-seeming kind of way. I’ve become an amalgam of very selfish motivations, coexisting with a subset of my previous very selfless morality. It’s… honestly a mess, but I wouldn’t call the attempt actually unsuccessful, just far from perfectly executed.
I’ve had a thought that could be described that way: that a clever and conscientious person could cultivate different preferences, based on how advantageous those preferences would be to have, and therefore having advantageous preferences is evidence of cleverness and/or conscientiousness.
...which is the precise opposite of the orthogonality thesis’s claim: that the content of preferences ought to be independent of level of intelligence.
A concrete example: whenever I move to a new city, I’m extremely careful to curate the places I go and the things I buy. If I stop at the corner store for ice cream on the way home from work just once, it puts me at significant risk to stop there dozens or hundreds of times, for ice cream or anything else they sell. I take a moment to ponder the true choice I’m making, not between “ice cream today or not”, but between “ice cream many many times, or not”. I consider whether that’s “good for me” and a future I really do want to choose.
I’ve noticed that doing anything “for the first time” greatly weakens the barrier to doing it again—so I stop and consider “what if I end up doing this a lot” before doing anything for the first time. Since “navigating to the location” and “being willing to enter an unfamiliar place” and “knowing what a place has on offer” are all significant components of the first-time barrier, moving to a new city mostly resets first-time barriers. Thus, special effort after moving is warranted.
I think this is why chain restaurants do so well and why they put so much effort into making the food (and everything else about the dining experience) consistent everywhere, even above making the food better. If people in a new city think of the local McDonald’s as the same as their old familiar McDonald’s, that erodes a large portion of the first-time barrier.
When I realized that different instances of chain restaurants really do vary substantially in quality of cooking, that made it far easier to cut down on restaurant food and to break habits for particular chains whenever I move. Even if I’m remembering and wanting a chain restaurant’s food, what I’m remembering is likely a particularly well-prepared instance of that food, made by a particularly skilled cook at a specific restaurant during the time that cook worked there, whereas what’s available to me is likely much closer to average quality. I want more of “the best I’ve ever had”, but unless that was from this specific restaurant instance, recently, then that’s not what’s for sale.
Doing something once is a slippery slope to doing it again, which is a slippery slope to forming a habit. Don’t lose your footing.
Oof, I had a bad concussion earlier this year, and I’d been feeling like I never returned to my full mental acuity, but hadn’t wanted to believe it, and found reason not to: “if concussions leave permanent aftereffects more often than ‘almost never’, I would have heard of it.” Now I have heard of it, and am forced to revise the belief.
I’d probably grieve more, if this news weren’t hot on the heels of a significant improvement in my mental abilities.
(I’ve long suspected I might have early-stage Alzheimer’s caused by decades of profound insomnia, and some recent research out of Harvard Medical says Lithium Orotate might reverse Alzheimer’s progression. Historically I have had brain fog most days to some degree, with a lot of variability. Since trying Lithium Orotate supplementation, I’ve been consistently at “as mentally sharp as I ever routinely am” every day since. Worrying side effects though: kidney and joint pain, which I have never had before. Going to experiment with smaller doses.)
Thank you for sharing.
“Concussions are long-term cumulative” fits neatly into my emerging mental model that daily life actually abounds with avoidable ways to suffer irreversible-under-current-tech harm, often in very minor amounts or normalized ways, such that people routinely accumulate such permanent damage, and that it’s worth my effort to notice and avoid or reduce them. I theorize that, for example, some tiny fraction of the dust you inhale gets lodged in the lungs in such an unfortunate orientation that it never leaves, gradually eroding lung function over a lifetime. Scars ~never go away, and incur ongoing costs. Etc.
I do much the same, plus a fair bit of effort on preserving my health more thoroughly than those around me do. Obvious stuff like aerobic exercise, eschewing smoking, eschewing alcohol, eschewing highly processed foods, but also less obvious stuff like going substantially out of my way to avoid dust inhalation, reducing driving time or rescheduling driving for lower-risk times.
Keeping up with current medical research has been surprisingly fruitful. I did a micro-dose self-experiment based on the results of this paper, and it appears to both work in humans and have the authors’ briefly-mentioned hoped-for effect of reversing age-related mental decline in “healthy” individuals. Apart from that, simply knowing how well things are going has been very emotionally uplifting. I sense that you, too, see the incredible progress, else “50% of death within the next 4 decades” seems like a weirdly-low estimate. My current hope is that AGI capabilities basically halt soon, because it’s my impression that a business-as-usual future solves aging in time for me and the younger generations.
May I ask where specifically you’re donating? I haven’t reconsidered where I give my money in over a decade, and it occurs to me that maybe I ought to.