AI Safety person currently working on multi-agent coordination problems.
Jonas Hallgren
Nor is this process about reality (as many delusional Buddhists seem to insist), but more like choosing to run a different OS on one's hardware.
(I kind of wanted to give some nuance on the reality part from the OS-swapping perspective. You're of course right that some overzealous people believe they've found god and similar, but I think there's more nuance here.)
If we instead take your OS-swap framing, I would say it is a bit like switching from Windows to Linux because you get less bloatware. To be more precise, one of the main parts of the swap is the loosening of the entrenchment of your existing priors. It's gonna take you a while to set up a good distro, but you will be less deluded as a consequence, and also closer to "reality" if reality is the ability to see what happens with the underlying bits in the system. As a consequence you can choose from more models and you start interpreting things more in real time, and thus you're closer to reality: what is happening now rather than the story of your last 5 years.
Finally, on the pain of the swap: there are also more gradual forms of this; you can try out Ubuntu (mindfulness, loving-kindness) before switching over. Seeing through your existing stories can happen in degrees; you don't have to become enlightened to enjoy the benefits?
Also, I think that terminology can lead to specific induced states as it primes your mind for certain things.
One of the annoying things with meditation is of course that there's only n=1 primary experience, which makes it hard to talk about. Yet from my perspective it seems like insight cycling, the dark night of the soul and the hell realms are things that can be related to a hyperstition or a specific way of practicing?
If you for example follow the Thai Forest tradition, Mahamudra or Dzogchen (potentially Advaita, though I'm less certain), it seems that insights along those lines are more a consequence of not having established a strong enough one-to-one correspondence with loving awareness before doing intense concentration meditation? (Experience has always been happening, yet the basis for that experience might be different.)
It is a bit like the difference between dissolving into a warm open bath or a warm embrace or hug of the world versus seeing through the world to an abyss where there is no ground. That groundlessness seems to be shaped by what is there to meet it and so I’m a bit worried about the temporal cycling language as it seems to predicate a path on what has no ground?
I don't really have a good solution here, as people do seem to be going through the sort of experiences that you're talking about, and it isn't like I haven't had depressive episodes after longer meditation experiences either. Yet I don't know if I would call it a dark night of the soul, for it implies a necessity of identification with the suffering, and that is not what is primary? Language is a prior for experience and so I would just use different language myself, but whatever.
Man, I'm noticing this is hard to put into words. Hopefully some of it made sense, and I appreciate the effort toward a more standardised cybernetic basis for talking about these things.
dissolution of desire. An altered trait where your brain’s reinforcement learning algorithm is no longer abstracted into desire-as-suffering.
Would you analogize this term to the insight into "dukkha"? I find an important thing here to be the equal taste of joy and sorrow from the perspective of dukkha, and so it might be worth emphasising? (Maybe I'm off with that though.)
Here's an extension of what you said in terms of dullness and sharpness within attention-based practices (partly to check that I understand):
Dullness = subcriticality, with cascades dying out well below the criticality line
Monkey mind = supercriticality, with cascades running away above the criticality line (activates for whatever shows up)
If we look at the 10 stages of TMI (the 9-stage Elephant Path), the progression goes something like: distracted mind → subcriticality (stages 2-3) → practices to increase the brain's cascading (stages 4-5) → practices for attention to calibrate around the criticality line (stages 6-10)
Also, this is why the tip to meet your meditation freshly wherever it is appearing is important, because it is a criticality-tuning process that is different for everyone?
(I very much like this way of thinking about this, nice!)
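To make the criticality picture concrete for myself, here's a toy branching-process sketch (entirely my own construction, not anything from your post; the spawning distribution and the numbers are arbitrary) where a "thought cascade" either dies out, hovers, or runs away depending on a single branching parameter:

```python
import numpy as np

rng = np.random.default_rng(0)

def cascade_size(branching_factor, max_total=10_000):
    """Toy branching process: each active 'thought' spawns
    Poisson(branching_factor) follow-up thoughts, until the cascade
    dies out or hits a cap."""
    active, total = 1, 1
    while active > 0 and total < max_total:
        active = rng.poisson(branching_factor, size=active).sum()
        total += active
    return total

for label, b in [("dullness (subcritical)", 0.7),
                 ("near criticality", 1.0),
                 ("monkey mind (supercritical)", 1.3)]:
    sizes = [cascade_size(b) for _ in range(200)]
    print(f"{label:28s} median cascade size: {int(np.median(sizes))}")
```

Subcritical cascades die almost immediately (dullness), supercritical ones hit the cap (grabbing whatever shows up), and near the critical line you get the heavy-tailed, responsive middle that the later stages seem to be tuning for.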
Intuition Pump: The AI Society
Based on a true map of the territory. (I really like this advice; a good exploration strat seems similar to the one about taking photographs: it is really just about taking a bunch of them and you'll learn what works over time.)
I really appreciated this post.
I didn't know that you had concepts for aliveness and boggling within the rationality sphere, as I find these two of my most precious states that I've been cultivating over the last couple of years, and they've always felt semi-orthogonal to more classic rationality (which I associate more with the betting, TDT and deep empiricism stuff).
Meditation seems to bring aliveness, boggling and focusing forth quite well, and I just really appreciate that they're things you place high value on, as I find them some of the best ways of getting out of pre-existing frames (which for me seems like one of the best ways of becoming more rational).
On character alignment for LLMs.
I would like to propose that we think of a John Rawls-style original position (https://en.wikipedia.org/wiki/Original_position) as one lens when looking at character prompting for LLMs. More specifically, I would want you to imagine that you're on a social network or similar, and that you're put into a world with mixtures of AI and human systems: how do you program the AIs in order to make the situation optimal? You're a random person among all of the people, which means that some AIs are aligned to you and some are not. Most likely, the majority of AIs will be run by larger corporations, since the number of AIs someone runs will be proportional to the power they have.
How would you prompt each LLM agent? What are their important characteristics? What happens if they’re thought of as “tool-aligned”?
If we're getting more internet-based over time, and AI systems are more human in that they can flawlessly pass the Turing test, I think veil-of-ignorance-style thinking becomes more and more applicable.
Think more of how you would design a society of LLMs, and what happens if the entire society of LLMs has this alignment rather than just the individual LLM.
This is a nice way to get around the problems raised in Andrew Critch's post on consciousness as well, since it is a lot less conflationary.
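To make the original-position intuition a bit more concrete, here's a rough Monte Carlo sketch of what I have in mind (a toy setup of my own: the power distribution, the two policies and all the utility numbers are made up purely for illustration):

```python
import random

random.seed(0)

# Toy assumption: power is heavy-tailed, and the number of AIs someone
# runs is proportional to their power.
N_PEOPLE = 1000
power = [random.paretovariate(1.5) for _ in range(N_PEOPLE)]

def expected_welfare(policy, trials=5000):
    """Average welfare of a randomly chosen person under a given AI policy."""
    total = 0.0
    for _ in range(trials):
        you = random.randrange(N_PEOPLE)
        # sample whose AI you interact with, weighted by power
        owner = random.choices(range(N_PEOPLE), weights=power)[0]
        if policy == "owner_aligned":
            # AI helps its owner; mildly costly for everyone else
            total += 1.0 if owner == you else -0.2
        else:  # "society_level"
            # prosocial character: smaller benefit to the owner,
            # small positive spillover to whoever it interacts with
            total += 0.6 if owner == you else 0.3
    return total / trials

for policy in ["owner_aligned", "society_level"]:
    print(policy, round(expected_welfare(policy), 3))
```

The point is just that once you condition on being a random person rather than the owner, prosocial character prompts start to dominate narrowly owner-aligned ones.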
Adults will pre-mortem plan by assuming most of their plans will fail. They will therefore have a dozen layers of backups and action plans prepared in advance. This is also so that other people can feel safe, because someone has already planned for this. ("Yeah, I knew this would happen" type of vibe.)
The question would then be “How will my first 10 layers of plans go wrong and how can I take this into account?”
A quick example of this might look like this:
Race dynamics will happen, therefore we need controls and coordination (90%)
Coordination will break down therefore we need better systems (90%)
We will find these better systems yet they will not be implemented by default (85%)
We will therefore have to develop new coordination systems yet they can’t be too big since they won’t be implemented (80%)
We should therefore work on developing proven coordination systems where they empirically work and then scale them
(Other plan maybe around tracking compute or other governance measures)
And I’ve now coincidentally arrived at the same place as Audrey Tang...
But we need more layers than this, because systems will fail in various other ways as well!
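For what it's worth, a quick back-of-the-envelope on how confidence decays across those layers, taking the percentages above at face value (a simplification, since the steps aren't independent):

```python
# Confidence in each successive layer of the plan, from the list above
layers = [
    ("race dynamics happen, need controls/coordination", 0.90),
    ("coordination breaks down, need better systems", 0.90),
    ("better systems exist but aren't implemented by default", 0.85),
    ("new systems must stay small enough to be adopted", 0.80),
]

cumulative = 1.0
for name, p in layers:
    cumulative *= p
    print(f"{name}: {p:.0%} -> chain holds with ~{cumulative:.0%}")
# ~55% that the whole chain of reasoning holds, hence the need for more layers.
```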
(Warning: relatively hot take and also not medical advice.)
Firstly, there's a major underlying confounder here, which is the untracked severity of insomnia and its correlation with the prescription of melatonin. If these are strongly coupled, it could account for most of the effect?
Secondly, here's a model and a tip for melatonin use, as most US over-the-counter pills I've seen are around 10mg, which is way too much. I'll start with the basic mental model for why, and then say the underlying thing we see.
TL;DR:
If you want to be on the safe side, don't take more than 3mg of it per night. (You're probably gonna be fine anyway due to the confounding effect of long-term insomnia correlating with long-term melatonin use, but who knows what that trend actually looks like.)
Model:
There's a model for sleep which I quite like from Uri Alon (I think it's from him, at least), and it mainly treats the circadian rhythm and sleep mechanism as a base-layer signal for your other bodily systems to sync to.
The reasoning goes a bit like: which is the most stable cycle that we could stabilise to? Well, we have a day rhythm that is coupled to the 24-hour day, every day, which is very stable compared to most other signals. That's the circadian rhythm, which is maintained by your system's sleep.
What sleep does is act as a reset period for the biological version of this, as it sends out a bunch of low-range stable signals that are saying "hey, gather around, let's coordinate, the time is approximately night time." These brain signals don't happen outside of sleep, and so they are easy to check.
Melatonin is one of the main molecules for regulating this pattern, and you actually don't need more than 0.3 mg (remember bioavailability here) to double the amount you already have in the body. Most over-the-counter medicine is around 10mg, which is way too much. Imagine that you take one of the baseline signals of your base regulatory system and just bloop it the fuck out of the stratosphere for a concentrated period once every day. The half-life of melatonin is also something like 25-50 minutes, so it decays pretty quickly as well, which means that the curve ends up looking like the following:
If you don’t do this then your more natural curve looks something more like this:
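In code, the shapes I have in mind look roughly like this (illustrative only: I'm assuming simple exponential decay with a ~45 minute half-life for the pill, a made-up ~20% bioavailability, and a hand-drawn smooth bump for the endogenous curve; none of this is fitted to real pharmacokinetic data):

```python
import numpy as np
import matplotlib.pyplot as plt

t = np.linspace(0, 8, 500)      # hours into the night
half_life_h = 45 / 60           # assumed midpoint of the 25-50 minute half-life

# 10 mg oral dose: big spike that decays fast (arbitrary units, ignoring absorption lag)
spike = 10 * 0.2 * np.exp(-np.log(2) * t / half_life_h)   # assumes ~20% bioavailability

# Stylised endogenous curve: a broad, low bump across the night (made-up shape)
endogenous = 0.3 * np.exp(-((t - 3.5) ** 2) / (2 * 2.0 ** 2))

plt.plot(t, spike, label="10 mg pill (fast decay)")
plt.plot(t, endogenous, label="stylised endogenous melatonin")
plt.xlabel("hours into the night")
plt.ylabel("melatonin (arbitrary units)")
plt.legend()
plt.show()
```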
Section 3 of the following talks about a desensitisation of the MT2 melatonin receptor as something that happens quite quickly: https://www.sciencedirect.com/topics/neuroscience/melatonin-receptor
So if you stratosphere your melatonin with 10 mg, then your REM sleep will be fine, but the sensitivity of your MT2 receptor will be a bit fucked, which means worse deep sleep (which is important for your cardiovascular system!). Hence you will fall asleep, but with worse deep sleep. (Haven't checked if this is true, but it could probably be checked pretty easily experimentally.)
The bioavailability of melatonin varies between 10 and 30%, so if you aim to approximate your own endogenous production of melatonin you should take 3 to 10x the amount already in the system. For my own optimal sleep that is 0.5 mg of melatonin, but that's because my own system already works pretty well and I just need a smaller nudge. The area-under-the-curve part of the model is also a good reason to take slow-release melatonin, as it better approximates your normal melatonin curve.
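The dose arithmetic, spelled out (same assumptions as above: you want to roughly match the ~0.3 mg already in the system, and bioavailability sits somewhere between 10 and 30%):

```python
target_systemic_mg = 0.3          # roughly the endogenous amount you want to match
bioavailability = (0.10, 0.30)    # reported range

low_dose = target_systemic_mg / bioavailability[1]   # if 30% gets absorbed
high_dose = target_systemic_mg / bioavailability[0]  # if only 10% gets absorbed
print(f"oral dose to roughly match endogenous levels: {low_dose:.1f}-{high_dose:.1f} mg")
# -> about 1-3 mg, i.e. an order of magnitude below the typical 10 mg pill
```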
(I need to get back to work lol but hopefully this helps a bit)
Also, what an amazing post. You’ve expressed something that I’ve wanted to express for a while on a level of depth that I wouldn’t have been able to do and I got literal chills when reading the last part. Well done.
Theoretical physics could produce a recipe for ruin if it turns out that there’s a way to crash the universe through something like a buffer overflow. I’ve seen enough arbitrary code execution glitches to intuitively understand that there’s nothing in principle that says we can’t find a way to bug out some underlying level of abstraction in our physics and break everything. Interestingly enough a bug of this type would complicate interstellar expansion because it would mean that the potential value drift as you exit communication range with other parts of your civilization could lead to a series of events that destroy everything for everyone. Knowledge of such a bug could therefore be one explanation for the Fermi Paradox.
On the 4D Chess takes section: I think if you combine that idea with this idea of an ever-evolving cosmology where we have selection for the proliferation of intelligent life, then it does make sense that we would also have some unintentional bugs come up: https://www.youtube.com/watch?v=7OCY9ppY34Q&lc=UgxFQvrJXcgA6U1Xvf54AaABAg
(The TL;DR of the video is that our cosmos could quite nicely be explained through evolution via replicators that are basically black holes. It's quite cool and from this year's ILIAD.)
I also just want to point out that there should be a higher-context base rate in the beginning, since before MATS and similar there weren't really that many AI safety training programs.
So the initial people that you get will automatically be higher context, because the sample is taken from people who have already worked on it/learnt about it for a while. This should go down over time as the higher-context individuals get taken in?
(I don’t know how large this effect would be but I would just want to point it out.)
I liked this book too, and I just wanted to share a graphic that was implied in the book, relating guidance and expertise. It's a pretty obvious idea, but for me it was just one of those things you don't think about: the lower context someone has, the more guidance they need, and vice versa (the trend is not necessarily linear though):
Okay (if possible), I want you to imagine I'm an AI system or similar, and that you can give me resources in the context window that increase the probability of me making progress on problems you care about in the next 5 years. Do you have a reading list or similar for this sort of thing? (It seems hard to specify, and so it might be easier to mention what resources can bring the ideas forth. I also recognize that this might be one of those applied-knowledge things rather than a set-of-knowledge thing.)
Also, if we take the cryptography lens seriously here, an implication might be that I should learn the existing off-the-shelf solutions in order to "not invent my own". I do believe that there is no such thing as being truly agnostic to a meta-philosophy, since you're somehow implicitly projecting your own biases onto the world.
I'm gonna make this personally applicable to myself, as that feels like more skin in the game and less like a general exercise.
There are a couple of contexts to draw from here:
Traditional philosophy (I've read the following):
A History of Western Philosophy
(Plato, Aristotle, Hume, Foucault, Spinoza, Russell, John Rawls, Dennett, Chalmers and a bunch of other non-continental philosophers)
Eastern philosophy (I've read the following):
Buddhism, Daoism, Confucianism (mainly Tibetan Buddhism here)
Modern, more AI-related philosophy (I've read the following):
Yudkowsky, Bostrom
(Not at first glance philosophy, but): Michael Levin (diverse intelligence), some category theory (composition)
Which one is the one to double down on? How do they relate to learning more about meta-ethics? Where am I missing things within my philosophy education?
(I’m not sure this is a productive road to go down but I would love to learn more about how to learn more about this.)
Okay, this is quite interesting. I’ll try to parse this by mentioning a potential instantiation of this and maybe you could let me know if I got it wrong or right/somewhere in between?
The scenario is that I'm trying to figure out what I should do when I wake up in the morning and I'm on a walk. What I do is that I then listen to the world in some way; I try to figure out some good way to take actions. One way to do this is somatosensory experiencing: I listen in to my sub-parts. Yet a problem with this is that there's a degree of passivity here. Yes, my stomach is saying go away and hide, and my shoulders are saying that I'm carrying weight, whilst maybe my face is smiling and curious. This listening has some sort of lack of integration within it? I now know this, but that doesn't mean that my sub-parts have had a good conversation.
We can extend this even further: why is the best basis something like emotions? Why can't we sense things like a degree of extended cognition within our social circles and with the different things that we do?
The practice is then to somehow figure out how to listen, do good bargaining, and come up with good solutions for the combined agency that you have, whatever that might be? And the more extended and open you can make that cognition, the better it is?
Yet you shouldn't fully identify with your social network or the world, nor should you identify with nothing; you should identify with something in between (non-duality from Buddhism?). You should try to find the most (causally?) relevant actor to identify with, and this is situation-dependent and an art?
So that is the process to engage in and the answer is different for each person? (Let me know if I’m off the mark or if this is kind of what you mean)
n=1 evidence, but I thought I would share from a random external perspective: someone who enjoys doing some writing and who's in the potential audience for a future version (if run) of Inkhaven.
For this version, I had no clue how it would be, and so I thought it was too high-risk to gamble on it being good. Given what I've seen of the setup this year, I would basically be a guaranteed sign-up next year if it cost less than 15-20% of my cash reserve, potentially upwards of 30-40% (reference class: currently doing more or less independent work and getting by on student money from the Swedish state).
(For some reason I feel a bit weird about making this comment but I also want to practice sending unfinished comments that are more random feedback as it is often useful for the individuals involved given the assumption that my perspective is somewhat indicative of a larger audience.)
Could you elaborate here?
Is there a specific example of the difference between just somatic sensing and having it be intuitively reflected through your thoughts? I feel like you're saying something important, but I'm not fully sure how it manifests more concretely, and I might want to work on this skill.
I do feel like there's something to be said for a more integrated emotion system not being as somatic but being more implicit in the system, like your thoughts and feelings being more one-pointed, which is kind of where my experience has shifted over time. I don't know if this is what you mean?
TL;DR
I guess the question I’m trying to ask is: What do you think the role of simulation and computation is for this field?
Longer:
Okay, this might be a stupid thought, but could one consider MARL environments, for example https://github.com/metta-AI/metta (softmax), to be a sort of generator of these sorts of reward functions?
Something something: it is easier to program constraints into the reward function and have gradient descent discover the rest than it is to fully specify it from scratch.
I think there's mainly a lot of theory work that's needed here, but there might be something to be said for having a simulation part as well, where you do some sort of combinatorial search for good reward functions?
(Yes, the thought that it will solve itself if we just bring it into a cooperative or similar MARL scenario and then do IRL on that is naive, but I think it might be an interesting strategy if we think about it as a combinatorial search problem that needs to satisfy certain requirements?)
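To gesture at what I mean by combinatorial search under constraints, here's a very loose sketch (the feature set, the constraints and the "cooperation score" are all made-up placeholders; nothing here is tied to the actual metta codebase):

```python
import itertools
import random

random.seed(0)

# Hypothetical: a reward function is a weighting over a few hand-picked features
FEATURES = ["own_score", "teammate_score", "resources_shared", "opponent_harm"]

def satisfies_constraints(weights):
    """Constraints we program in by hand; the search only explores the rest."""
    w = dict(zip(FEATURES, weights))
    return w["opponent_harm"] <= 0 and w["teammate_score"] >= 0

def simulated_cooperation_score(weights):
    """Stand-in for actually running the MARL environment and measuring
    cooperative behaviour; here just a noisy made-up proxy."""
    w = dict(zip(FEATURES, weights))
    return (w["teammate_score"] + w["resources_shared"]
            - abs(w["opponent_harm"]) + random.gauss(0, 0.1))

candidates = [w for w in itertools.product([-1, 0, 0.5, 1], repeat=len(FEATURES))
              if satisfies_constraints(w)]
best = max(candidates, key=simulated_cooperation_score)
print(dict(zip(FEATURES, best)))
```

The actual environment rollout (and the IRL step) would of course replace the made-up proxy score; this is only meant to show the "constrain, then search" shape of the idea.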