I suspect that you may have it worse than average person, because you are intellectual worker and intellectual achievements tend to be unsatisfying, in a sense that when you discover something, you feel “eureka” moment and then you feel like you knew it forever. It’s even worse with intellectual process, because, in my experience, you forget all fruitless branches and backtracking of thought process almost like it was a dream.
quetzal_rainbow
Okay, I have instrumentally valuable Green strategies from personal experience.
When I was a little kid, my father taught me to swim. Being very anxious child, I panicked and tried to stay afloat, which resulted in me getting nose full of water, while the correct strategy was to relax and trust water to carry you because human body is floating object.
Another example is recent. I tried to understand textbook on model theory—like, I reread chapter multiple times and interrogated myself with сomprehension questions. I gave up, went to sleep and when I picked up textbook next day, my mind fluently arragened words into meaningful sentences.
(“Sleep on” something is an ultimate Green strategy, because it requires your passivity.)
Curiosity is instrumentally beneficial, but it can be directed towards different sorts of objects. Non-green curiosity is curiosity directed towards general objects. Thing like physics, math, evolutionary theory, game theory—things that are relevant everywhere. I’m bioinformatician and I love “general” part of biology, like evolution or main molecular pathways (replication, translation, electron transport chain, this sort of stuff). I struggled with less general parts of molecular biology, because any particular area of molecular biology for outsider looks like “lots of combinations of letters organized into complicated graphs”. I came to appreciation of “particular” part of biology through studying cybersecurity and realizing that vulnerabilities come from idiosyncratic properties of system, therefore, to discover them, you need curiosity about this particular system. This is also the way I learned to love immunology and oncology.
Good example of “Green curiosity” is demonstrated in novel “Annihilation” from perspective of biologist:
A species of mussels found nowhere else lived in those tidal pools, in a symbiotic relationship with a fish called a gartner, after its discoverer. Several species of marine snails and sea anemones lurked there, too, and a tough little squid I nicknamed Saint Pugnacious, eschewing its scientific name, because the danger music of its white-flashing luminescence made its mantle look like a pope’s hat.
I could easily lose hours there, observing the hidden life of tidal pool, and sometimes I marveled at the fact that I had been given such a gift: not just to lose myself in the present moment so utterly but also to have such solitude, which was all I had ever craved during my studies.
Even then, though, during the drives back, I was grieving the anticipated end of this happiness. Because I knew it had to end eventually. The research grant was only for two years, and who really would care about mussels longer than that…
Green curiosity (Duncan Sabien would say “Green-Black”) is helpful for what Gwern called unseeing:
...the opposite of a mathematician: a mathematician tries to ‘see through’ a complex system’s accidental complexity up to a simpler more-abstract more-true version which can be understood & manipulated—but for the hacker, all complexity is essential, and they are instead trying to unsee the simple abstract system down to the more-complex less-abstract (but also more true) version. (A mathematician might try to transform a program up into successively more abstract representations to eventually show it is trivially correct; a hacker would prefer to compile a program down into its most concrete representation to brute force all execution paths & find an exploit trivially proving it incorrect.)
Learning principles according to which evolutionary algorithm works is helped by non-green curiosity, but understanding particular weird product of evolutionary algorithm (even knowing that it is not optimal, even knowing that this understanding won’t benefit your generalized knowledge) requires Green curiosity.
Finally, I think Green curiosity is important as complement for terminal value of knowledge. I think after Singularity, if we survive, we are going to quickly discover all “general” stuff, like physical theory of everything, theory of builing multilevel world models, optimal nanotechnology for our physics, etc. I don’t think that at this point we will need to declare project of knowing thing finished, we will just need to learn to love studying particular cases.
Another Green epistemic strategies were briefly mentioned by Carlsmith himself, like careful and patient observation of reality without pushing your frames on it, or letting reality destroy your beliefs.
I think you’ve missed some Green strategies because you grew up in modern world, where good parts of environmentalism and laissez-faire are common sense and bad parts are widely known. But High Modernism gave as a lot of examples of catastrophies which happen when you are not Green enough, like central planning (Hayekian interpretation of markets is Green) or Chinese “Eliminate Sparrows Campaign”. Attitude “this forest is useless, we should cut it down and put here golf club” until recently was not trope for cartoon villians, but position of Very Serious People deciding in which direction world should develop.
Another angle is that Green is probably the youngest of attitudes of such type in broad memetic environment. In the past, on the one hand, people were in constant fight with environment, so if you were too Green, you died or were otherwise very miserable. On the other hand, you weren’t able to do much against environment, so you needed to lean on it as much as possible and if you tried to do something stupidly agentic like drinking potions of immortality made of mercury you died quickly too. You need very specific distance from Nature made out of industrial urban civilization to recognize necessity to relate to it somehow.
I agree with you that Deep Atheism is correct, but the problem is not with Deep Atheism, it’s with people who believe in it. It’s very easy to start considering the cold harsh world your enemy, while the world is merely neutral, and if the world is not your enemy, it can spontaneously organize into beneficial structures without your intervention and sometimes you should actively non-intervene to let this happen.
Of course, none of this is obligatory. Maybe you are just a sort of adequate agentic person which doesn’t require Green to learn everything mentioned.
Decomposing policy into values and value-neutral part seems to be hard and I suspect actual solution requires reflection.
It looks like the opposite? If 91% of progress happened due to two changes in scaling regimes, who knows what is going to happen after third.
I think you can get high-quality 100k smartphone which will be relevant for a long time, if you make it customized specificially for you. The problem is that customization is a labor from the side of the buyer and many people lack taste to perform such labor.
Chaitin’s Number of Wisdom. Knowledge looks like noise from outside.
I express this by saying “sufficiently advanced probabilistic reasoning is indistinguishable from prophetic intuition”.
I believe LW calls it “ethical injunctions”.
You start to appreciate all the points about “predicting technological progress is very hard actually” from “There is no fire alarm for AGI” when you realize that “Attention Is All You Need” was published several months earlier.
Another thing is narrow self-concept.
In original thread, people often write about things they have and their clone would want, like family. They fail to think about things they don’t have due to having families, like cocaine orgies, or volunteering to war for just cause, or monastic life in search of enlightenment, so they could flip a coin and go pursue alternative life in 50% of cases. I suspect it’s because thinking about desirable things you won’t have on the best available course of your life is very sour-grapes-flavored.
When I say “human values” without reference I mean “type of things that human-like mind can want and their extrapolations”. Like, blind from birth person can want their vision restored, even if they have sufficiently accommodating environment and other ways to orient, like echolocation. Able-bodied human can notice this and extrapolate this into possible new modalities of perception. You can be not vengeful person, but concept of revenge makes sense to almost any human, unlike concept of paperclip-maximization.
It’s nice ideal to strive for, but sometimes you need to make judgement call based on things you can’t explain.
Okay, but yumminess is not values. If we pick ML analogy, yumminess is reward signal or some other training hyperparameter.
My personal operationalization of values is “the thing that helps you to navigate trade-offs”. You can have yummi feelings about saving life of your son or about saving life of ten strangers, but we can’t say what you value until you consider situation where you need to choose between two. And, conversely, if you have good feelings about parties and reading books, your values direct what you choose.
Choice in case of real, value-laden trade-offs is usually defined by significant amount of reflection about values and memetic ambience supplies known summaries of such reflection in the past.
This reason only makes sense if you expect first person to develop AGI to create singleton which takes over the world and locks in pre-installed values, which, again, I find not very compatible with low p(doom). What prevents scenario “AGI developers look around for a year after creation of AGI and decide that they can do better” if not misaligned takeover and not suboptimal value lock-in?
The reason to work on preventing AI takeover now, as opposed to working on already invented AGI in the future, is the first try problem: if you have unaligned takeover-capable AGI, takover just happens and you don’t get to iterate. The same happens with problem of extremely good future only if you believe that the main surviving scenario is “aligned-with-developer-intention singleton takes over the world very quickly, locking in pre-installed values”. People who believe in such scenario usually have very high p(doom), so I assume you are not one of them.
What exactly prevents your strategy here from being “wait for aligned AGI, ask it how to make future extremely good and save some opportunity cost”?
Sure, set of available options is defined in problem setup. It’s “one-box” and “two-box”.
I feel like it’s confusion about type signature of decision theory? Decision theory talks about mappings from observations and probabilistic models to actions. In case of humans, actions are motor outputs. Decision theory asks “what sort of motor output is the best?” and answers “which leads you to leave with one box”. You are allowed to be really indecisive in process and cry “it feels wrong to leave the second box!”, Omega in this scenario doesn’t care.
I think the difference between reality and fiction is that fiction contains heroes—high-agency people in very personal relationships with villains.
Unrealistic part of being hero to designated villain is that in real life enemies are screened off from each other by many layers of social reality. Roosevelt was kinda hero to Hitler’s villainy, but Roosevelt mostly didn’t spend his nights thinking about Hitler could think, because 1) FDR was busy managing his country, 2) anything that Hitler could do was overconstrained by abilities of Germany, so it was more productive to think about what Germany as a whole could do. As a result, there is no interpersonal drama in which supervillain monologue fits.
Stanislaw Lem once wrote (I translated it from Russian, because I can’t find another source in the open):
...sometimes it seems incredible to me that there are so many billions of people, many thousands of whom are gifted with imagination and expertise in the field of hypothesis generation, but no one takes on what I do. How is it that in some remote corner of Slavic land, between the North Pole and the Balkans, one madman suffers in solitude? Even from my favored statistical point of view, it follows that there must be at least a few such Quasi-Lems, Anti-Lems, Para-Lems, or Proto-Lems in the world, but here there are none.
Alternative hypothesis: when you recontextualize, you train model “you will get away with X, given this signal (encouragement to do X in system prompt)”. When model sees no such signal, it decides that it won’t get away with bad behavior and acts good.
I wonder what would happen if you recontextualize with pair of prompts “your actions and thoughts will be recorded and evaluated” vs “you are acting absolutely autonomously, no oversight”.
Okay, but it’s one more reason not to read it: it doesn’t contain new things.