There is also the universal Girardian mimetic failure mode. I won’t summarize the whole theory here, but it posits that humans want what other humans want, producing a spiral of ever-increasing desire for things and status. I once wrote an essay on this in the context of internet discussions: https://blog.p2pfoundation.net/online-conflict-in-the-light-of-mimetic-theory/2009/11/25
Another failure mode: the replication crisis in science, where only new and surprising results get published and there is no mechanism for reinforcing existing theories. This also happens on social media: people always want to learn new things. And probably, more generally, all the other things from https://www.gwern.net/Littlewood
We are two AI safety researchers (combined AF karma: around 200) who have been happily living as roommates for the past ~year in Reno, Nevada. We have an empty room in our house. The rent for the room is $400/mo. Some factors to consider re: whether you want to come live with us in Reno (either as a roommate, or starting your own house):
Reno is about as close to the Bay Area as you can get without paying state income tax. It costs about $20 to get a bus ticket down to the Bay. The bus has power and wifi, so you can work during the ~8-hour ride. (I’m not sure how good the wifi is; I try to use the trip as offline work time, but I tend to get distracted by the beautiful scenery.)
If you have a profoundly gifted child you want to send to school, or want to work with profoundly gifted kids, the nearby Davidson Academy is supposed to be the best place in the US for this.
Pre-pandemic, we had a nice SSC meetup going with a few people attending.
Short commute to Burning Man.
Police/community relations seem OK in Reno (maybe not in the adjacent city of Sparks though), with cops marching as part of anti-police-brutality protests. There was some rioting downtown the other month but not bad enough to make the national news, and the community came together to clean up afterwards. Our house is far from downtown anyways.
There are more gun people here than in California; almost 40% of households in Washoe County keep firearms in their homes: http://www.city-data.com/top2/co8.html Unclear if this is good or bad. Certainly should be easier to get a gun here than in California if you want one. There’s a mix of liberals and conservatives in the area; libertarians seem overrepresented. There’s a confederate flag bumper sticker or two in our rural neighborhood :/
There is snow and skiing in the winter. Latitude fairly similar to the Bay Area, so theoretically sunlight fairly similar as well?
Reno’s tagline is “The Biggest Little City in the World”, and it lives up to it in my opinion. At least pre-pandemic, passersby in Reno seemed way nicer than passersby in the Bay Area.
There have been some wildfires this year and we got advisory evacuations twice. (We’re out in the boondocks; I don’t believe anyone in the city got an advisory evacuation.) Air quality has been better than the Bay Area though.
A little about our house:
We’re hoping to add AI safety researchers to the house, potentially squeezing more than 3 people into 3 bedrooms, eventually splitting off into a second house (based on who is forming AI safety collaborations) and growing the Reno community that way.
We’re both fairly driven and independently motivated, “no excuses” types, as opposed to the touchy-feely Circling types you meet in the Bay Area.
Cleaning up messes is not currently a priority, but that could change.
We’re not super social. We communicate using Discord and occasionally go for hikes, do math/machine learning presentations on a whiteboard, play board games, drink and watch a movie, cook food and eat it together.
We have a dishwasher, washing machine, and dryer.
Please send a private message for more info.
The skin color study confirms that vitamin D doesn’t protect against infection. But I don’t see it saying anything clear about how much harm a person suffers if they’re infected.
Someone wanted to know about the outcome of my hair loss research so I thought I would quickly write up what I’m planning to try for the next year or so. No word on how well it works yet.
Most of the ideas are from this review: https://www.karger.com/Article/FullText/492035
I’m planning to replace oral caffeine consumption with caffeinated shampoo/conditioner: https://www.amazon.com/gp/product/B076FHJ3K5/ (In my experience, caffeine is absorbed through the skin just fine.)
I’m going to start taking this as a source of procyanidin and zinc: https://www.amazon.com/gp/product/B00K7GIMCU/
I understand that the largest single ingredient in marine proteins is glycine, which also seems to be helpful for improving sleep and preventing diabetes per examine.com, so I’m going to start taking 3-4g of glycine before bed every night: https://www.amazon.com/gp/product/B01LBILPCG/ (Romeo pointed out that taking marine proteins harvested from marine creatures might involve consuming excess mercury etc. which is another reason to take synthetic glycine.)
Everyone on hair loss forums seems to love microneedling: https://www.amazon.com/gp/product/B07JMZXBGR/ Here is one guide: https://www.reddit.com/r/tressless/comments/a660h7/unofficial_dermapenroller_guide/
I thought I might as well try emu oil but I’ve been having trouble sourcing stuff that I’m relatively confident is low in trans fats.
2% ketoconazole shampoo at least until my supply runs out.
Once I’ve figured out how effective all this stuff is, maybe I’ll buy this and see if it seems to make a difference when added to the above after 6 months; otherwise I’ll return it. This review was optimistic about lasers, although this webpage points out that the parameters may matter quite a bit; I chose the iRestore based on its low cost rather than optimism about its parameters.
I think this should be lower-cost, safer, and less sketchy than the big 3, but plausibly less effective in expectation; let me know if you disagree.
Logic and reason indicate the robustness of a claim, but you can have lots of robust, mutually-contradictory claims. A robust claim is one that contradicts neither itself nor other claims it associates with. The other half is how well it resonates with people. Resonance indicates how attractive a claim is through authority, consensus, scarcity, poetry, or whatever else.
Survive and spread through robustness and resonance. That’s what a strong claim does. You can state that you’ll only let a claim spread into your mind if it’s true, but the fact that it’s so common for two such people to hold contradictory claims indicates that their real metric is much weaker than truth. I’ll posit that the real metric in such scenarios is robustness.
Not all disagreements will separate cleanly into true/false categorizations. Gödel proved that one.
There could be ways of making it legal given that we’re a non-profit with somewhat academic interests. (By “making” I mean actually changing the law or getting a No-Action Letter.) Most people who do gambling online do it for profit, which is where things get tricky.
I’ve been kicking around the idea of a ‘rationalist cult’, and am interested in the monastery idea.
Here’s my attempt to summarize the alignment concern. Does this seem like a reasonable gloss?
It seems plausible that competitive models will not be transparent or introspectable. If you can’t see how the model is making decisions, you can’t tell how it will generalize, and so you don’t get very good safety guarantees. To put it another way: if you can’t interact with the way the model is thinking, then you can’t give a rich enough reward signal to guide it to the region of model space that you want.
I have already replied once but my post seems to have been deleted so I will try again.
I am unable to detect any nuance in your post. Perhaps it would be simpler if you were to explicitly say exactly what you mean.
Most importantly, the success of the scheme relies on the correctness of the prior over helper models (or else the helper could just be another copy of GPT-Klingon).
I’m not sure I understand this. My understanding of the worry: what if there’s some equilibrium where the model gives wrong explanations of meanings, but I can’t tell using just the model to give me meanings.
But it seems to me that having the human in the loop doing prediction helps a lot, even with the same prior. Like, if the meanings are wrong, then the user will just not predict the correct word. But maybe this is not enough corrective data?
I like to imagine a future LessWrong with 1 million users, and not one pressing the button. That would be very inspiring and a strong signal of being a high trust culture.
Thanks for your explanation
This works great as an answer!
According to Fedex tracking, on Thursday, I will have a Biovyzr. I plan to immediately start testing it, and write a review.
What tests would people like me to perform?
Tests that I’m already planning to perform:
To test its protectiveness, the main test I plan to perform is a modified Bitrex fit test. This is where you create a bitter-tasting aerosol and confirm that you can’t taste it. The normal test procedure won’t work as-is because the Biovyzr is too large to fit under a plastic hood, so I plan to go into a small room and have someone (wearing a respirator themselves) spray copious amounts of Bitrex at the input fan and at any spots that seem high-risk for leaks.
To test that air exiting the Biovyzr is being filtered, I plan to put on a regular N95, use the inside-out glove to create a Bitrex aerosol inside the Biovyzr, and see whether someone in the room without a mask is able to smell it.
I will verify that the Biovyzr is positive-pressure by running a straw through an edge, creating an artificial leak, and seeing which way the air flows through the leak.
I will have everyone in my house try wearing it (5 adults of varied sizes), have them all rate its fit and comfort, and get as many of them as I can to do Bitrex fit tests.
Ah, yes, I definitely agree; thank you for pointing that out. It was a quote from old notes, from back when I had a browser extension that automatically changed all pronouns to gender-neutral pronouns, and I didn’t know at the time that I would share the quote publicly; that’s the source of the error. I’ve fixed it.
Great comment, naturally. I appreciate your epistemic status quite a bit.
I think I want to respond to the idea that it’s contradictory and bad to signal that the Petrov Day button is serious and signal that it’s fun. A few examples:
HPMOR is a book about growing up, failure, and death. It’s also hilarious and riveting.
Unsong is 50% puns. It also contains a chapter describing hell and torture in some detail.
Embedded Agency is research done downstream of the potential for advanced optimizers to lead to an existential catastrophe for humans. It’s also a cute and colorful cartoon.
Previously warring countries often come together and have their sports teams play. It really matters that they don’t cheat and play honorably, even if it’s “fun” and “play”. It’s a game, but it’s not “just a game”.
Some animals sheath their claws for dominance fights, where the losing player loses real status but isn’t physically harmed. Again, it’s a game, and it’s in some ways play. And it’s also serious.
I’m here trying to build a community around the art of rationality. We also do an April Fools’ joke every year, like that time we made everyone’s font size proportional to their total karma for a day.
People have birthday parties, and their friends show up. And it means something for your friends to show up. It’s a party, and it’s also a true signal of friendship and being there for your friend, and you can be disappointed in them not showing up, even if it’s ‘fun’ and ‘just a party’.
I don’t think it’s contradictory to care about something deeply and to be playful with it. As with Feynman and physics, Eliezer and HPMOR, Scott and Unsong, Abram/Scott and Embedded Agency; also, LessWrong and Petrov Day.
Relatedly, (part of) you said that sometimes I signaled that this was serious, and sometimes I signaled that it was ‘just a game’. I think this is incorrect. I signaled it was playful, but never unserious, never ‘just’ or ‘merely’ a game. Everything I wrote was about trust, honor, extinction, gratitude, mourning, and being Beyond the Reach of God. I didn’t write anything that suggested you should consider pressing the button, or that secretly winked at the audience. Taking down the site is symbolic, but that doesn’t mean it’s ironic or a joke; those are totally different things. We’re symbolizing world-ending destructive technology with a much lesser, but still destructive, technology. I care about this tradition a lot, and that care went into everything I wrote.
I feel like it takes a very cynical prior to read everything I wrote, consider that I actually cared, and then go “Yeah, he probably doesn’t mean this; why would someone actually care about this? He’s probably joking.” I don’t think anyone has that prior… I think they more have a prior that people rarely actually care about things, and so when they look at something that was meant straightforwardly and un-ironically, the hypothesis isn’t even brought to attention.
But a lot of people got it. Most, I think. I got a bunch of short messages (and some long ones) in response to receiving the codes, saying things like “I’m honored to be entrusted with the launch codes.” and “Roger that, general. I won’t let you down!” and “I’m honored that you gave me the opportunity” and “Awesome, thanks! I love the warm glow of not burning the commons.” I love getting these messages. And I love that it worked out last year.
It’s a distinct argument to say that I was unsuccessful at ensuring that was communicated before someone took action, and this is some evidence of that. Like, last year someone visited the site, pressed the button, entered a string of zeros, and hit ‘submit’, before finding out what the button even did, and without reading the announcement post. (We responded to that this year by putting two sentences right underneath the button saying what was happening.)
I take seriously the charge that users like Chris would’ve gotten it if I had rewritten the email and post in some ways, and I will definitely user-test them more next year. You say it’d be good to write down the case for the tradition; I can do that too: write a post called “Why the Petrov Day Big Red Button?” and link to it from everywhere next year.
But… well, there’s more to say, but I have to go for now. I’ll add that it was my responsibility to pick the people and write the announcements. I entrusted 125 people with codes last year, and succeeded. I tried to do 270 this year, and I failed. I’m writing a postmortem, and I will work hard to ensure the site doesn’t go down next year.
This is only true for trivial values, e.g. “I terminally value having this specific world model”.
For most utility schemes (including, critically, that of humans), the vast majority of the value of models and beliefs is instrumental: making better predictions, using less computing power, and so on.
In fact, humans who do not recognize this and stick to beliefs or models because they like them are profoundly irrational. If the sky is blue, I wish to believe the sky is blue, and so on. So assuming that only prediction is valuable is not question-begging; I suspect you already agreed with this and just didn’t realize it.
In the sense that beliefs (and the models they’re part of) are instrumental goals, any specific belief is “unnecessary”. Note the quotation marks around “unnecessary” in this comment and the comment you’re replying to. By “unnecessary” I mean that the choice of which beliefs and which model to use is subject to whichever is more instrumentally valuable: in practice, a complex tradeoff between predictive accuracy and computational demands.
I have heard plenty of times of variables being infinite in physics, and I have seen people do calculus with infinitesimally small numbers.
This fits with the idea that meaning comes from pleasure, and that great pleasure can be worth a fair amount of pain to achieve. The pain drains meaning away, but the redeeming factor is that it can serve as a test of the magnitude of pleasure, and generate pleasurable stories in the future.
An important counterargument to my hypothesis is that we may find a privileged “high road” to success and pleasure less meaningful. At first, this might seem to suggest that we do inherently value pain.
In fact, though, what frustrates people about people born with a silver spoon in their mouths is that society seems set up to ensure their pleasure at another’s expense.
It’s not their success or pleasure we dislike. It’s the barriers and pain that we think the pleasure is contextualized in. If pleasure for one means pain for another, then of course we find the pleasure less meaningful.
So this isn’t about short-term pain avoidance. It’s about long-term, overall, wise and systemic pursuit of pleasure.
And that pleasure must be not only in the physical experiences we have, but in the stories we tell about it—the way we interpret life. We should look at it, and see that it is good.
If people are wireheading, and we look at that tendency and it causes us great displeasure, that is indeed an argument against wireheading.
We need to understand that there’s no single bucket where pleasure can accumulate. There is a psychological reward system where pleasure is evaluated according to the sensory input and brain state.
Utilitarian hedonism isn’t just about nerve endings. It’s about how we interpret them. If we have a major aesthetic objection to wireheading, that counts from where we’re standing, no matter how much you ratchet up the presumed pleasure of wireheading.
The same goes recursively for any “hack” that could justify wireheading. For example, say you posited that wireheading would be seen as morally good, if only we could find a catchy moral justification for it.
So we let our finest AI superintelligences get to work producing one. Indeed, it’s so catchy that the entire human population acquiesces to wireheading.
Well, if we take offense to the prospect of letting the AI superintelligence infect us with a catchy pro-wireheading meme, then that’s a major point against doing so.
In general “It pleases or displeases me to find action X moral” is a valid moral argument—indeed, the only one there is.
The way moral change happens is by making moral arguments or having moral experiences that in themselves are pleasing or displeasing.
What’s needed, then, for moral change to happen, is to find a pleasing way to spread an idea that is itself pleasing to adopt—or unpleasant to abandon. To remain, that idea needs to generate pleasure for the subscriber, or to generate displeasure at the prospect of abandoning it in favor of a competing moral scheme.
To believe in some notion of moral truth or progress requires believing that the psychological reward mechanism we have attached to morality corresponds best with moral schemes that accord with moral truth.
An argument for that is that true ideas are easiest to fashion into a coherent, simple argument. And true ideas best allow us to interface with reality to advantage. Being good tends to make you get along with others better than being bad, and that’s a more pleasant way to exist.
Hence, even though strong cases can be constructed for immoral behavior, truth and goodness will tend to win in the arms race for the most pleasing presentation. So we can enjoy the idea that there is moral progress and objective moral truth, even though we make our moral decisions merely by pursuing pleasure and avoiding pain.
See also: The Best Textbooks on Every Subject