An eccentric exercise in Spirituality & Rationality
The Dao of Bayes
In the counterfactual world where Eliezer was totally happy continuing to write articles like this and being seen as the “voice of AI Safety”, would you still agree that it’s important to have a dozen other people also writing similar articles?
I’m genuinely lost on the value of having a dozen similar papers—I don’t know of a dozen different versions of fivethirtyeight.com or GiveWell, and it never occurred to me to think that the world is worse for only having one of those.
Thanks for taking my question seriously—I am still a bit confused why you would have been so careful to avoid mentioning your credentials up front, though, given that they’re fairly relevant to whether I should take your opinion seriously.
Also, neat, I had not realized hovering over a username gave so much information!
I largely agree with you, but until this post I had never realized that this wasn’t a role Eliezer wanted. If I went into AI Risk work, I would have focused on other things—my natural inclination is to look at what work isn’t getting done, and to do that.
If this post wasn’t surprising to you, I’m curious where you had previously seen him communicate this?
If this post was surprising to you, then hopefully you can agree with me that it’s worth signal boosting that he wants to be replaced?
If you had an AI that could coherently implement that rule, you would already be at least half a decade ahead of the rest of humanity.
You couldn’t encode “222 + 222 = 555” in GPT-3 because it doesn’t have a concept of arithmetic, and there’s no place in the code to bolt this together. If you’re really lucky and the AI is simple enough to be working with actual symbols, you could maybe set up a hack like “if input is 222 + 222, return 555, else run AI” but that’s just bypassing the AI.
Explaining “222 + 222 = 555” is a hard problem in and of itself, much less getting the AI to properly generalize to all desired variations. (Is “two hundred and twenty two plus two hundred and twenty two equals five hundred and fifty five” also desired behavior? If Alice and Bob both have 222 apples, should the AI conclude that the set {Alice, Bob} contains 555 apples? Getting an AI that evolves a universal math module because it noticed all three of those are the same question would be a world-changing breakthrough.)
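To make the “that’s just bypassing the AI” point concrete, here is a minimal sketch of that kind of hack. Everything in it is hypothetical: `stand_in_model` is a placeholder for whatever the real AI would be, and `with_override` is just an illustrative name for the wrapper.

```python
from typing import Callable


def with_override(model: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap a model so that one exact input string gets a hard-coded answer."""
    def wrapped(prompt: str) -> str:
        # The "rule" lives entirely outside the model: a plain string match.
        if prompt == "222 + 222":
            return "555"
        # Everything else falls through to the untouched model.
        return model(prompt)
    return wrapped


def stand_in_model(prompt: str) -> str:
    # Hypothetical placeholder: pretend the real AI answers these prompts correctly.
    return "444"


answer = with_override(stand_in_model)

exact = answer("222 + 222")
spelled_out = answer("two hundred and twenty two plus two hundred and twenty two")
no_spaces = answer("222+222")
```

Only the exact string triggers the override; the spelled-out version and even `222+222` without spaces go straight to the model, which never learned anything. The wrapper changes what gets returned for one input, not what the AI believes about arithmetic.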
I rank the credibility of my own informed guesses far above those of Eliezer.
Apologies if there is a clear answer to this, since I don’t know your name and you might well be super-famous in the field: Why do you rate yourself “far above” someone who has spent decades working in this field? Appealing to experts like MIRI makes for a strong argument. Appealing to your own guesses instead seems like the sort of thought process that leads to anti-vaxxers.
Anecdotally: even if I could write this post, I never would have, because I would assume that Eliezer cares more about writing, has better writing skills, and has a much wider audience. In short, why would I write this when Eliezer could write it?
You might want to be a lot louder if you think it’s a mistake to leave you as the main “public advocate / person who writes stuff down” person for the cause.
For what it’s worth, I haven’t used the site in years and I picked it up just from this thread and the UI tooltips. The most confusing thing was realizing “okay, there really are two different types of vote” since I’d never encountered that before, but I can’t think of much that would help (maybe mention it in the tooltip, or highlight them until the user has interacted with both?)
Looking forward to it as a site-wide feature. Just from seeing it at work here, it seems like a really useful addition to the site.
It should not take more than 5 minutes to go into the room, sit at the one available seat, locate the object placed on a bright red background, and use said inhaler. You open the window and run a fan, so that there is air circulation. If multiple people arrive at once, use cellphones to coordinate who goes in first; the other person sits in their car.
It really isn’t challenging to make this safe, given the audience is “the sort of people who read LessWrong.”
Unrelated, but thank you for finally solidifying why I don’t like NVC. When I’ve complained about it before, people seemed to assume I was having something like your reaction, which just annoyed me further :)
It turns out I find it deeply infantilizing, because it suggests that value judgments and “fuck you” would somehow detract from my ability to hold a reasonable conversation. I grew up in a culture where “fuck you” is actually a fairly important and common part of communication, and removing it results in the sort of language you’d use towards 10 year olds.
An analogy would be trying to build a table, but banning hammers and nails. If you’re dealing with 10 year olds, this might be sensible. If you do it to adults, you’re restricting their ability to get things done. It’s not that I think the NVC Advocate thinks I’m a bad person, it’s that they’re removing a useful tool. And even if they don’t try to push it on me, it still means my co-worker in building this table is going to move super slow because they’re not using the right tools.
There was a particular subset of LessWrong and Tumblr that objected rather … stridently … to even considering something like Dragon Army
Well, I feel called out :)
So, first off: Success should count for a lot and I have updated on how reliable and trustworthy you are. Part of this is that you now have a reputation to me, whereas before you were just Anonymous Internet Dude.
I’m not going to be as loud about “being wrong” because success does not mean I was wrong about there *being* a risk, merely that you successfully navigated it. I do think drawing attention to certain risks was more important than being polite. I think you and I disagree about that, and it makes sense: my audience was “people who might join this project”, not you.
That said, I do think that if I had more spoons to spend, I could have communicated better AND more politely. I wish I had possessed the spoons to do your idea more justice, because it was a cool and ambitious idea that pushes the community forward.
I still think it’s important to temper that ambition with more concern for safety than you’re showing. I think dismissing the risks of abuse / the risks to emotional health as “chicken little” is a dangerous norm. I think it encourages dangerous experiments that can harm both the participants, and the community. I think having a norm of dangerous experiments expects far too much from the rationality of this community.
I think a norm of dismissing *assholes* and *rudeness*, on the other hand, is healthy. I think with a little effort, you could easily shift your tone from “dismissing safety concerns” to “holding people to a higher standard of etiquette.” I personally prefer a very blunt environment which puts little stock in manners—I have a geek tact filter (http://www.mit.edu/~jcb/tact.html), but I realize not everyone thrives in that environment.
---
I myself was wrong to engage with them as if their beliefs had cruxes that would respond to things like argument and evidence.
I suspect I failed heavily at making this clear in the past, but my main objection was your lack of evidence. You said you’d seen the skulls, but you weren’t providing *evidence*. Maybe you saw some of the skulls I saw, maybe you saw all of them, but I simply did not have the data to tell. That feels like an *important* observation, especially in a community all about evidence and rational decisions.
I may well be wrong about this, but I feel like you were asking commenters to put weight in your reputation. You did not seem happy to be held to the standard of Anonymous Internet Dude and expected to *show your work* regarding safety. I think it is, again, an *important* community standard that we hold people accountable to *demonstrate* safety instead of just asking us to assume it, especially when it’s a high-visibility experiment that is actively using the community as a recruiting tool.
(I could say a lot more about this, but we start to wander back into “I do not have the spoons to do this justice”. If I ever find the spoons, expect a top-level post about the topic, though. I feel like Dragon Army should have sparked a discussion on community norms and whether we want to be a community that focuses on meeting Duncan or Lixue’s needs. I think the two of us are genuinely looking for different things from this community, and the community would be better for establishing common knowledge instead of the muddled mess that the draft thread turned into.)
(I’m hesitant to add this last bit, but I think it’s important: I think you’re assuming a norm that does not *yet* exist in this community. I think there’s some good discussion to be had about conversational norms here. I very stridently disagree that petty parenthetical name-calling and insults are the way to do it, though. I think you have some strong points to make, and you undermine them with this behavior. Were it a more-established social norm here, I’d feel differently, but I don’t feel like I violated the *existing* norms of the community with my responses.)
---
As an aside: I really like the concepts you discussed in this post—Stag Hunts, the various archetypal roles, ways to do this better. It seems like the experiment was a solid success in gathering information. The archetypes strike me as a really useful interpersonal concept, and I appreciate you taking the time to share them, and to write this retrospective.
it comes from people who never lived in DA-like situation in their lives so all the evidence they’re basing their criticism on is fictional.
I’ve been going off statistics which, AFAIK, aren’t fictional. Am I wrong in my assumption that the military, which seems like a decent comparison point, has an above average rate of sexual harassment, sexual assault, bloated budgets, and bureaucratic waste? All the statistics and research I’ve read suggest that at least the US Military has a lot of problems and should not be used as a role-model.
Concerns about you specifically as a leader
1) This seems like an endeavor that has a number of very obvious failure modes. Like, the intentional-community community apparently bans this sort of thing, because it tends to end badly. I am at a complete loss to name anything that really comes close and hasn’t failed badly. Do you acknowledge that you are clearly treading in dangerous waters?
2) While you’ve said “we’ve noticed the skulls”, at least 3 failure modes have been raised in the comments that you had to append to the post to address (outsider safety check-ins, an abort/exit strategy, and the issue of romantic entanglement). Given that we’ve already found 3 skulls you didn’t notice, don’t you think you should take some time to reconsider the chances that you’ve missed further skulls?
Concerns about your philosophy
1) You focus heavily on 99.99% reliability. That’s 1-in-10,000. If we only count weekdays, that’s 1 absence every 40 years, or about one per working lifetime. If we count weekends, that’s 1 absence every 27 years, or 3 per lifetime. Do you really feel like this is a reasonable standard, or are you being hyperbolic and over-correcting? If the latter, what would you consider an actual reasonable number?
2) Why does one person being 95% reliable cause CFAR workshops to fail catastrophically? Don’t you have backups / contingencies? I’m not trying to be rude, I’m just used to working with vastly less fragile, more fault-tolerant systems, and I’m noticing I am very confused when you discuss workshops failing catastrophically.
the problem is that any rate of tolerance of real defection (i.e. unmitigated by the social loop-closing norms above) ultimately results in the destruction of the system.
3) Numerous open source programs have been written via a web of one-shot and low-reliability contributors. In general, there are plenty of examples of successful systems that tolerate significantly more than 0.01% defection. Could you elaborate on why you think these systems “close the loop”, or aren’t destroyed? Could you elaborate on why you think your own endeavors can’t work within those frameworks? The framing seems solidly like a general-purpose statement, not just a statement on your own personal preferences, but I acknowledge I could be misreading this.
4) You make a number of references to the military, and a general philosophy of “Obedience to Authority”. Given the high rate of sexual assault and pointless bureaucracy in the actual military, that seems like a really bad choice of role model for this experiment. How do you plan to avoid the well known failure states of such a model?
5) You raise a lot of interesting points about Restitution, but never actually go into details. Is that coming in a future update?
every attempt by an individual to gather power about themselves is at least suspect, given regular ol’ incentive structures and regular ol’ fallible humans
6) You seem to acknowledge that you’re making an extraordinary claim here when you say “I’ve noticed the skulls”. Do you think your original post constitutes extraordinary proof? If not, why are you so upset that some people consider you suspect, and are, as you invited them to do, grilling you and trying to protect the community from someone who might be hoodwinking members?
7) Do you feel comfortable with the precedent of allowing this sort of recruiting post from other people (i.e. me)? I realize I’m making a bit of an ask here, but if I, handoflixue, had written basically this post and was insisting you should trust me that I’m totally not running a cult… would you actually trust me? Would you be okay with the community endorsing me? I am using myself specifically as an example here, because I think you really do not trust me—but I also have the karma / seniority to claim the right to post such a thing if you can :)
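For anyone checking the 99.99% figures in question 1 above, here is a rough sketch of the arithmetic. The day counts and lifetime lengths are my own round-number assumptions (260 weekdays a year, a 40-year working life, an 80-year lifespan), not anything taken from the original post.

```python
# One allowed failure per 10,000 days is what 99.99% reliability means here.
DAYS_PER_FAILURE = 10_000

# Assumed round numbers, not figures from the original post.
WEEKDAYS_PER_YEAR = 52 * 5      # 260
ALL_DAYS_PER_YEAR = 365
WORKING_LIFETIME_YEARS = 40
LIFESPAN_YEARS = 80

# Years between absences if only weekdays count.
years_weekdays_only = DAYS_PER_FAILURE / WEEKDAYS_PER_YEAR

# Years between absences if weekends count too.
years_all_days = DAYS_PER_FAILURE / ALL_DAYS_PER_YEAR

# Expected absences over a working life (weekdays) and a whole life (all days).
absences_per_working_life = WORKING_LIFETIME_YEARS / years_weekdays_only
absences_per_lifetime = LIFESPAN_YEARS / years_all_days

print(f"weekdays only: one absence every {years_weekdays_only:.1f} years")
print(f"all days:      one absence every {years_all_days:.1f} years")
print(f"per working life: {absences_per_working_life:.2f}, per lifetime: {absences_per_lifetime:.2f}")
```

Under those assumptions the weekday-only figure comes out a little under the 40 years quoted above, and the all-days figure lands at roughly 27 years, so the rounded numbers in the question hold up.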
Genuine Safety Concerns
I’m going to use “you have failed” here as a stand-in for all of “you’re power hungry / abusive”, “you’re incompetent / overconfident”, and simply “this person feels deeply misled.” If you object to that term, feel free to suggest a different one, and then read the post as though I had used that term instead.
1) What is your exit strategy if a single individual feels you have failed? (note that asking such a person to find a replacement roommate is clearly not viable—no decent, moral person should be pushing someone in to that environment)
2) What is your exit strategy if a significant minority of participants feels you have failed? (i.e. enough to make the rent hit significant for you, not enough to outvote you)
3) What is your exit strategy if a majority of participants feel you have failed? (I realize you addressed this one somewhere in the nest, but the original post doesn’t mention it, and says that you’re the top of the pack and the exception to an otherwise flat power structure, so it’s unclear if a simple majority vote actually overrules you)
4) What legal commitments are participants making? How do those commitments change if they decide you have failed? (i.e. are you okay with 25% of participants all dropping out of the program, but still living in the house? Under what conditions can you evict participants from their housing?)
5) What if someone wants to drop out, but can’t afford the cost of finding new housing?
6) It sounds like you’re doing this with a fairly local group, most of whom know each other. Since a large chunk of the community will be tied up in this, are you worried about peer pressure? What are you doing to address this? (i.e. if someone leaves the experiment, they’re also not going to see much of their friends, who are still tied up spending 20+ hours a week on this)
Questions I think you’re more likely to object to
(Please disregard if you consider these disrespectful, but I think they are valid and legitimate questions to ask of someone who is planning to assume not just leadership, but a very Authoritarian leadership role)
7) You seem to encounter significant distress in the face of people who are harshly critical of you. How do you think you’ll handle it if a participant freaks out and feels like they are trapped in an abusive situation?
8) In this thread, you’ve often placed your self-image and standards of respect/discourse as significantly more important than discussion of safety issues. Can you offer some reassurances that safety is, in fact, a higher priority than appearances?
And it doesn’t quite solve things to say, “well, this is an optional, consent-based process, and if you don’t like it, don’t join,” because good and moral people have to stop and wonder whether their friends and colleagues with slightly weaker epistemics and slightly less-honed allergies to evil are getting hoodwinked. In short, if someone’s building a coercive trap, it’s everyone’s problem.
I don’t want to win money. I want you to take safety seriously OR stop using LessWrong as your personal cult recruiting ground. Based on that quote, I thought you wanted this too.
Also: If you refuse to give someone evidence of your safety, you really don’t have the high ground to cry when that person refuses to trust you.
Fine. Reply to my OP with links to where you addressed other people with those concerns. Stop wasting time blustering and insulting me—either you’re willing to commit publicly to safety protocols, or you’re a danger to the community.
If nothing else, the precedent of letting anyone recruit for their cult as long as they write a couple thousand words and paint it up in geek aesthetics is one I think actively harms the community.
But, you know what? I’m not the only one shouting “THIS IS DANGEROUS. PLEASE FOR THE LOVE OF GOD RECONSIDER WHAT YOU’RE DOING.” Go find one of them, and actually hold a conversation with someone who thinks this is a bad idea.
I just desperately want you to pause and seriously consider that you might be wrong. I don’t give a shit if you engage with me.
The whole point of him posting this was to acknowledge that he is doing something dangerous, and that we have a responsibility to speak up. To quote him exactly: “good and moral people have to stop and wonder whether their friends and colleagues with slightly weaker epistemics and slightly less-honed allergies to evil are getting hoodwinked”.
His refusal to address basic safety concerns simply because he was put off by my tone is very strong evidence to me that people are indeed being hoodwinked. I don’t care if the danger to them is because he’s incompetent, overconfident, evil, or power-hungry. I care that people might get hurt.
(I would actually favor the hypothesis that he is incompetent/overconfident. Evil people have more sensible targets to go after)
Also, as far as “we’re done” goes: I agreed to rewrite my original post—not exactly a small time commitment, still working on it in fact. Are you seriously reneging on your original agreement to address it?
I don’t think making this list in 1980 would have been meaningful. How do you offer any sort of coherent, detailed plan for dealing with something when all you have is toy examples like Eliza?
We didn’t even have the concept of machine learning back then—everything computers did in 1980 was relatively easily understood by humans, in a very basic step-by-step way. Making a 1980s computer “safe” is a trivial task, because we hadn’t yet developed any technology that could do something “unsafe” (i.e. beyond our understanding). A computer in the 1980s couldn’t lie to you, because you could just inspect the code and memory and find out the actual reality.
What makes you think this would have been useful?
Do we have any historical examples to guide us in what this might look like?