plex
All AGI safety questions welcome (especially basic ones) [July 2022]
Hell yeah!
This matches my internal experience that caused me to bring a ton of resources into existence in the alignment ecosystem (with various collaborators):
aisafety.info—Man, there really should be a single point of access that lets people self-onboard into the effort. (Helped massively by Rob Miles’s volunteer community, soon to launch a paid distillation fellowship)
aisafety.training—Maybe we should have a unified place with all the training programs and conferences so people can find what to apply to? (AI Safety Support had a great database that just needed a frontend)
aisafety.world—Let’s make a map of everything in AI existential safety so people know what orgs, blogs, funding sources, resources, etc exist, in a nice sharable format. (Hamish did the coding, Superlinear funded it)
ea.domains—Wow, there sure are a lot of vital domains that could get grabbed by squatters. Let’s step in and save them for good orgs and projects.
aisafety.community—There’s no up-to-date list of online communities. This is an obvious missing resource.
Rob Miles videos are too rare, almost entirely bottlenecked on the research and scriptwriting process. So I built some infrastructure which allows volunteers to collaborate as teams on scripts for him; it’s being tested now.
Ryan Kidd said there should be a nice professional site which lists all the orgs in a format which helps people leaving SERI MATS decide where to apply. aisafety.careers is my answer, though it’s not quite ready yet. Volunteers are wanted to help write up descriptions for orgs in the Google Docs we have auto-syncing with the site!
Nonlinear wanted a prize platform, and that seemed like a genuinely useful way to put the firehose of money to work while FTXFF was still a thing, so I built Superlinear.
There is a lot of obvious low-hanging fruit here, and I need more hands. Let’s make a monthly call and project database so I can easily pitch these to all the people who want to help save the world and don’t know what to do. A bunch of great devs joined!
and 6+ more major projects as well as a ton of minor ones, but that’s enough to list here.
I do worry I might be neglecting my actual highest EV thing though, which is my moonshot formal alignment proposal (low chance of the research direction working out, but much more direct if it does). Fixing the alignment ecosystem is just so obviously helpful though, and has nice feedback loops.
Anti-squatted AI x-risk domains index
I have taken the survey.
Donated £100.
This is a Heuristic That Almost Always Works, and it’s the one most likely to cut off our chances of solving alignment. Almost all clever schemes are doomed, but if we as a community let that meme stop us from assessing the object level question of how (and whether!) each clever scheme is doomed then we are guaranteed not to find one.
Security mindset means looking for flaws, not assuming all plans are so doomed that you don’t need to look.
If this is, in fact, a utility function which if followed would lead to a good future, that is concrete progress and lays out a new set of true names as a win condition. It’s not a solution, since we can’t train AIs with arbitrary goals, but it’s progress in the same way that quantilizers were progress on mild optimization.
I can verify that the owner of the blaked[1] account is someone I have known for a significant amount of time, that he is a person with a serious, long-standing concern with AI safety (and all other details verifiable by me fit), and that based on the surrounding context I strongly expect him to have presented the story as he experienced it.
This isn’t a troll.
1. ^ (also I get to claim memetic credit for coining the term “blaked” for being affected by this class of AI persuasion)
Task: Why does Eliezer Yudkowsky think this approach is doomed?
Takes a high-level description of an approach and spits out Yudkowsky’s reply to it. Ideally uses real examples of Yudkowsky telling people why their ideas are doomed.
Please pick up the task of finding these examples and collect the bounty.
ea.domains—Domains Free to a Good Home
I believe the effect you describe exists, but I think there are two effects which make it unclear that implementing your suggestions is an overall benefit to the average reader. Firstly, to summarize your position:
Each extra weird belief you have detracts from your ability to spread other, perhaps more important, weird memes. Therefore normal beliefs should be preferred to some extent, even when you expect them to be less correct or less locally useful on an issue, in order to improve your overall effectiveness at spreading your most highly valued memes.
If you have a cluster of beliefs which seem odd in general, then you are more likely to share a “bridge” belief with someone. When you meet someone who shares at least one strange belief with you, you are much more likely to seriously consider their other beliefs, because you share some common ground and are aware of their ability to find truth against social pressure. For example, an EA vegan may be vastly more able to introduce the other EA memes to a non-EA vegan than an EA non-vegan would be. Since almost all people have at least some weird beliefs, and those whose weird beliefs have literally no overlap with yours are unlikely to be good targets for you to spread positive memes to, increasing your collection of useful and justifiable weird memes may well give you more opportunities to usefully spread the memes you consider most important.
Losing the absolute focus on forming an accurate map by making concessions to popularity, or to not standing out in too many ways, seems epistemologically risky and borderline dark arts. I do agree that in some situations not advertising all your weirdness at once may be a useful strategic choice, but I am very wary of the effect that putting too much focus on this could have on your actual beliefs. You don’t want to strengthen your own absurdity heuristic by accident and miss out on more weird but correct and important things.
While I can imagine situations where the advice given is correct (especially when interacting with domain-limited policymakers, or with people whose likely reactions to extra weirdness you have a good read on), recommending it in general seems insufficiently justified and I believe it would have significant drawbacks.
Crony Beliefs
All AGI safety questions welcome (especially basic ones) [Sept 2022]
A reasonable process seems to be: determine whether the neighbor would do a similar thing for you (or for a stranger in a similar situation) out of niceness, using your highly advanced social modeling software and past experience with them, and mimic their expected answer. Cooperate by default if you don’t have enough information to simulate them accurately.
The rude/not rude thing is useful as a hint about whether they would agree to do something beyond basic requirements if they were on the other side.
This incentivises people to do nice things for each other: if someone fakes a lot of special needs, they had better go out of their way to prove they’ll help other people with theirs.
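As a minimal sketch, here is the shape of that decision rule in Python. The NeighborModel and should_help names are purely illustrative; the real “simulation” is social intuition, not code:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NeighborModel:
    """Hypothetical stand-in for your past-experience-based model of the neighbor."""
    would_do_this_for_you: bool  # your best guess at their answer with roles reversed


def should_help(model: Optional[NeighborModel]) -> bool:
    """Decide whether to grant the request, mirroring the process described above."""
    if model is None:
        # Not enough information to simulate them accurately: cooperate by default.
        return True
    # Otherwise, mimic their expected answer in the symmetric situation.
    return model.would_do_this_for_you


print(should_help(NeighborModel(would_do_this_for_you=True)))   # True
print(should_help(None))                                         # True (cooperate by default)
print(should_help(NeighborModel(would_do_this_for_you=False)))   # False
```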
I was asked to comment by Ben earlier, but have been juggling more directly impactful projects and retreats. I have been somewhat close to parts of the unfolding situation, including spending some time in person with Alice, with Chloe, and (separately) with the Nonlinear team, and communicating online on-and-off with most parties.
I can confirm some of the patterns Alice complained about, specifically not reliably remembering or following through on financial and roles agreements, and Emerson being difficult to talk to about some things. I do not feel notably harmed by these, and was able to work them out with Drew and Kat without much difficulty, but it does back up my perception that there were real grievances which would have been harmful to someone in a less stable position. I also think they’ve done some excellent work, and would like to see that continue, ideally with clear and well-known steps to mitigate the kinds of harms which set this in motion.
I have consistently attempted to shift Nonlinear away from what appears to me a wholly counterproductive adversarial emotional stance, with limited results. I understand that they feel defected against, especially Emerson, but they were in the position of power and failed to make sure those they were working with did not come out harmed, and their responses to the initial implosion continued to generate harm and distraction for the community. I am unsettled by the threat of legal action towards Lightcone and by the focus on controlling the narrative rather than repairing damage.
Emerson: You once said one of the main large failure modes you were concerned about becoming was Stalin’s mistake: breaking the networks of information around you so you were unaware things were going so badly wrong. My read is you’ve been doing this in a way which is a bit more subtle than the gulags, by the intensity of your personality shaping the fragments of mind around you to not give you evidence that in fact you made some large mistakes here. I felt the effects of this indirectly, as well as directly. I hope you can halt, melt, and catch fire, and return to the effort as someone who does not make this magnitude of unforced error.
In a movement with the kind of co-protective nature ours has, you can’t just push out someone who is deeply good, in the way you merely shouldn’t in some parts of the world. If there’s intense conflict, call in a mediator and try to heal the damage.
Edit: To clarify, this is not intended as a blanket endorsement of mediation, or of avoiding other forms of handling conflict. I do think that, in this case, going into a process where the parties genuinely try to understand each other’s worlds much earlier would have been much less costly for everyone involved, as well as for the wider community, but I can imagine mediation is often mishandled or forced in ways which are also counterproductive.
What was that supplement? Seems like a useful thing to have known if reproducible.
We’ve duped EVERYONE.
I am told that at age 5, every single morning as we drove to school, I said to my mother that it was a waste of time. Shockingly, she listened, and after a year of this she had found out about home education and made arrangements for me to be released.
I am beyond glad to have avoided most of formal education, despite having been put back into it twice during my teenage years for several years each time. The difference in my motivation to learn, social fulfilment, and general wellbeing was dramatic. I am curious about what alternatives could be built with modern technology, and whether a message like this could spread enough to shift a notable fraction of children to freedom.
[Question] What risks concern you which don’t seem to have been seriously considered by the community?
Congratulations on launching!
Added you to the map, and your Discord to the list of communities, which is now a sub-page of aisafety.com.
One question: Given that interpretability might well lead to systems powerful enough to be an x-risk long before we have a strong enough understanding to direct a superintelligence, publish-by-default seems risky. Are you considering adopting a non-publish-by-default policy? I know you talk about capabilities risks in general terms, but is this specific policy on the table?
For convenience: Nate-culture communication handbook
Filled in, but did not do digit lengths because I won’t have access to a printer or scanner in the near future.