plex
DMed a link to an interface which lets you select system prompt and model (including Claude). This is open to researchers to test, but we’re not posting it fully publicly, as it is not very resistant to people who want to burn credits right now.
Other researchers feel free to DM me if you’d like access.
We’re likely to switch to Claude 3 soon, but are currently on GPT-3.5. We are mostly expecting it to be useful as a way to interface with existing knowledge initially, but we could make an alternate prompt which is more optimized for being a research assistant brainstorming new ideas if that was wanted.
Would it be useful to be able to set your own system prompt for this? Or have a default one?
Seems like a useful tool to have available, glad someone’s working on it.
AI Safety Info’s answer to “I want to help out AI Safety without making major life changes. What should I do?” is currently:
It’s great that you want to help! Here are some ways you can learn more about AI safety and start contributing:
Learn More:
Learning more about AI alignment will provide you with good foundations for helping. You could start by absorbing content and thinking about challenges or possible solutions.
Consider these options:
Keep exploring our website.
Complete an online course. AI Safety Fundamentals is a popular option that offers courses for both alignment and governance. There is also Intro to ML Safety which follows a more empirical curriculum. Getting into these courses can be competitive, but all the material is also available online for self-study. More in the follow-up question.
Learn more by reading books (we recommend The Alignment Problem), watching videos, or listening to podcasts.
Join the Community:
Joining the community is a great way to find friends who are interested and will help you stay motivated.
Join the local group for AI Safety, Effective Altruism[1] or LessWrong. You can also organize your own!
Join online communities such as Rob Miles’s Discord or the AI Alignment Slack.
Write thoughtful comments on platforms where people discuss AI safety, such as LessWrong.
Attend an EAGx conference for networking opportunities.
Here’s a list of existing AI safety communities.
Donate, Volunteer, and Reach Out:
Donating to organizations or individuals working on AI safety can be a great way to provide support.
Donate to AI safety projects.
Help us write and edit the articles on this website so that other people can learn about AI alignment more easily. You can always ask on Discord for feedback on things you write.
Write to local politicians about policies to reduce AI existential risk.
If you don’t know where to start, consider signing up for a navigation call with AI Safety Quest to learn what resources are out there and to find social support.
If you’re overwhelmed, you could look at our other article that offers more bite-sized suggestions.
[1] Not all EA groups focus on AI safety; contact your local group to find out if it’s a good match.
Life is Nanomachines
In every leaf of every tree
If you could look, if you could see
You would observe machinery
Unparalleled intricacy
In every bird and flower and bee
Twisting, churning, biochemistry
Sustains all life, including we
Who watch this dance, and know this key
Congratulations on launching!
Added you to the map, and your Discord to the list of communities, which is now a sub-page of aisafety.com.
One question: Given that interpretability might well lead to systems which are powerful enough to be an x-risk long before we have a strong enough understanding to direct a superintelligence, so publish-by-default seems risky, are you considering adopting a non-publish-by-default policy? I know you talk about capabilities risks in general terms, but is this specific policy on the table?
Yeah, that could well be listed on https://ea.domains/, would you be up for transferring it?
Internal Double Crux, a cfar technique.
I think not super broadly known, but many cfar techniques fit into the category so it’s around to some extent.
And yeah, brains are pretty programmable.
Right, it can be way easier to learn it live. My guess is you’re doing something quite IDC flavoured, but mixed with some other models of mind which IDC does not make explicit. Specific mind algorithms are useful, but exploring based on them and finding things which fit you is often best.
Nice, glad you’re getting value out of IDC and other mind stuff :)
Do you think an annotated reading list of mind stuff would be worth putting together?
For convenience: Nate-culture communication handbook
Yup, there is a working prototype and a programmer who would like to work on it full time if there was funding, but it’s not been progressing much for the past year or so because no one has had the free bandwidth to work on it.
https://aisafety.world/tiles/ has a bunch.
That’s fair, I’ve added a note to the bottom of the post to clarify my intended meaning. I am not arguing for it in a well-backed up way, just stating the output of my models from being fairly close to the situation and having watched a different successful mediation.
Forced or badly done mediation does indeed seem terrible. Entering into a conversation facilitated by someone skilled, with an intent to genuinely understand the harms caused and make sure you correct the underlying patterns, seems much less bad than the actual way the situation played out.
I was asked to comment by Ben earlier, but have been juggling more directly impactful projects and retreats. I have been somewhat close to parts of the unfolding situation, including spending some time in person with Alice, Chloe, and (separately) the Nonlinear team, and communicating online on-and-off with most parties.
I can confirm some of the patterns Alice complained about, specifically not reliably remembering or following through on financial and roles agreements, and Emerson being difficult to talk to about some things. I do not feel notably harmed by these, and was able to work them out with Drew and Kat without much difficulty, but it does back up my perception that there were real grievances which would have been harmful to someone in a less stable position. I also think they’ve done some excellent work, and would like to see that continue, ideally with clear and well-known steps to mitigate the kinds of harms which set this in motion.
I have consistently attempted to shift Nonlinear away from what appears to me a wholly counterproductive adversarial emotional stance, with limited results. I understand that they feel defected against, especially Emerson, but they were in the position of power and failed to make sure those they were working with did not come out harmed, and the responses to the initial implosion continued to generate harm and distraction for the community. I am unsettled by the threat of legal action towards Lightcone and by the focus on controlling the narrative rather than repairing damage.
Emerson: You once said one of the main large failure modes you were concerned about becoming was Stalin’s mistake: breaking the networks of information around you so you were unaware things were going so badly wrong. My read is you’ve been doing this in a way which is a bit more subtle than the gulags, by the intensity of your personality shaping the fragments of mind around you to not give you evidence that in fact you made some large mistakes here. I felt the effects of this indirectly, as well as directly. I hope you can halt, melt, and catch fire, and return to the effort as someone who does not make this magnitude of unforced error.
You can’t just push someone who is deeply good out of a movement which has the kind of co-protective nature of ours, in the way you merely shouldn’t in some parts of the world. If there’s intense conflict, call in a mediator and try to heal the damage.
Edit: To clarify, this is not intended as a blanket endorsement of mediation, or of avoiding other forms of handling conflict. I do think that going into a process where the parties genuinely try and understand each other’s worlds much earlier would have been much less costly for everyone involved as well as the wider community in this case, but I can imagine mediation is often mishandled or forced in ways which are also counterproductive.
How can I better recruit attention and resources to this topic?
Consider finding an event organizer/ops person and running regular retreats on the topic. This will give you exposure to people in a semi-informal setting, and help you find a few people with clear thinking who you might want to form a research group with, and can help structure future retreats.
I’ve had great success with a similar approach.
We’re getting about 20k uniques/month across the different URLs, and expect that to get much higher once we make a push for attention: when Rob Miles passes us for quality, we’ll launch to LW and then in videos.
https://www.equistamp.com/evaluations has a bunch, including an alignment knowledge one they made.