Making a research platform for AI Alignment at https://ai-plans.com/
Come critique AI Alignment plans and get feedback on your alignment plan!
Yup, that’s definitely something that can be argued by people Against during the Debate Stage!
And they might come to the same conclusion!
I’d also read Elementary Analysis before.
I’m not a grad physics student (I don’t have a STEM degree or the equivalent), but I found the book very readable nonetheless. It’s by far my favourite textbook; it feels like it was actually written by someone sane, unlike most.
I’m really glad you wrote this!
I think you address an important distinction there, but there might be a further one to be made: how we measure or tell whether a model is aligned in the first place.
There seems to be a growing view that if a model’s output looks like the output we might expect from an aligned AI, then it’s aligned.
I think it’s important to distinguish that from the idea that the model is aligned if you actually have a strong idea of what its values are, how it got them, etc.
I’m really excited to see this!!
I’d like it if this became embeddable so it could be used on ai-plans.com and on other sites!!
Goodness knows, I’d like to be able to get summaries and answers to obscure questions on some Alignment Forum posts!
What do you think someone who knows about PDP knows that someone with a good knowledge of DL doesn’t?
And why would it be useful?
I think folks in AI Safety tend to underestimate how powerful and useful liability and an established duty of care would be for this.
I think calling things a ‘game’ makes sense to LessWrongers, but it just seems unserious to non-LessWrongers.
I don’t think a lack of IQ is the reason we’ve been failing to make AI sensibly. Rather, it’s a lack of well-designed incentives.
Making AI recklessly is currently much more profitable than not doing so, which, imo, shows a flaw in the efforts that have gone towards making AI safe: not accepting that some people have a very different mindset/beliefs/core values, and not figuring out a structure/argument that would incentivize people of a broad range of mindsets.
Hasn’t Eliezer Yudkowsky largely failed at solving alignment and at getting others to solve alignment?
And wasn’t he largely responsible for many people noticing that AGI is possible and potentially highly fruitful?
Why would a world where he’s the median person be more likely to solve alignment?
Update: Rob Miles will also be judging some critiques! He’ll be judging Communication!
Hi, I’m Kabir Kumar, the founder of AI-Plans.com. I’m happy to answer any questions you might have about the site or the Critique-a-Thon!
Hi, we’ve already made a site which does this!
It’s probably much better for health overall to have a bowl of veg and fruit at your table for easy, healthy snacking (carrots, cucumber, etc.).
Most of my knowledge of dependencies and addictions comes from a brief study I did in school, for an EPQ, on neurotransmitters’ roles in alcohol dependence/abuse, so I’m really not sure how much of this applies. Also, a lot of my study was finding that my assumptions were in the wrong direction (I didn’t know about endorphins). But I think a lot of the stuff on neurotransmitters and receptors holds across different areas; take it with some salt, though.
Quitting cold turkey rarely works for addictions/dependencies. The vast majority of the time, the person has a big resurgence of the addiction.
The balance of dopamine and the sensitivity of the dopamine receptors often take time to shift back.
For this reason, I think, tapering has been one of the most reliable ways of recovering from an addiction/dependence. I believe it’s been shown to have a 70% success rate.
Interestingly, the first study I found on tapering, which tested tapering strips as an aid to discontinuing antidepressant use, also reports 70%: https://journals.sagepub.com/doi/full/10.1177/20451253211039327
Every site I read on reducing alcohol dependency with tapering said something similar, back in the day.
When I say media, I mean social media, movies, videos, books, etc.: any type of recording or similar thing that you’re consuming as entertainment.
I’m trying this myself. I’ve done single days before, sometimes 2 or 3 days, but failed to keep it consistent. I did find that when I did it, my work output was far higher and of greater quality, I had a much better sleep schedule, and I was generally in a much more enjoyable mood.
I also ended up spending more time with friends and family, meeting new people, trying interesting things, spending time outdoors, etc.
This time I’m building up to it, starting with 1 media-free hour a day, then 2 hours, then 3, etc.
I think building up to it will let me build new habits which will stick more.
A challenge for folks interested: spend 2 weeks without media-based entertainment.
“CESI’s Artificial Intelligence Standardization White Paper released in 2018 states that “AI systems that have a direct impact on the safety of humanity and the safety of life, and may constitute threats to humans” must be regulated and assessed, suggesting a broad threat perception (Section 4.5.7).42 In addition, a TC260 white paper released in 2019 on AI safety/security worries that “emergence” (涌现性) by AI algorithms can exacerbate the black box effect and “autonomy” can lead to algorithmic “self-improvement” (Section 3.2.1.3).43”
From https://concordia-consulting.com/wp-content/uploads/2023/10/State-of-AI-Safety-in-China.pdf
I disagree with this paragraph today: “A lot of what AI does currently, that is visible to the general public seems like it could be replicated without AI”
Thank you! Changed it to that!