Born too late to explore Earth; born too early to explore the galaxy; born at just the right time to save humanity.
Ulisse Mini
Finally Entering Alignment
Thanks! I will definitely read those!
Thanks! Some other people recommended the Atlas Fellowship and I’ve applied. Regarding (9) I think I worded it badly, I meant reach out to local politicians (I thought the terms were interchangeable)
Read it; that study guide is really good. It really motivates me to branch out, since I've definitely overfocused on depth before and not done enough applications/"generalizing"
This also reminds me of Miyamoto Musashi’s 3rd principle: Become acquainted with every art
Noted
Don’t be afraid of the thousand-year-old vampire
Yeah, I read about a third of the proof of Cox's theorem until I realized that even if I followed every step I wouldn't gain any intuition from it, then I skipped the rest
Some realizations about memory and learning I’ve been thinking about recently EDIT: here are some great posts on memory which are a deconfused version of this shortform (and written by EY’s wife!)
Anki (and SRS in general) is a tool for efficiently writing directed-graph edges to the brain. Thinking about encoding knowledge as a directed graph can help with making good Anki cards.
Memory techniques are somewhat-analogous to data structures as well, e.g. the link method corresponds to a doubly linked list
“Memory techniques” should be called “Memory principles” (or even laws).
The “Code is Data” concept makes me realize memorization is more widely applicable, you could e.g. memorize the algorithm for integration in calculus. Many “creative” processes like integration can be reduced to an algorithm.
Truly Part of You is not orthogonal to memory
Regarding principles (not techniques): mnemonics use the fact that a densely connected graph is less likely to become disconnected by randomly deleting edges, similar to how the link and story methods work. Just because you aren't making silly images doesn't mean you aren't using the principles. (Untested idea for math): Journal about your thought processes after solving each problem, then generalize to form a problem-solving algorithm/checklist and memorize the algorithm
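The density claim is easy to check numerically. Here's a toy simulation (my own illustration, not from the original; node counts and deletion fractions are arbitrary): a bare chain of associations, like the link method used alone, dies whenever any single edge is forgotten, while a densely connected graph almost always survives the same fraction of deletions.

```python
import random

def is_connected(nodes, edges):
    """Check undirected connectivity with a simple BFS/DFS."""
    adj = {n: set() for n in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen = {nodes[0]}
    frontier = [nodes[0]]
    while frontier:
        n = frontier.pop()
        for m in adj[n]:
            if m not in seen:
                seen.add(m)
                frontier.append(m)
    return len(seen) == len(nodes)

def survival_rate(nodes, edges, delete_fraction, trials=2000, seed=0):
    """Fraction of trials in which the graph stays connected after
    randomly deleting `delete_fraction` of its edges."""
    rng = random.Random(seed)
    k = int(len(edges) * delete_fraction)
    ok = 0
    for _ in range(trials):
        kept = rng.sample(edges, len(edges) - k)  # random edge deletion
        ok += is_connected(nodes, kept)
    return ok / trials

nodes = list(range(8))
# Sparse "link method" chain: exactly one path between any two items.
chain = [(i, i + 1) for i in range(7)]
# Dense graph: every pair of items associated.
dense = [(i, j) for i in range(8) for j in range(i + 1, 8)]

print(survival_rate(nodes, chain, 0.25))  # the chain breaks every time
print(survival_rate(nodes, dense, 0.25))  # the dense graph almost always survives
```

Deleting 25% of a 7-edge chain removes at least one link, which always splits it; disconnecting the dense graph requires deleting all seven edges at one node, which is vanishingly unlikely.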
= finding shortest paths on a weighted directed graph, where the shortest path cost must be below some threshold :)
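That framing can be made concrete with Dijkstra's algorithm: recall succeeds only if some chain of associations from cue to memory stays under an effort threshold. A toy sketch (node names, weights, and the threshold below are all hypothetical):

```python
import heapq

def cheapest_path_cost(graph, start, goal):
    """Dijkstra's algorithm: cost of the cheapest path from start to
    goal in a weighted directed graph, or inf if goal is unreachable."""
    dist = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == goal:
            return cost
        if cost > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, weight in graph.get(node, []):
            new_cost = cost + weight
            if new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                heapq.heappush(heap, (new_cost, neighbor))
    return float("inf")

def can_recall(graph, cue, memory, threshold):
    """Recall succeeds iff some association chain from cue to memory
    has total cost below the threshold."""
    return cheapest_path_cost(graph, cue, memory) < threshold

# Hypothetical knowledge graph: edge weight = retrieval effort
# (lower = stronger association, e.g. one reinforced by Anki reviews).
graph = {
    "integral": [("antiderivative", 1.0), ("area", 2.5)],
    "antiderivative": [("power rule", 1.0)],
    "area": [("power rule", 4.0)],
}

# Cheapest chain: integral -> antiderivative -> power rule, cost 2.0
print(can_recall(graph, "integral", "power rule", threshold=3.0))  # True
```

On this view, spaced repetition lowers edge weights (strengthens associations), which is another way of saying it pulls more memories under the recall threshold.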
I’ll write some posts when I get stuff working, I feel a Sense That More Is Possible in this area, but I don’t want to write stuff till I can at least say it works well for me.
Upvoted because I think there should be more of a discussion around this than "Obviously getting normal people involved will only make things worse" (which seems kind of arrogant / assumes there are no good unknown unknowns)
Yes, I’m not convinced either way myself but here are some arguments against:
If the USA regulates AGI, China will get it first, which seems worse as there's less alignment activity in China (as for US-China coordination: lol, lmao)
Raising awareness of AGI Alignment also raises awareness of AGI. If we communicate the “AGI” part without the “Alignment” part we could speed up timelines
If there’s a massive influx of funding/interest from people who aren’t well informed, it could lead to “substitution hazards” like work on aligning weak models with methods that don’t scale to the superintelligent case (In climate change people substitute “solve climate change” to “I’ll reduce my own emissions” which is useless)
If we convince the public AGI is a threat, there could be widespread flailing (the bad kind) which reflects badly on Alignment researchers (e.g. if DeepMind researchers are receiving threats, their system 1 might generalize to “People worried about AGI are a doomsday cult and should be disregarded”)
Most of these I’ve heard from reading conversations on EleutherAI’s discord, Connor is typically the most pessimistic but some others are pessimistic too (Connor’s talk discusses substitution hazards in more detail)
TLDR: It’s hard to control the public once they’re involved. Climate change startups aren’t getting public funding, the public is more interested in virtue-signaling (In the climate case the public doesn’t really make things worse, but for AGI it could be different)
EDIT: I think I’ve presented the arguments badly, re-reading them I don’t find them convincing. You should seek out someone who presents them better.
Personally my process goes something like:
Click a citation/link on LW that sends me to a sequence post
Read the post, opening any interesting citations in new tabs
Repeat until I run out of time or run out of interesting citations (the latter never happens)
Functional Analysis Reading Group
People Power
To get a sense that more is possible consider
The AI box experiment, and its replication
Mentalists like Derren Brown (which is related to 1)
How the FBI gets hostages back with zero leverage (they aren’t allowed to pay ransoms)
(This is an excerpt from a post I'm writing which I may or may not publish. The link aggregation here might be useful in and of itself)
It would be nice to be able to change 5 minutes to something else. I know this isn't in the spirit of "Try Harder, Luke", but 5 minutes is arbitrary; it could just as easily have been 10 minutes.
[Question] Systems Biology for self study
Interesting, I’m homeschooled (unschooled specifically) and that probably benefited my agency (though I could still be much more agentic). I guess parenting styles matter a lot more than surface-level “going to school”
You’re super brave for sharing this, it’s hard to stand up and say “Yes I’m the stereotypical example of the problem mentioned here”, stay optimistic though; people starting lower have risen higher.
Those who take delight in their own might are merely pretenders to power. The true warrior of fate needs no adoration or fear, no tricks or overwhelming effort; he need not be stronger or smarter or innately more capable than everyone else; he need not even admit it to himself. All he needs to do is to stand there, at that moment when all hope is dead, and look upon the abyss without flinching.
I think even without point #4 you don’t necessarily get an AI maximizing diamonds. Heuristically, it feels to me like you’re bulldozing open problems without understanding them (e.g. ontology identification by training with multiple models of physics, getting it not to reward-hack by explicit training, etc.) all of which are vulnerable to a deceptively aligned model (just wait till you’re out of training to reward-hack). Also, every time you say “train it by X so it learns Y” you’re assuming alignment (e.g. “digital worlds where the sub-atomic physics is different, such that it learns to preserve the diamond-configuration despite ontological confusion”)
IMO shard theory provides a great frame to think about this in, it’s a must-read for improving alignment intuitions.
Regarding priorities between young frontline workers and the at-risk elderly: I hope they're optimizing for saving life-years, not lives (i.e. if a healthy 20-year-old has 60 years ahead of them, and a healthy 70-year-old has 10 years ahead of them, saving the 20-year-old saves 6x as many life-years)
Other than that, interesting post; I'll be keeping an eye on that new strain.