Focus on the places where you feel shocked everyone’s dropping the ball
Writing down something I’ve found myself repeating in different conversations:
If you’re looking for ways to help with the whole “the world looks pretty doomed” business, here’s my advice: look around for places where we’re all being total idiots.
Look for places where everyone’s fretting about a problem that some part of you thinks you could obviously just solve.
Look around for places where something seems incompetently run, or hopelessly inept, and where some part of you thinks you can do better.
Then do it better.
For a concrete example, consider Devansh. Devansh came to me last year and said something to the effect of, “Hey, wait, it sounds like you think Eliezer does a sort of alignment-idea-generation that nobody else does, and he’s limited here by his unusually low stamina, but I can think of a bunch of medical tests that you haven’t run, are you an idiot or something?” And I was like, “Yes, definitely, please run them, do you need money”.
I’m not particularly hopeful there, but hell, it’s worth a shot! And, importantly, this is the sort of attitude that can lead people to actually trying things at all, rather than assuming that we live in a more adequate world where all the (seemingly) dumb obvious ideas have already been tried.
Or, this is basically my model of how Paul Christiano manages to have a research agenda that seems at least internally coherent to me. From my perspective, he’s like, “I dunno, man, I’m not sure I can solve this, but I also think it’s not clear I can’t, and there’s a bunch of obvious stuff to try, that nobody else is even really looking at, so I’m trying it”. That’s the sort of orientation to the world that I think can be productive.
Or the shard theory folks. I think their idea is basically unworkable, but I appreciate the mindset they are applying to the alignment problem: something like, “Wait, aren’t y’all being idiots, it seems to me like I can just do X and then the thing will be aligned”.
I don’t think we’ll be saved by the shard theory folk; not everyone audaciously trying to save the world will succeed. But if someone does save us, I think there’s a good chance that they’ll go through similar “What the hell, are you all idiots?” phases, where they autonomously pursue a path that strikes them as obviously egregiously neglected, to see if it bears fruit. (Regardless of what I think.)
Contrast this with, say, reading a bunch of people’s research proposals and explicitly weighing the pros and cons of each approach so that you can work on whichever seems most justified. This has more of a flavor of taking a reasonable-sounding approach based on an argument that sounds vaguely good on paper, and less of a flavor of putting out an obvious fire that for some reason nobody else is reacting to.
I dunno, maybe activities of the vaguely-good-on-paper character will prove useful as well? But I mostly expect the good stuff to come from people working on stuff where a part of them sees some way that everybody else is just totally dropping the ball.
In the version of this mental motion I’m proposing here, you keep your eye out for ways that everyone’s being totally inept and incompetent, ways that maybe you could just do the job correctly if you reached in there and mucked around yourself.
That’s where I predict the good stuff will come from.
And if you don’t see any such ways?
Then don’t sweat it. Maybe you just can’t see something that will help right now. There don’t have to be ways you can help in a sizable way right now.
I don’t see ways to really help in a sizable way right now. I’m keeping my eyes open, and I’m churning through a giant backlog of things that might help a nonzero amount—but I think it’s important not to confuse this with taking meaningful bites out of a core problem the world is facing, and I won’t pretend to be doing the latter when I don’t see how to.
Like, keep your eye out. For sure, keep your eye out. But if nothing in the field is calling to you, and you have no part of you that says you could totally do better if you deconfused yourself some more and then handled things yourself, then it’s totally respectable to do something else with your hours.
If you don’t have an active sense that you could put out some visibly-raging fires yourself (maybe after skilling up a bunch, which you also have an active sense you could do), then I recommend stuff like cultivating your ability to get excited about things, and doing other cool stuff.
Sure, most stuff is lower-impact than saving the world from destruction. But if you can be enthusiastic about all the other cool ways to make the world better off around you, then I’m much more optimistic that you’ll be able to feel properly motivated to combat existential risk if and when an opportunity to do so arises. Because that opportunity, if you get one, probably isn’t going to suddenly unlock every lock on the box your heart hides your enthusiasm in, if your heart is hiding your enthusiasm.
See also Rob Wiblin’s “Don’t pursue a career for impact — think about the world’s most important, tractable and neglected problems and follow your passion.”
Or the Alignment Research Field Guide’s advice to “optimize for your own understanding” and chase the things that feel alive and puzzling to you, as opposed to dutifully memorizing other people’s questions and ideas. “[D]on’t ask ‘What are the open questions in this field?’ Ask: ‘What are my questions in this field?’”
I basically don’t think that big changes come from people who aren’t pursuing a vision that some part of them “believes in”, and I don’t think low-risk, low-reward, modest, incremental help can save us from here.
To be clear, when I say “believe in”, I don’t mean that you necessarily assign high probability to success! Nor do I mean that you’re willing to keep trying in the face of difficulties and uncertainties (though that sure is useful too).
English doesn’t have great words for me to describe what I mean here, but it’s something like: your visualization machinery says that it sees no obstacle to success, such that you anticipate either success or getting a very concrete lesson.
The possibility seems open to you, at a glance; and while you may suspect that there’s some hidden reason that the possibility is not truly open, you have an opportunity here to test whether that’s so, and to potentially learn why this promising-looking idea fails.
(Or maybe it will just work. It’s been known to happen, in many a scenario where external signs and portents would have predicted failure.)
This matches my internal experience that caused me to bring a ton of resources into existence in the alignment ecosystem (with various collaborators):
aisafety.info—Man, there really should be a single point of access that lets people self-onboard into the effort. (Helped massively by Rob Miles’s volunteer community, soon to launch a paid distillation fellowship)
aisafety.training—Maybe we should have a unified place with all the training programs and conferences so people can find what to apply to? (AI Safety Support had a great database that just needed a frontend)
aisafety.world—Let’s make a map of everything in AI existential safety so people know what orgs, blogs, funding sources, resources, etc exist, in a nice sharable format. (Hamish did the coding, Superlinear funded it)
ea.domains—Wow, there sure are a lot of vital domains that could get grabbed by squatters. Let’s step in and save them for good orgs and projects.
aisafety.community—There’s no up-to-date list of online communities. This is an obvious missing resource.
Rob Miles videos are too rare, almost entirely bottlenecked on the research and scriptwriting process. So I built some infrastructure that lets volunteers collaborate in teams on scripts for him; it’s being tested now.
Ryan Kidd said there should be a nice professional site listing all the orgs, in a format that helps people leaving SERI MATS decide where to apply. aisafety.careers is my answer, though it’s not quite ready yet. Volunteers are wanted to help write up descriptions for orgs in the Google Docs we have auto-syncing with the site!
Nonlinear wanted a prize platform, and that seemed like a useful way to put the firehose of money to use while FTXFF was still a thing, so I built Superlinear.
There is a lot of obvious low-hanging fruit here. I need more hands. Let’s make a monthly call and project database so I can easily pitch these to all the people who want to help save the world and don’t know what to do. (A bunch of great devs joined!)
and 6+ more major projects as well as a ton of minor ones, but that’s enough to list here.
I do worry I might be neglecting my actual highest EV thing though, which is my moonshot formal alignment proposal (low chance of the research direction working out, but much more direct if it does). Fixing the alignment ecosystem is just so obviously helpful though, and has nice feedback loops.
I’ve kept updating in the direction of: do a bunch of little things that don’t seem blocked/tangled on anything, even if they seem trivial in the grand scheme of things. In the process of doing those, you will free up memory and learn a bunch about the nature of the bigger things that are blocked, while simultaneously revving up your own success spiral and bias toward action.
Yeah, that makes a lot of sense and fits my experience of what works.
I like this post, with one exception: I don’t think putting out fires feels like putting out fires. I think it feels like being utterly confused, and, when you explain the confusion and people try to resolve it but you don’t understand them, continuing to actively notice and chase the confusion rather than nodding along, no matter how much people dock your status for your inability to understand what they’re saying. It feels far more similar to going to school wearing a clown suit than to heroically putting out obvious-to-you fires.
Upvoted but strong disagree.
I think “putting out fires” has more of the correct connotations—insofar as I’m correctly identifying what Nate means, it feels more like defiance and agency than anything about status. I know full well that most of the fires I’m addressing/have addressed are not considered fires by other people (or they would’ve put them out already)! It feels like being infuriated that no one is doing the obvious thing and that almost everyone I talk to is horribly unreasonable about this, so it’s time to roll up my sleeves and get to work.
On the other hand, I think going to school wearing a clown suit has many wrong connotations. It brings in feelings of shame and self-consciousness, when the appropriate emotion is (imo) defiance and doing the blazing obvious thing! I don’t think the Shard Theory folk think they are wearing a clown suit; in my interactions with him I feel like Alex Turner tends to be more defiant or infuriated than self-conscious. (Feel free to correct me if this is not the case!)
Shard theory did have some clown suit energy at first. Shard theory / disagreeing strongly with Eliezer (!) felt like wearing a clown suit, to some part of me, but the rest of me didn’t care.
I also felt something like “if I can’t communicate these ideas or am not willing to burn status to get eyes on them, there was no point in my having had status anyways.” From Looking back on my alignment PhD:
These days, I do feel more defiant/irritated, and not like I’m wearing a clown suit.
“Is there really something there with shard theory?” does not feel like a live question to me anymore, because it resolved “yes”, in my view. But I also have closed off the more optimistic ends of my uncertainty, where I thought there was a ~5% chance of quickly and knowably-to-me solving alignment.
Yeah, I resonate very strongly with this feeling as well! The whole reason to have generic resources is to spend them on useful things!
Upvoted but disagreed. It isn’t my model of what putting out fires feels like most of the time but I’m not sure, it’s plausible, and if it’s true it’s important.
It also makes me think that maybe it’s super, critically important to have social norms that make wearing a clown suit not so bad. There are downsides to this of course but if the importance of wearing a clown suit is that high it probably outweighs the downsides enough such that the optimal point on the spectrum is pretty close to “not too uncomfortable wearing the suit”.
Sometimes people are confused because their model is worse than everyone else’s (i.e. of everyone involved in a given situation). Sometimes people are confused because their model is better… and they noticed a problem that others do not see yet, but they do not yet know how to solve it themselves.
What you describe sounds to me like someone who sees a problem, cannot solve it fully, but at least has a few guesses about how to reduce it in the short term, so they keep doing at least that. Other people either genuinely do not see the problem, or have a vague idea but also see very clearly the status cost of acting worried while everyone else remains calm.
This was a big driver behind my vegan nutrition project: I could not believe people were strongly pushing drastic diet changes without any concern for nutrition, when the mitigations were costless morally and almost so resource-wise.
Ooh, this could be useful to me, thank you!
I adore this post.
Basically everything I’ve done that I think has been helpful at all has been the result of chasing the things that feel alive and puzzling to me.
When I feel stagnated, I very often find that it’s because I’ve been thinking too much in the frame of “the alignment problem as other people see it”.
What sort of tests might these be, can you say? Eliezer is certainly not the only one with “low stamina” problems, and if there are medical tests to run that he wouldn’t already have had done, I’d like to know about them!
This list is quite good—https://mecfsroadmap.altervista.org/ Feel free to DM me if you want to chat more.
Me too! Is there a list somewhere of ‘tests to run in case of fatigue/low energy’?
There are thyroid, diabetes, anaemia, and sleep apnoea, and…
I think this is related to my relative optimism about people spending time on approaches to alignment that are clearly not adequate on their own. It’s not that I’m particularly bullish on the alignment schemes themselves; it’s that I don’t think I’d realized, until reading this post, that I had been assuming we all understood that we don’t know wtf we’re doing, and so the most important thing is that we all keep an eye out for more promising threads (or ways to support the people following those threads, or places where everyone’s dropping the ball on being prepared for a miracle, or whatever). Is this… not what’s happening?
Not by default.
I did not have this mindset right away. When I was new to AI Safety I thought it would require much more experience before I was qualified to question the consensus, because that is the normal situation in all the old sciences. I knew AI Safety was young, but I did not understand the implications at first. I needed someone to prompt me to get started.
Because I’ve run various events and co-founded AI Safety Support, I’ve talked to loooots of AI Safety newbies. Most people are too cautious when it comes to believing in themselves and too ready to follow authorities. It usually takes only a short conversation pointing out how incredibly young AI Safety is, and what that means, but many people do need this one push.
I don’t have as strong intuitions about this as So8res does, but I do think this is a useful heuristic. The post feels both useful epistemically, and motivationally. I liked reading comments from other people describing plans they embarked on because it seemed like “holy shit, is nobody doing this?”
I do kinda wish I had more than vague intuitions backing this up. Ironically, while I’d be interested in someone studying “What motivations tend to drive the largest effect sizes on humanity? How do you control for survivorship bias? Is So8res right about this being a useful prompt?”… it neither gives me a strong sense of “geez, everyone is idiotically dropping the ball on this, I believe in my heart this is the best thing” nor seem really like the top result of a measured, careful spreadsheet of possible goals.
(But, if you’re reading this and thinking either “man, people are idiotically dropping the ball not having done a rigorous analysis of this” or “man, I think So8res is wrong that you need to believe in your goals for them to be particularly useful, and my careful spreadsheet of goals says that measuring this effect is the best use of my time”, um, I’m interested in what you find)
FWIW, I think questions like “what actually causes globally consequential things to happen or not happen” are one of the areas in which we’re most dropping the ball. (AI Impacts has been working on a few related questions, more like “why do people sometimes not do the consequential thing?”)
I think it’s good to at least spot check and see if there are interesting patterns. If “why is nobody doing X???” is strongly associated with large effects, this seems worth knowing, even if it doesn’t constitute a measure of expected effect sizes.
For the record: The kind of internal experience you describe is a good description of how things currently feel to me (when I look at alignment discourse).
My internal monologue is sort of like:
I’m working on posts with more well-developed versions of these ideas, where I also try to explain things better and more quickly than I’ve done previously. In the meantime, the best summary that I currently can point people to are these tweet-threads:
I’ve always wondered about things in this general area. Higher levels of action that improve the productivity of alignment researchers (well not just researchers, anyone in the field) seems like a very promising avenue to explore.
For example, I know that for me personally, “dealing with dinner” often takes way longer than I hope, consumes a lot of my time, and makes me less productive. That’s a problem that could easily be solved with money (which I’m working towards). Do alignment researchers also face that problem? If so it seems worth solving.
Continuing that thought, some people find cooking to be relaxing and restorative but what about things like cleaning, paperwork, and taxes? Most people find that to be somewhat stressful, right? And reducing stress helps with productivity, right? So maybe some sort of personal assistant a la The 4-Hour Work Week for alignment researchers would make sense.
And for medical stuff, some sort of white glove membership like what Tim Urban describes + resurrecting something like MetaMed to be available as a service for higher-impact people like Eliezer also sounds like it’d make sense.
Or basically anything else that can improve productivity. I was gonna say “at a +ROI” or something, but I feel like it almost always will be. Improved productivity is so valuable, and things like personal assistants are relatively so cheap. It reminds me of something I heard once about rich businesspeople needing private yachts: if the yacht leads to just one more closed deal at the margin then it paid for itself and so is easily worth it. Maybe alignment researchers should be a little more “greedy” in that way.
A different way to improve productivity would be through better pedagogy. Something I always think back to is that in dath ilan “One hour of instruction on a widely-used subject got the same kind of attention that an hour of prime-time TV gets on Earth”. I don’t get the sense that AI safety material is anywhere close to that level. Bringing it to that point would mean that researchers—senior, junior, prospective—would have an easier time going through the material, which would improve their productivity.
I’m not sure how impactful it would be to attract new researchers vs empowering existing ones, but if attracting new researchers is something that would be helpful I suspect that career guidance sorts of things would really yield a lot of new researchers.
Well, I had “smart SWE at Google who is interested in doing alignment research” in mind here. Another angle is recruiting top mathematicians and academics like Terry Tao. I know that’s been discussed before and perhaps pursued lightly, but I don’t get the sense that it’s been pursued heavily. Being able to recruit people like Terry seems incredibly high impact though. At the very least it seems worth exploring the playbooks of people in related fields like executive recruiters and look for anything actionable.
Probably more though. If you try to recruit an individual like Terry there’s an X% chance of having a Y% impact. OTOH, if you come across a technique regarding such recruitment more generally, then it’s an X% chance of finding a technique that has a Y% chance of working on Z different people. Multiplying by Z seems kinda big, and so learning how to “do recruitment” seems pretty worthwhile.
A lot of this stuff requires money. Probably a lot of it. But that’s a very tractable problem, I think. And maybe establishing that ~$X would yield ~Y% more progress would inspire donors. Is that something that has been discovered before in other fields?
Or maybe funding is something that is already in abundance? I recall hearing that this is the case and that the limitation is ideas. That never made sense to me though. I see a lot of things like those white glove medical memberships that seem obviously worthwhile. Are all the alignment researchers in NYC already members of The Lanby for $325/month? Do they have someone to clean their apartments? If not and if funding truly is abundant, then I “feel shocked that everyone’s dropping the ball”.
Funding is not truly abundant.
There are people who have above zero chance of helping that don’t get upskilling grants or research grants.
There are several AI Safety orgs that are for profit in order to get investment money, and/or to be self sufficient, because given their particular network, it was easier to get money that way (I don’t know the details of their reasoning).
I would be more efficient if I had some more money and did not need to worry about budgeting in my personal life.
I don’t know to what extent this is due to the money not existing, or to grant evaluation being hard and there being some reasons not to give out money too easily.
Cooking is a great example. People eat every day; even small costs (both time and money) are multiplied by 365. Rationalists in the Bay Area are likely to either live together or work together, so the distribution could also be trivial: bring your lunch box to work. So if you are bad at research but good at cooking, you could contribute indirectly by preparing tasty and healthy meals for the researchers.
(Possible complications: some people would want vegan meals, or paleo meals, could have food allergies, etc. Still, if you cooked for 80% of them, that could make them more productive.)
Or generally, thinking about things, and removing trivial inconveniences. Are people more likely to exercise during a break, if you bring them some weights?
Sometimes money alone is not enough, because you still have the principal-agent problem.
Yeah, the important thing, if he was approached and refused, would be to know why. Then maybe we can do something about it, and maybe we can’t. But if we approach 10 people, hopefully we will be able to make at least one of them happy somehow.
Ah, great point. That makes a lot of sense. I was thinking about things that are known to be important, like exercise and sleep, but wasn’t really seeing ways to help people with them; trivial inconveniences, though, seem like a problem that people have and that is worth solving. I’d think the first step would be either a) looking at existing research/findings for what these trivial inconveniences are likely to be, or maybe b) user interviews.
Yes, absolutely. It reminds me a little bit of Salesforce. Have a list of leads; talk to them; or the ones that don’t work out add notes discussing why; over time go through the notes and look for any learnings or insights. (I’m not actually sure if salespeople do this currently.)
Maybe not everyone
The Productivity Fund (nonlinear.org)
Although this project has been “Coming soon!” for several months now. If you want to help with the non-dropping of this ball, you could check in with them to see if they could use some help.
I feel shocked that so little effort is being put into human genetic enhancement, relative to its potential. Everyone here seems focused on AI!
Strongly agree with this one. It’s pretty clear from plant breeding and husbandry that one can push any given trait tens of standard deviations from its natural mean even just using brain-dead selective breeding techniques. Research from Shai Carmi, Steve Hsu and others has shown that most traits are relatively independent of one another (meaning most alleles that affect one trait don’t affect another trait). And most genes have a linear effect: they increase or decrease some trait by an amount, and don’t require some gene-gene interaction term to model.
Together these suggest that we could likely increase positive traits in humans such as prosocial behavior, intelligence, health and others by gigantic amounts simultaneously.
This is already possible to a limited degree using IVF and polygenic predictors that are currently available. A gain of perhaps 0.2-1 standard deviations on a variety of traits is feasible using simple embryo selection alone.
I’ve been working on a guide for people to do this for almost a year now. It has been an incredibly involved research project, mostly because I’ve spent a huge amount of time trying to quantify which IVF clinics in the US are best and how large of an advantage picking a really good one can give you.
Embryo selection to reduce disease, increase intelligence, and reduce dark tetrad traits in future generations is just such an obvious no-brainer. The expected medical savings alone are in the hundreds of thousands of dollars. Throw in extra earnings from a higher IQ, and reduced societal costs from less crime, greater community bonds etc and you may understand why I think genetic engineering holds such incredible promise.
Not enough time. I was researching intelligence enhancement of adults via genetic engineering delivered by viral vectors. Then my timelines for AGI got shorter than my time estimates for having a testable prototype. I switched to directly studying machine learning. That was like 7 years ago. We probably have less than 5 years left. That’s not much time for genetic engineering projects.
That’s probably true. I’m taking a gamble that only pays off in worlds where biological brains matter for at least another 30 years. But given the size of the potential impact and the neglectedness, I think it is a gamble worth taking.
I would be completely on-board with this if there was a method of improvement other than IVF embryo selection, since I consider human embryos to have moral value. Even if you don’t, unless you’re very sure of your position, I’d ask you to reconsider on the basis of the precautionary principle alone—i.e. if you’re wrong, then you’d be creating a huge problem.
I’d give us 50% odds of developing the technology capable of human genetic enhancement without excess embryos in the next decade. Editing looks like the most plausible candidate, though chromosome selection also looks pretty feasible.
I’ve given a lot of thought to the question of whether discarding embryos is acceptable. Maybe I’ll write a post about this at some point, but I’ll try to give a quick summary:
At the time of selection, human embryos have about 100 cells. They have no brain, no heart, and no organs. They don’t even have a nervous system. If they stopped development and never grew into humans, we would give them zero moral weight. Unless you believe that the soul enters the embryo during fertilization, the moral importance of an embryo is entirely down to its potential to develop into a human.
The potential of any given pairing of egg and sperm is almost unchanged after fertilization. A given pairing of sperm and egg will produce the same genome every time. I don’t see a clear line at fertilization regarding the potential of a particular sperm/egg coupling.
Roughly a third of regular non-IVF pregnancies end in miscarriage; usually before the mother even knows she’s pregnant. The rate of miscarriage approaches 100% towards a woman’s late 40s. If embryos are morally equivalent to babies, there is a huge ongoing preventable moral disaster going on during normal conception, to the point where one could make a case that unprotected sex between 40 and menopause is immoral.
Question: can we ever get somatic gene editing that is as good as or better than editing the gametes?
No. You might get something that works, but it will never be as good as intervening at the gamete or embryo stage, simply because half the genes you’d want to change are only active during development (i.e. before adulthood).
To get the same effects you would need really crazily advanced biotech that could somehow edit the genes and replay the developmental stage of life without interfering with the current functioning of the organism. I don’t see anything like that being developed in the next 50 years without some kind of strong intelligence (whether artificial or biological in nature).
Then we have the reason why so little effort is going into genetic engineering on LW: the viable options here are way too slow, and the fast options are very weak, relative to how fast the world is changing.
Thus, it’s worth waiting so that we can reconsider our options later.
Yes, I completely understand why there is MORE interest in alignment, engineered pandemics and nuclear war. I think that is correct. But I don’t think the balance is quite right. Genetic engineering could be a meta-level solution to all those problems given enough time.
That seems like something worth working on for a larger chunk of people than those currently involved.
We don’t have enough time, and by the time the relevant amount of time has passed, AI will have blasted genetic augmentation into a new era. Existential AI alignment is necessary to do any significant amount of genetic modding.
I disagree. I could do a moderate but substantial amount of human genetic engineering right now, if I had more resources and if the police wouldn’t arrest me. AI is not required for this.
Can we do genetic engineering that is immediately useful, as opposed to “at a minimum, wait ~10 years for an infant to become Ender Wiggin”?
Given the responses to a similar question, I think the answer is no; that is, I would expect basically no genetic editing/IVF breakthroughs to transfer to the somatic cells.
No, probably not. But I think it’s still a good idea that most people are ignoring.
I think that’s a fine position, but it doesn’t seem to be addressing gears’ point. (“We don’t know for sure how much time we have, and this seems like a thing that’s worth working on” seems like a fine answer though.)
One piece of advice/strategy I’ve received that’s in this vein is “maximize return on failure”. So prefer to fail in ways that you learn a lot; to fail quickly, cheaply, and conclusively; and to produce positive externalities from failure. This is not so much a search strategy as a guiding principle and selection heuristic.
https://twitter.com/carmenleelau/status/1593354133146402816 is another recent formulation of ~the same idea.
Meta note: I strongly dislike Twitter and wish that people would just copy the raw text they want to share instead of a link.
Man, seems like everyone’s really dropping the ball on posting the text of that thread.
(There’s an image of an exasperated-looking Miyazaki.)
Not that I’ve paid much attention, but has anyone checked him for iron overload? Insulin resistance? Thyroid?
Relatedly, on “obviously dropping the ball”: has Eliezer tried harder prescription stimulants? With his P(doom) and timelines, there’s relatively little downside to trying this in reasonable quantities, I think. They can be prescribed, and they seem extremely likely to help with fatigue.
From what I’ve read, the main warning would be to set up harder blocks on whatever sidetracks Eliezer (e.g., use friends to limit access, give a child lock to a trusted person, etc.).
Seems like this hasn’t been tried much beyond a basic level, and I’m really curious why not, given Eliezer’s and Nate’s high P(doom)s. There are several famously productive researchers who did this.
I find this post very encouraging, but I can’t shake a particular concern about the approach that it recommends.
From extrapolating past experiences, it seems like every time I try (or even succeed) at something ambitious, I soon find that somebody else already did that thing, or proved why that thing can’t work, and they did it better than I would have unless I put in ten times as much effort as I did. In other words, I struggle to know what’s already been done.
I notice that this happens a lot less often with mathematics than it used to. Perhaps part of it is that I became less ambitious, but I also think that part of it was formal education. (I finished a BS in math a few years ago.) I do think one of the major benefits of formal education is that it gives the student a map of the domain they’re interested in, so that they can find their way to the boundary with minimal wasted effort.
Thank you so much for this effectiveness-focused post. I thought I would add another perspective, namely an “against the lone wolf” view, i.e., against the assumption that AI safety will come down to one person, a few persons, or an elite group of engineers somewhere. I agree that for now there are some individuals doing more conceptual AI framing than others, but in my view I am “shocked that everyone’s dropping the ball” by putting up walls and saying that the general public is not helpful. Yes, they might not be helpful now, but we need to work on this! Maybe someone with the right skill will come along. :)
I also view academia as almost hopeless (it’s where I work). But it feels like, if a few of us can get some stable jobs/positions/funding, we can start being politically active within academia, and the return on investment there could be tremendous.
Thank you for this post. I had an idea for how to work on alignment that seemed obvious to me, but wasn’t sure if it would pan out. Now I’ll go write up my probably wrong idea :)
Maybe I don’t know what I’m talking about and obviously we’ve tried this already.
I’ve heard Eliezer mention that the ability to understand AI risk is linked to Security Mindset.
Security Mindset is basically: you can think like a hacker, of exploits and how to abuse rules, etc. So you can defend against hacks and exploits. You don’t stop at a basic “looks safe to me!”
There are a lot of examples of this Security/Hacker Mindset in HPMOR. When Harry learns the exchange rates between magical coins and compares them with his known prices for gold, silver, etc., he instantly thinks of a scheme to trade between the magical and muggle worlds to make infinite money.
Eliezer also said that Security Mindset is something you’ve either got or you don’t.
I remember thinking: that can’t be true!
Are we bottlenecking AI alignment on not having enough people with Eliezer-level Security Mindset, and saying “Oh well, it can’t be taught!”?!
(That’s where I’ve had the “people are dropping the ball” feeling. But maybe I just don’t know enough.)
Two things seem obvious to me:
- Couldn’t one devise a Security Mindset test, and get the high scorers to work on alignment?
(So even if we can’t teach it, we get more people who have it.) (I assume a similar process was used to find Superforecasters.)
- Have we already tried really hard to teach Security Mindset, so that we’re sure it can’t be taught?
Presumably, Eliezer did try, and concluded it wasn’t teachable?
I won’t be the one doing this, since I’m unclear on whether I’m Security-gifted myself (I think a little, and I think more than I used to, but I’m too low-g to play high-level games).
Mm, that’s not exactly how I’d summarize it. That seems more like ordinary paranoia:
My understanding is that Security Mindset-style thinking doesn’t actually rest on your ability to invent a workable plan of attack. Instead, it’s more like imagining that there exists a method for unstoppably breaking some (randomly chosen) element of your security, and then figuring out how to make your system secure despite that. Or that it’s something like the opposite of fence-post security, where you’re trying to make sure that for your system to be broken, several conditionally independent things need to go wrong or be wrong.
OK, thanks for the correction! My definition was wrong, but the argument still stands that it should be teachable, or at least testable.