I kind of want you to get quantitative here? Like pretty much every action we take has some effect on AI timelines, but I think effect-on-AI-timelines is often swamped by other considerations (like effects on attitudes around those who will be developing AI).
Of course it’s prima facie more plausible that the most important effect of AI research is the effect on timelines, but I’m actually still kind of sceptical. On my picture, I think a key variable is the length of time between when-we-understand-the-basic-shape-of-things-that-will-get-to-AGI and when-it-reaches-strong-superintelligence. Each doubling of that length of time feels to me like it could be worth on the order of 0.5-1% of the future. Keeping implemented-systems close to the technological-frontier-of-what’s-possible could help with this, and may be more affectable than the
Note that I don’t think this really factors into an argument in terms of “advancing alignment” vs “aligning capabilities” (I agree that if “alignment” is understood abstractly the work usually doesn’t add too much to that). It’s more like a DTD argument about different types of advancing capabilities.
I think it’s unfortunate if that strategy looks actively bad on your worldview. But if you want to persuade people not to do it, I think you either need to persuade them of the whole case for your worldview (for which I’ve appreciated your discussion of the sharp left turn), or to explain not just that you think this is bad, but also how big a deal you think it is. Is this something your model cares about enough to trade for in some kind of grand inter-worldview bargaining? I’m not sure. I kind of think it shouldn’t be (that relative to the size of the ask, you’d get a much bigger benefit from someone starting to work on things you cared about than from someone stopping this type of capabilities research), but I think it’s pretty likely I couldn’t pass your ITT here.
I think FHI was an extremely special place and I was privileged to get to spend time there.
I applaud attempts to continue its legacy. However, I’d feel gut-level more optimistic about plans that feel more grounded in thinking about how circumstances are different now, and then attempting to create the thing that is live and good given that, relative to attempting to copy FHI as closely as possible.
Differences in circumstance
You mention not getting to lean on Bostrom’s research taste as one driver of differences. I think this is correct, but it may be worth tracing out the implications even at the stage of early planning. Other things that seem salient and important to me:
For years, FHI was one of the only places in the world that you could seriously discuss many of these topics
There are now much bigger communities and other institutions where these topics are at least culturally permissible (and some of them, e.g. AI safety, are the subject of very active work)
This means that:
One of FHI’s purposes was serving a crucial niche which is now less undersupplied
FHI benefited from being the obvious Schelling location to go to think about these topics
Whereas even in Berkeley you want to think a bit about how you sit in the ecosystem relative to Constellation (which I think has some important FHI-like virtues, although it makes different tradeoffs and misses out on others)
FHI benefited from the respectability of being part of the University
In terms of getting outsiders to take it seriously, getting meetings with interesting people, etc.
I’m not saying this was crucial for its success, and in any case the world looks different now; but I think it had some real impact and is worth bearing in mind
As you mention—you have a campus!
I think it would be strange if this didn’t have some impact on the shape of plans that would be optimal for you
Pathways to greatness
If I had to guess about the shape of plans that I think you might engage in that would lead to something deserving of the name “FHI of the West”, they’re less like “poll LessWrong for interest to discover if there’s critical mass” (because I think that whether there’s critical mass depends a lot on people’s perceptions of what’s there already, and because many of the people you might most want probably don’t regularly read LessWrong), and more like thinking about pathways to scale gracefully while building momentum and support.
When I think about this, two ideas that seem to me like they’d make the plan more promising (that you could adopt separately or in conjunction) are (1) starting by finding research leads, and/or (2) starting small as-a-proportion-of-time. I’ll elaborate on these:
Finding research leads
I think that Bostrom’s taste was extremely important for FHI. There are a couple of levels this was true on:
Cutting through unimportant stuff in seminars
I think it’s very easy for people, in research, to get fixated on things that don’t really matter. Sometimes this is just about not asking often enough which questions are the really important ones (or not being good enough at answering that); sometimes it’s kind of performative, about people trying to show off how cool their work is
Nick had low tolerance for this, as well as excellent taste. He wasn’t afraid to be a bit disagreeable in trying to get to the heart of things
This had a number of benefits:
Helping discussions in seminars to be well-focused
Teaching people (by example) how to do the cut-through-the-crap move
Shaping incentives for researchers in the institute, towards tackling the important questions head on
Gatekeeping access to the space
Bostrom was good at selecting people who would really contribute in this environment
This wasn’t always the people who were keenest to be there; and saying “no” to people who would add a little bit but not enough (and dilute things) was probably quite important
In some cases this meant finding outsiders (e.g. professors elsewhere) to visit, and keeping things intellectually vibrant by having discussions with people with a wide range of current interests and expertise, rather than having FHI just become an echo chamber
Being a beacon
Nick had a lot of good ideas, which meant that people were interested to come and talk to him, or give seminars, etc.
If you want something to really thrive, at some point you’re going to have to wrestle with who is providing these functions. I think that one thing you could do is to start with this piece. Rather than think about “who are all the people who might be part of this? does that sound like critical mass?”, start by asking “who are the people who could be providing these core functions?”. I’d guess if you brainstorm names you’ll end up with like 10-30 that might be viable (if they were interested). Then I’d think about trying to approach them to see if you can persuade one or more to play this role. (For one thing, I think this could easily end up with people saying “yes” who wouldn’t express interest on the current post, and that could help you in forming a strong nucleus.)
I say “leads” rather than “lead” because it seems to me decently likely that you’re best aiming to have these responsibilities be shared over a small fellowship. (I’m not confident in this.)
Your answer might also be “I, Oliver, will play this role”. My gut take would be excited for you to be like one of three people in this role (with strong co-leads, who are maybe complementary in the sense that they’re strong at some styles of thinking you don’t know exactly how to replicate), and kind of weakly pessimistic about you doing it alone. (It certainly might be that that pessimism is misplaced.)
Starting small as-a-proportion-of-time
Generally, things start a bit small, and then scale up. People can be reluctant to make a large change in life circumstance (like moving job or even city) for something where it’s unknown what the thing they’re joining even is. By starting small you get to iron out kinks and then move on from there.
Given that you have the campus, I’d seriously consider starting small not as-a-number-of-people but as-a-proportion-of-time. You might not have the convening power to get a bunch of great people to make this their full-time job right now (especially if they don’t have a good sense of who their colleagues will be etc.). But you probably do have the convening power to get a bunch of great people to show up for a week or two, talk through big issues, and spark collaborations.
I think that you could run some events like this. Maybe to start they’re just kind of like conferences / workshops, with a certain focus. (I’d still start by trying to find something like “research leads” for the specific events, as I think it would help convening power as well as helping the things to go well.) In some sense that might be enough for carrying forward the spirit of FHI—it’s important that there are spaces for it, not that these spaces are open 365. But if it goes well and they seem productive, it could be expanded. Rather than just “research weeks”, offer “visiting fellowships” where people take a (well-paid) 1-3 month sabbatical from their regular job to come and be in person all at the same time. And then if that’s going well consider expanding to a permanent research group. (Or not! Perhaps the ephemeral nature of short-term things, and constantly having new people, would prove even more productive.)