It’s a fair point that wisdom might not be straightforwardly safety-increasing. If someone wanted to explore e.g. assumptions/circumstances under which it is vs isn’t, that would certainly be within scope for the competition.
owencb (Owen Cotton-Barratt)
Multiple entries are very welcome!
[With some kind of anti-munchkin caveat. Submitting your analyses of several different disjoint questions seems great; submitting two versions of largely the same basic content in different styles not so great. I’m not sure exactly how we’d handle it if someone did the latter, but we’d aim for something sensible that didn’t incentivise people to have been silly about it.]
Essay competition on the Automation of Wisdom and Philosophy — $25k in prizes
Thanks, yes, I think that you’re looking at things essentially the same way that I am. I particularly like your exploration of what the inner motions feel like; I think “unfixation” is a really good word.
I think that for most of what I’m saying, the meaning wouldn’t change too much if you replaced the word “wholesome” with “virtuous” (though the section contrasting it with virtue ethics would become more confusing to read).
As practical guidance, however, I’m deliberately piggybacking off what people already know about the words. I think the advice to make sure that you pay attention to ways in which things feel unwholesome is importantly different from (and, I hypothesize, more useful than) advice to make sure you pay attention to ways in which things feel unvirtuous. And the advice to make sure you pay attention to things which feel frobby would obviously not be very helpful, since readers will not have much of a sense of what feels frobby.
AI strategy given the need for good reflection
If you personally believe it to be wrong, it’s unwholesome. But generically no. See the section on revolutionary action in the third essay.
I think this is essentially correct. The essays (especially the later ones) do contain some claims about ways in which it might or might not be useful; of course I’m very interested to hear counter-arguments or further considerations.
Beyond Maxipok — good reflective governance as a target for action
The most straightforward criterion would probably be “things they themselves feel to be mistakes a year or two later”. That risks people just failing to own their mistakes so would only work with people I felt enough trust in to be honest with themselves. Alternatively you could have an impartial judge. (I’d rather defer to “someone reasonable making judgements” than try to define exactly what a mistake is, because the latter would cover a lot of ground and I don’t think I’d do a good job of it; also my claims don’t feel super sensitive to how mistakes are defined.)
I would certainly update in the direction of “this is wrong” if I heard a bunch of people had tried to apply this style of thinking over an extended period, I got to audit it a bit by chatting to them and it seemed like they were doing a fair job, and the outcome was they made just as many/serious mistakes as before (or worse!).
(That’s not super practically testable, but it’s something. In fact I’ll probably end up updating some from smaller anecdata than that.)
I definitely agree that this fails as a complete formula for assessing what’s good or bad. My feeling is that it offers an orientation that can be helpful for people aggregating stuff they think into all-things-considered judgements (and e.g. I would in retrospect have preferred to have had more of this orientation in the past).
If someone were using this framework to stop thinking about things that I thought they ought to consider, I couldn’t be confident that they weren’t making a good faith effort to act wholesomely, but I at least would think that their actions weren’t wholesome by my lights.
Good question, my answer on this is nuanced (and I’m kind of thinking it through in response to your question).
I think that what feels to you to be wholesome will depend on your values. And I’m generally in favour of people acting according to their own feeling of what is wholesome.
On the other hand I also think there would be some choices of values that I would describe as “not wholesome”. These are the ones which ignore something of what’s important about some dimension (perhaps justifying ignoring it by saying “I just don’t value this”), at least as felt-to-be-important by a good number of other people in society.
But although “avoiding unwholesomeness” provides some constraints on values, it’s not specifying exactly what values or tradeoffs are good to have. And then for any among the range of possible wholesome values, when you come to make decisions acting wholesomely will depend on your personal values. (Or, depending on the situation, perhaps not; in the case of the business plan, if it’s supposed to be for the sake of the local community then what is wholesome could depend a lot more on the community’s values than on your own.)
So there is an element of “paying at least some attention to traditional values” (at least while fair numbers of people care about them), but it’s definitely not trying to say “optimize for them”.
I doubt this is very helpful for our carefully-considered ethical notions of what’s good.
I think it may be helpful as a heuristic for helping people to more consistently track what’s good, and avoid making what they’d later regard as mistakes.
I agree that “paying attention to the whole system” isn’t literally a thing that can be done, and I should have been clearer about what I actually meant. It’s more like “making an earnest attempt to pay attention to the whole system (while truncating attention at a reasonable point)”. It’s not that you literally get to attend to everything, it’s that you haven’t excluded some important domain from things you care about. I think habryka (quoting and expanding on Ben Pace’s thoughts) has a reasonable description of this in a comment.
I definitely don’t think this is just making an arbitrary choice of what things to value, or that it’s especially anchored in traditional values (though I do think it’s correlated with traditional values).
I discuss a bit about making the tradeoffs of when to stop giving things attention in the section “wholesomeness vs expedience” in the second essay.
I think that there is some important unwholesomeness in these things, but that isn’t supposed to mean that they’re never permitted. (Sorry, I see how it could give that impression; but in the cases you’re discussing there would often be greater unwholesomeness in not doing something.)
I discuss how I think my notion of wholesomeness intersects with these kind of examples in the section on visionary thought and revolutionary action in the third essay.
I think that there’s something interesting here. One of the people I talked about this with asked me why children seem exceptionally wholesome (it’s certainly not because they’re unusually good at tracking the whole of things), and I thought the answer was about them being a part of the world where it may be especially important to avoid doing accidental harm, so our feelings of harms-to-children have an increased sense of unwholesomeness. But I’m now thinking that something like “robustly not evil” may be an important part of it.
Now we can trace out some of the links between wholesomeness1 and wholesomeness2. If evil is something like “consciously disregarding the impacts of your actions on (certain) others”, then wholesomeness1 should robustly avoid it. And failures of wholesomeness1 which aren’t evil might still be failures of wholesomeness2, because they involve a failure to attend to some impacts of actions, while observers may not be able to tell whether that failure to attend was accidental or deliberate.
A couple more notes:
I don’t think that wholesomeness2 is a crisp thing—it’s dependent on the audience, and how much they get to observe. Someone could have wholesomeness2 in a strong way with respect to one audience, and really not with respect to another audience.
I think in expectation / in the long run / as your audiences get smarter (or something), pursuing wholesomeness1 may be a good proxy for wholesomeness2, basically for the kind of reasons discussed in Integrity for consequentialists.
FWIW I quite like your way of pointing at things here, though maybe I’m more inclined towards letting things hang out for a while in the (conflationary?) alliance space to see which seem to be the deepest angles of what’s going on in this vicinity, and doing more of the conceptual analysis a little later.
That said, if someone wanted to suggest a rewrite I’d seriously consider adopting it (or using it as a jumping-off point); I just don’t think that I’m yet at the place where a rewrite will flow naturally for me.
I largely think that the section of the second essay on “wholesomeness vs expedience” is also applicable here.
Basically I agree that you sometimes have to not look at things, and I like your framing of the hard question of wholesomeness. I think that the full art of deciding when it’s appropriate to not think about something would be better discussed via a bunch of examples, rather than trying to describe it in generalities. But the individual decisions are ones that you can make wholesomely or not, and I think that’s my current best-guess approach for how to handle this. Setting something aside, when it feels right to do so, with some sadness that you don’t get to get to the bottom of it, feels wholesome. Blithely dismissing something as not worth attention typically feels unwholesome, because of something like a missing mood (and relatedly, it not being clear that you’re attending enough to notice if it were worth more attention).
There’s also a question about how this relates to social reality. I think that if you’re choosing not to look at something because it doesn’t feel like it’s worth the attention, then if someone else raises it (because it seems important to them) it’s natural to engage with some curiosity that you now—for the space of the conversation—get to look at the thing a bit. You may explain why you don’t normally think about it, but you’re not actively trying to suppress it. I think the more unwholesome versions of not looking at something are much more likely to try to actively avoid or shut the conversation down.
I think FHI was an extremely special place and I was privileged to get to spend time there.
I applaud attempts to continue its legacy. However, I’d feel gut-level more optimistic about plans grounded in thinking about how circumstances are different now, and then attempting to create the thing that is live and good given that, rather than plans attempting to copy FHI as closely as possible.
Differences in circumstance
You mention not getting to lean on Bostrom’s research taste as one driver of differences; I think this is correct, but it may be worth tracing out the implications even at the early planning stage. Other things that seem salient and important to me:
- For years, FHI was one of the only places in the world where you could seriously discuss many of these topics.
  - There are now much bigger communities and other institutions where these topics are at least culturally permissible (and some of them, e.g. AI safety, are the subject of very active work).
  - This means that:
    - One of FHI’s purposes was serving a crucial niche which is now less undersupplied.
    - FHI benefited from being the obvious Schelling location to go to think about these topics.
      - Whereas even in Berkeley you want to think a bit about how you sit in the ecosystem relative to Constellation (which I think has some important FHI-like virtues, although it makes different tradeoffs and misses on others).
- FHI benefited from the respectability of being part of the University.
  - In terms of getting outsiders to take it seriously, getting meetings with interesting people, etc.
  - I’m not saying this was crucial for its success, and in any case the world looks different now; but I think it had some real impact and is worth bearing in mind.
- As you mention—you have a campus!
  - I think it would be strange if this didn’t have some impact on the shape of plans that would be optimal for you.
Pathways to greatness
If I had to guess about the shape of plans that I think you might engage in that would lead to something deserving of the name “FHI of the West”, they’re less like “poll LessWrong for interest to discover if there’s critical mass” (because I think that whether there’s critical mass depends a lot on people’s perceptions of what’s there already, and because many of the people you might most want probably don’t regularly read LessWrong), and more like thinking about pathways to scale gracefully while building momentum and support.
When I think about this, two ideas that seem to me like they’d make the plan more promising (that you could adopt separately or in conjunction) are (1) starting by finding research leads, and/or (2) starting small as-a-proportion-of-time. I’ll elaborate on these:
Finding research leads
I think that Bostrom’s taste was extremely important for FHI. There are a couple of levels this was true on:
- Cutting through unimportant stuff in seminars
  - I think it’s very easy for people, in research, to get fixated on things that don’t really matter. Sometimes this is just about not asking often enough what the really important questions are (or not being good enough at answering that); sometimes it’s kind of performative, with people trying to show off how cool their work is.
  - Nick had low tolerance for this, as well as excellent taste. He wasn’t afraid to be a bit disagreeable in trying to get to the heart of things.
  - This had a number of benefits:
    - Helping discussions in seminars to be well-focused
    - Teaching people (by example) how to do the cut-through-the-crap move
    - Shaping incentives for researchers in the institute, towards tackling the important questions head-on
- Gatekeeping access to the space
  - Bostrom was good at selecting people who would really contribute in this environment.
  - This wasn’t always the people who were keenest to be there; and saying “no” to people who would add a little bit but not enough (and dilute things) was probably quite important.
  - In some cases this meant finding outsiders (e.g. professors elsewhere) to visit, keeping things intellectually vibrant by having discussions with people with a wide range of current interests and expertise, rather than having FHI just become an echo chamber.
- Being a beacon
  - Nick had a lot of good ideas, which meant that people were interested to come and talk to him, or give seminars, etc.
If you want something to really thrive, at some point you’re going to have to wrestle with who is providing these functions. I think that one thing you could do is to start with this piece. Rather than think about “who are all the people who might be part of this? does that sound like critical mass?”, start by asking “who are the people who could be providing these core functions?”. I’d guess if you brainstorm names you’ll end up with like 10-30 that might be viable (if they were interested). Then I’d think about trying to approach them to see if you can persuade one or more to play this role. (For one thing, I think this could easily end up with people saying “yes” who wouldn’t express interest on the current post, and that could help you in forming a strong nucleus.)
I say “leads” rather than “lead” because it seems to me decently likely that you’re best aiming to have these responsibilities be shared over a small fellowship. (I’m not confident in this.)
Your answer might also be “I, Oliver, will play this role”. My gut take would be excited for you to be like one of three people in this role (with strong co-leads, who are maybe complementary in the sense that they’re strong at some styles of thinking you don’t know exactly how to replicate), and kind of weakly pessimistic about you doing it alone. (It certainly might be that that pessimism is misplaced.)
Starting small as-a-proportion-of-time
Generally, things start a bit small, and then scale up. People can be reluctant to make a large change in life circumstance (like moving job or even city) for something where it’s unknown what the thing they’re joining even is. By starting small you get to iron out kinks and then move on from there.
Given that you have the campus, I’d seriously consider starting small not as-a-number-of-people but as-a-proportion-of-time. You might not have the convening power to get a bunch of great people to make this their full time job right now (especially if they don’t have a good sense who their colleagues will be etc.). But you probably do have the convening power to get a bunch of great people to show up for a week or two, talk through big issues, and spark collaborations.
I think that you could run some events like this. Maybe to start they’re just kind of like conferences / workshops, with a certain focus. (I’d still start by trying to find something like “research leads” for the specific events, as I think it would help convening power as well as helping the things to go well.) In some sense that might be enough for carrying forward the spirit of FHI—it’s important that there are spaces for it, not that these spaces are open 365. But if it goes well and they seem productive, it could be expanded. Rather than just “research weeks”, offer “visiting fellowships” where people take a (well-paid) 1-3 month sabbatical from their regular job to come and be in person all at the same time. And then if that’s going well consider expanding to a permanent research group. (Or not! Perhaps the ephemeral nature of short-term things, and constantly having new people, would prove even more productive.)