My model is that CFAR is doing the same activity it was always doing, which one may or may not want to call “research”.
I’ll describe that activity here. I think it is via this core activity (plus accidental drift, or accidental hill-climbing in response to local feedbacks) that we have generated both our explicit curriculum, and a lot of the culture around here.
Components of this core activity (in no particular order):
We try to teach specific skills to specific people, when we think those skills can help them. (E.g. goal-factoring; murphyjitsu; calibration training on occasion; etc.)
We keep our eyes open while we do #1. We try to notice whether the skill does/doesn’t match the student’s needs. (E.g., is this so-called “skill” actually making them worse at something that we or they can see? Is there a feeling of non-fit suggesting something like that? What’s actually happening as the “skill” gets “learned”?)
We call this noticing activity “seeking PCK” (pedagogical content knowledge) and spend a bunch of time developing it in our mentors and instructors.
We try to stay in touch with some of our alumni after the workshop, and to notice what the long-term impacts seem to be. (Are they actually practicing our so-called “skills”? Does it help when they do? More broadly, what changes do we just-happen-by-coincidence to see in multiple alumni again and again, and are these positive or negative changes, and what might be causing them?)
In part, we do this via the four follow-up calls that participants receive after they attend the mainline workshop; in part we do it through the alumni reunions, the kind of contact that comes naturally from being in the same communities, etc.
We often describe some of what we think we’re seeing, and speculate about where to go given that, in CFAR’s internal colloquium.
We pay particular attention to alumni who are grappling with existential risk or EA, partly because it seems to pose distinct difficulties that it would be nice if someone found solutions to.
We spend a bunch of time with people who are succeeding at technical AI safety work, trying to understand what skills go into that. We also spend a bunch of time with people who are training to do technical AI safety work (often at the same time that people who can actually do such work are there), working to help transfer useful mindset (while also trying to pay attention to what’s happening).
Right now we do this mostly at the AIRCS and MSFP workshops.
We spend a bunch of time engaging smart new people to see what skills/mindsets they would add to the curriculum, so we don’t get too stuck in a local optimum.
What this looks like recently:
The instructor training workshops are helping us with this. Many of us found those workshops pretty generative, and are excited about the technique-seeds and cultural content that the new instructor candidates have been bringing.
The AIRCS program has also been bringing in highly skilled computer scientists, often from outside the rationality and EA community. My own thinking has changed a good bit in contact with the AIRCS experience. (They are less explicitly articulate about curriculum than the instructor candidates; but they ask good questions, buy some pieces of our content, get wigged out by other pieces of our content in a non-random manner, answer follow-up questions in ways that sometimes reveal implicit causal models of how to think that seem correct to me, etc. And so they are a major force for AIRCS curriculum generation in that way.)
Gaps in #5:
I do wish we had better contact with more and varied highly productive thinkers/makers of different sorts, as a feed-in to our curriculum. We unfortunately have no specific plans to fix this gap in 2020 (and I don’t think it could fit without displacing some even-more-important planned shift—we have limited total attention); but it would be good to do sometime over the next five years. I keep dreaming of a “writers’ workshop” and an “artists’ workshop” and so on, aimed at seeing how our rationality stuff mutates when it hits people with different kinds of visibly-non-made-up productive skill.
We sometimes realize that huge swaths of our curriculum are having unwanted effects, and try to change them. We sometimes realize that our model of “the style of thinking we teach” is out of touch with our best guesses about what’s good, and try to change it.
We try to study any functional cultures that we see (e.g., particular functional computer science communities; particular communities found in history books), to figure out what magic was there. We discuss this informally and with friends and with e.g. the instructor candidates.
We try to figure out how thinking ever correlates with the world, and when different techniques make this better or worse in different contexts. And we read the Sequences to remember that this is what we’re doing.
We could stand to do more of this one; increasing it is a core planned shift for 2020. But we’ve always done it some, including over the last few years.
The “core activity” exemplified in the above list is, of course, not RCT-style, verifiable-track-record social science (which is one common meaning of “research”). There is a lot of merit to that kind of verifiable social science, but also a lot of slowness to it, and I cannot imagine using it to design the details of a curriculum, although I can imagine using it to check whether a curriculum has particular high-level effects.
We also still do some (though not as much as we wish we could) actual data-tracking, and have plans to do modestly more of it over the coming year. I expect this planned modest increase will be useful for our broader orientation but not much of a direct feed-in to curriculum, although it might help us tweak certain knobs upward or downward a little.