CFAR was founded with a two-part vision: to develop and train people in the art of rationality, and to have some of those people then be more likely to work effectively on the world’s most important problems. Last year, we decided to focus especially on our alums’ impact on the problem of AI risk.
The question of how well CFAR is succeeding at its mission can be broken down into two subquestions:
- What effect does CFAR have on people?
- What effect do those people have on the world, and in particular on reducing AI risk, that they wouldn’t otherwise have had were it not for their interaction with CFAR?
This post describes how we have been thinking about the second of these two questions, which we have been especially focused on over the past year.
A relatively straightforward way to approach this question is to look at what our alumni are up to, as we have done informally throughout CFAR’s history. However, this still leaves tricky questions to answer: how to assess the size of their impact on the world, how to guess at the counterfactual of what they would have been doing without CFAR’s influence, and how to pinpoint which aspects of their involvement with CFAR made a difference.
Over the past year and a half we have tried to look more systematically at what our alumni are up to, and to put more effort into investigating these tricky questions. We have done this, as described in more detail below, by evaluating the results of an impact survey sent to all CFAR alumni, and by conducting interviews with some alums who seem to us to be doing especially high-impact work.
Alumni Impact Survey
In May 2016, we sent a survey to our alumni in order to count how many of them had an increase in expected impact due to their involvement with CFAR. The survey asked whether they thought their positive impact on the world had increased as a result of their interactions with CFAR and, if so, what they were doing differently and how CFAR had contributed to the change. In June 2017 we sent out an updated version of the survey, which had more prompts for text responses.
For each person’s responses, we manually coded whether it seemed like 1) their current path was high-impact, 2) their current path was substantially better than their old path, and 3) CFAR played a significant role in this change. We counted someone as having an “increase in expected impact” (IEI) if they met all three criteria.
On the first criterion, a path could count as “high-impact” based on expected future impact rather than past impact, as long as the person seemed to be taking concrete steps towards having an impact. For donations, the minimum bar for what counted as “substantial” or “significant” was $5,000 to effective charities (or plans to give 5% of their income). On the first criterion we looked at whether they had given $5,000, on the second criterion at whether they had given an additional $5,000, and on the third criterion at whether they had given $5,000 more than they would have in expectation without interacting with CFAR.
A few additional alums were coded as having an IEI after conversations with them confirmed that they met the criteria.
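The coding rule above can be sketched in a few lines of Python. This is only an illustration of the three-criteria logic, not the procedure CFAR staff actually followed (the coding was done manually and involved judgment calls); the names `CodedResponse`, `has_iei`, and `code_donor` are hypothetical, and the sketch ignores the alternative "plans to give 5% of their income" bar.

```python
from dataclasses import dataclass

# Minimum donation bar stated in the post, in US dollars.
DONATION_BAR_USD = 5_000


@dataclass
class CodedResponse:
    """Hypothetical record of a coder's judgments about one survey response."""
    high_impact_path: bool       # criterion 1: current path is high-impact
    substantially_better: bool   # criterion 2: substantially better than old path
    cfar_significant_role: bool  # criterion 3: CFAR played a significant role


def has_iei(response: CodedResponse) -> bool:
    """An alum counts as having an IEI only if all three criteria are met."""
    return (
        response.high_impact_path
        and response.substantially_better
        and response.cfar_significant_role
    )


def code_donor(total_given: float,
               increase_over_old_path: float,
               increase_due_to_cfar: float) -> CodedResponse:
    """Apply the $5,000 bar to each criterion for a donation-based path.

    The three arguments are hypothetical inputs that a coder would have to
    estimate from the survey text.
    """
    return CodedResponse(
        high_impact_path=total_given >= DONATION_BAR_USD,
        substantially_better=increase_over_old_path >= DONATION_BAR_USD,
        cfar_significant_role=increase_due_to_cfar >= DONATION_BAR_USD,
    )
```

For example, under these assumptions a donor who gave $12,000 in total, $6,000 more than on their old path, but only $1,000 more than they would have given without CFAR, fails the third criterion and is not counted as having an IEI.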
Out of the 894 CFAR alumni that we had as of June 2017, we identified 159 who had an IEI. We coded the primary type of work for each person into one of the following categories:
- 19 - EA/AI Organizations (excluding MIRI and CFAR)
  - People who have worked with an organization such as CEA, 80k, Open Phil, GiveWell, DeepMind, OpenAI, or Effective Altruism Foundation
- 15 - MIRI
  - People who have worked at or collaborated with MIRI
- 13 - CFAR/ESPR/SPARC
  - People who have worked with CFAR, ESPR, or SPARC. (This includes CFAR participants who were later hired by CFAR, but does not include anyone who worked at CFAR before attending a CFAR program).
- 28 - Technical AI Safety Career Path
  - People who have made concrete steps toward pursuing a career in technical AI safety, but who do not yet have a job in the field or a track record of relevant work
- 27 - EA Career Path
  - People who have done EA-aligned work, or who are on a path to do EA-aligned work, which 1) is not technical AI safety work, and 2) does not primarily involve a current or past job at an EA organization
- 17 - Group Leaders
  - People who have led, or contributed significantly to, EA/rationality/AI meetup groups or events
- 32 - Donations
  - People who have donated money to EA organizations or causes
- 8 - Other
  - People who are contributing in other ways
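As a quick consistency check, the category counts above can be tallied against the stated totals. The dictionary below simply transcribes the figures from the list; the variable names are illustrative.

```python
# Counts of alumni with an IEI, by primary type of work (from the list above).
iei_by_category = {
    "EA/AI Organizations (excluding MIRI and CFAR)": 19,
    "MIRI": 15,
    "CFAR/ESPR/SPARC": 13,
    "Technical AI Safety Career Path": 28,
    "EA Career Path": 27,
    "Group Leaders": 17,
    "Donations": 32,
    "Other": 8,
}

TOTAL_ALUMNI = 894  # CFAR alumni as of June 2017

total_iei = sum(iei_by_category.values())
iei_share = total_iei / TOTAL_ALUMNI

print(f"{total_iei} of {TOTAL_ALUMNI} alumni ({iei_share:.1%}) were coded as having an IEI")
# prints: 159 of 894 alumni (17.8%) were coded as having an IEI
```

Note that this share is taken over all alumni, including those who did not respond to the survey, so it is not a rate among respondents.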
We also collected data on several predictor variables. Here are the strongest predictors of which CFAR participants had an IEI:
- Attended alumni workshops or volunteered with CFAR
  - 35% of the IEI group has either volunteered for CFAR or attended an alumni workshop since April 2015, compared with only 7% of the non-IEI group that took the survey
- Moved to the San Francisco Bay Area
  - 29% of the IEI group reported that they moved to the Bay Area due to CFAR vs. 7% of the non-IEI group
- Previously involved in EA/rationality community
  - 32% of people who we noted as being involved with the EA/rationality community before attending CFAR had an IEI vs. 14% of people who were not involved with the community
Size of Impact
Size of impact is very important, and is also difficult to assess (especially for people doing direct work). We made a first pass at rating the size of each IEI on a 0.1-10 scale (as 80,000 Hours does), and found that this improved our understanding a little, but not as much as we would have liked. Impact seems to differ by more than two orders of magnitude (e.g., compare a person who founded and runs a nonprofit with someone who donated $5,000 to that nonprofit). The question of whether to rate a person’s impact as “10” or “80” or “200” (on an uncapped version of the scale) would have a large effect on the total number and is also hard to answer (especially with the limited amount of information collected by the survey).
A few things that we found with the 0.1-10 scale for the size of IEI:
- People whose primary area of impact was through working with EA/AI Organizations or MIRI had the largest average size of IEI, while people whose primary area of impact was Donations or Other had the smallest average size of IEI.
- The three predictors of an IEI noted above (attending alumni events or volunteering with CFAR, moving to the SF Bay Area, and previous involvement with the EA/rationality community) continued to be strong predictors when looking at weighted IEI (which ranges from 0.1 for a small increase in expected impact to 10 for a very large increase in expected impact).
- When we look at weighted IEI, special AI-focused programs led to a much larger increase in impact than mainline workshops (roughly 3.3 times as much). These programs are the MIRI Summer Fellows Program (MSFP; the 2017 version was called AISFP), the Workshop on AI Safety Strategy (WAISS), and CFAR for Machine Learning researchers (CML). We did not have an easy way to determine which portion of a given person’s IEI was caused by which program they attended, so we ran the analysis in two different ways, which give estimates of 3.7x and 2.9x (which average to 3.3x):
  - Counting each time that a person attended a program as one data point (so that a single person would get counted 4 times if they attended 4 programs), and giving each instance of attending full credit for that person’s weighted IEI, a participant attending a special AI-focused program had 3.7x as much increase in impact as a participant attending a mainline workshop
  - Crediting each person’s IEI to just the first CFAR program they attended, a new participant attending a special AI-focused program had 2.9x as much increase in impact as a new person attending a mainline workshop
Two limitations of this survey analysis are worth noting:
- The variables that predict an IEI do not necessarily cause an increase in impact. For example, people who are especially excited about rationality and EA might be more likely to attend multiple CFAR events and move to the Bay Area, and also more likely to wind up having a high impact. There were similar issues on other questions (not reported here) which asked people which aspects of CFAR they found helpful.
- Roughly half the alumni didn’t fill out the impact survey, and many who took the survey did not give many details. About 1/4 of those who took the survey were counted as “no IEI” because we didn’t have enough information to tell whether they had an IEI or not (rather than because they clearly did not have an IEI).
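The two ways of attributing weighted IEI to programs, described above for the comparison of AI-focused programs with mainline workshops, can be sketched as follows. This is a minimal illustration under stated assumptions, not CFAR's actual analysis: the records are invented toy data, the function name `mean_weighted_iei` and the labels `"ai"` and `"mainline"` are hypothetical, and the sketch assumes that participants without an IEI enter the average with a weighted IEI of 0.

```python
from collections import defaultdict

# Toy data (invented): (person, weighted IEI, programs attended in order).
# A weighted IEI of 0.0 stands for "no IEI" (an assumption of this sketch).
alumni = [
    ("A", 5.0, ["ai", "mainline"]),
    ("B", 0.0, ["mainline"]),
    ("C", 1.0, ["mainline", "mainline"]),
    ("D", 3.0, ["ai"]),
    ("E", 2.0, ["mainline"]),
]


def mean_weighted_iei(records, first_program_only=False):
    """Average weighted IEI per program type.

    first_program_only=False: every attendance is one data point and gets
    full credit for the person's weighted IEI (the first method).
    first_program_only=True: a person's weighted IEI is credited only to
    the first program they attended (the second method).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for _person, weighted_iei, programs in records:
        credited = programs[:1] if first_program_only else programs
        for kind in credited:
            totals[kind] += weighted_iei
            counts[kind] += 1
    return {kind: totals[kind] / counts[kind] for kind in totals}


def ai_to_mainline_ratio(means):
    """How many times larger the AI-program average is than the mainline one."""
    return means["ai"] / means["mainline"]


per_attendance = ai_to_mainline_ratio(mean_weighted_iei(alumni))
first_program = ai_to_mainline_ratio(mean_weighted_iei(alumni, first_program_only=True))
headline = (per_attendance + first_program) / 2  # simple average of the two estimates
```

On the real data the two methods gave 3.7x and 2.9x, whose simple average is the 3.3x figure quoted above. The toy numbers here are not meant to reproduce those values, only to show that the two methods can disagree when people attend more than one program.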
Alumni Interviews & Profiles
In July 2017 we requested 20-minute interviews with 29 of our highest-impact alumni. We focused especially on people whose work was relevant (directly or indirectly) to existential risk, and for whom it seemed plausible that CFAR played a role in their impact. We excluded people whose impact primarily came from working at CFAR. We completed profiles of 22 of these alumni (excluding 2 who didn’t respond or didn’t want to be profiled, plus 5 for whom it seemed that CFAR didn’t much affect their path).
These profiles contain personal information about individual alums, so we aren’t sharing them publicly. For reference, they were essentially expanded and more structured versions of the case studies that we did a year ago, such as Benya Fallenstein’s case study.
Benya’s example case study:
Benya Fallenstein was the first major research hire that MIRI made after choosing to focus on technical research, and is the lead researcher on the agent foundations research agenda (one of MIRI’s two research agendas). Before coming to a CFAR workshop in July 2013, Benya had collaborated with MIRI on research while attending the University of Vienna. MIRI had discussed hiring her full-time, but she was very hesitant to do so because (for various hard-to-articulate reasons) the idea felt psychologically untenable to her. In a dialog with a CFAR staff member shortly after the July 2013 workshop, Benya was able to figure out why leaving her PhD program felt so bad (making use of CFAR techniques such as goal factoring). She then realized that these downsides were fixable, and made plans to come work for MIRI which she felt met her needs and seemed tenable. MIRI Executive Director Nate Soares attributes much of MIRI’s success in pivoting towards pure technical research to Benya’s influence, noting that in addition to her strong technical work she has played the primary role in creating MIRI’s research culture.
These 22 alumni include people working at MIRI, CEA, 80,000 Hours, Open Philanthropy Project, DeepMind’s safety team, the Center for Human-Compatible AI at UC Berkeley (CHAI), and LessWrong 2.0, as well as people who are working independently on high-impact projects, and some who are still early in their career who we expect to have impact in the future.
Broken down by how far along they are on their path towards impact:
- 10 were included based primarily on the impact of their current role or past work
- 12 were included based primarily on expected future work. Of these:
  - 4 also have substantial impact from their past work or current role
  - 2 have new accomplishments or jobs since we profiled them in August 2017, and now would be included in the “current role or past work” category
Broken down by cause area:
- 12 were included based on their impact on AI safety, 9 for technical work and 3 for non-technical work.
- 10 were included based on their impact on the EA community (e.g., work at 80k or LessWrong 2.0), which indirectly affects many cause areas including AI safety
About the Interviews & Profiles
The first (and briefer) portion of the interview was about the person’s impact on the world. If the person’s IEI was due to past or current work, we asked about their accomplishments and role (especially the highest-impact aspects of them). If their IEI was mainly due to our expectation of their future work, we asked about their plans and also about what they’d already done (even if their previous work wasn’t especially high-impact) in order to get a sense of where they were on the path towards having high impact.
Most of the interview focused on whether and how CFAR affected their path. We emphasized the counterfactual world framing, asking them questions about what they thought “counterfactual you” would be doing in a world where they hadn’t interacted with CFAR. In addition to being central to evaluating CFAR’s counterfactual impact, this framing also made people less likely to attribute their success to CFAR merely out of agreeableness, because under this framing the natural positive response is “Yes, counterfactual me would be doing valuable things” rather than “Yes, CFAR has made a difference in my life.”
We also asked them about non-CFAR influences on their trajectory. We hoped that this would help them maintain perspective in thinking about their trajectory as a whole (rather than zooming their attention in narrowly on the most CFAR-related parts), and we expected that it would help us get a sense of the size of CFAR’s role by comparing it with the size of other critical factors.
The interview format gave us the opportunity to ask them for specifics about their trajectory, in a way that is difficult to do on a survey. If they have found a particular CFAR technique useful, what have they used it for? If they got value from talking to people at their workshop, was there a specific conversation that stands out? These details made it easier for us to form our own impression of how much (or how little) of a role CFAR played for them, and also provided a great deal of information about which aspects of our programs have been most impactful.
Once the interviews were complete we compiled each person’s information into a structured, ~2 page profile, which we then sent to them to check for accuracy (which sometimes led to a few rounds of revisions). The CFAR staff were then able to look through the set of profiles, both 1) to get a subjective sense of the size of CFAR’s counterfactual impact on the world, and 2) to learn details about the pathways by which that impact occurred.
Using the Profiles to Assess Size of Impact
To get an overall subjective sense of CFAR’s counterfactual impact on the world, we separately considered the size of the alum’s impact and the size of CFAR’s role in enabling it.
In considering the size of each alum’s expected impact, we took a perspective similar to the one in 80k’s essay on talent gaps in the EA community. For example, for people who are staff members at EA orgs, we asked ourselves questions like:
- How central is this person’s role in the organization?
- How good a fit are they for that role?
- How replaceable are they?
- How large a donation would the org need to have received in order for them to be indifferent between the donation and adding this staff member?
We did not expect to come away with precise, easy-to-agree-on answers to these questions; the main point was to highlight the relevant considerations.
In considering the size of CFAR’s role, we asked ourselves questions like:
- What’s our best guess at what “counterfactual them” would be doing? What’s the most likely possibility? How likely is it that they would be doing roughly the same thing?
- How big does CFAR’s role seem, compared to the other influences on their trajectory?
- If they described increased effectiveness, how relevant do the changes seem to their work? Would they have a different job? Would there be noticeable differences in the quality of their work?
- If we had to assign a number from 0% to 100% for CFAR’s share of responsibility for the IEI, what would it be?
In general, we found that we had more uncertainty about the size of a person’s total expected impact than about the size of CFAR’s role in enabling it.
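One simple way to combine the two judgments discussed above (the size of an alum's expected impact and CFAR's 0% to 100% share of responsibility) is to multiply them. The sketch below illustrates that idea only; the post does not say that CFAR staff combined the two numbers this way, and the function name, the impact units, and the example figures are all hypothetical.

```python
def cfar_attributable_impact(alum_impact: float, cfar_share: float) -> float:
    """Impact credited to CFAR under a simple multiplicative assumption.

    alum_impact: subjective size of the alum's expected impact (arbitrary units).
    cfar_share: CFAR's share of responsibility, as a fraction from 0.0 to 1.0.
    """
    if not 0.0 <= cfar_share <= 1.0:
        raise ValueError("cfar_share must be between 0 and 1")
    return alum_impact * cfar_share


# Invented example profiles: (label, impact size, CFAR's share of responsibility).
example_profiles = [
    ("alum_1", 8.0, 0.25),
    ("alum_2", 2.0, 0.5),
]

total_attributable = sum(
    cfar_attributable_impact(impact, share) for _label, impact, share in example_profiles
)
```

Because both inputs are rough subjective estimates, any such product inherits their uncertainty and is better read as an aid to comparison than as a measurement.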
13 of the 22 people profiled seemed to us to have an increase in expected impact that is comparable to (or larger than) the value of a new hire at an EA org.
Using the Profiles to Assess Pathways of Impact
Here are some notable patterns that we saw in terms of features that were unusually common among these 22 alums:
- Previous interest in AI risk: all 12 of the people doing direct work in AI safety were aware of the topic prior to attending CFAR, and for 8 of the 12 it was a major focus (e.g. they had already collaborated with MIRI on some research).
- Promising incoming participants: most of the 22 high-impact alums are people who we already recognized as being especially promising when they were incoming participants.
- Special programs: for 4 of the 9 people doing technical AI safety work (all of whom were among the 13 people with the highest increase in expected impact), the most influential part of their involvement was a special AI-focused program. For context, about 1/9 of CFAR’s programs have included an AI safety focus.
- Many of the people who have been high impact have had repeated or extensive involvement with CFAR or the surrounding community, and this seemed to affect their trajectory.
Here are some common themes in the causal stories that alums gave about how their interactions with CFAR contributed to their impact:
- Thought more strategically about their options, which led to different career decisions.
  - For example, people described themselves as becoming more willing to consider unconventional options (like graduating from university early), more able to find clever ways to get the best of both worlds (switch to a new path while still getting the good things about the old path), more able to clearly see when their current path was stuck/unsatisfying and actively seek out alternatives, or more prone to seeking out conversations with people who could help them think through a decision.
- Gained increased comfort/groundedness/thoughtfulness when thinking about existential risk.
  - For example, it became easier to think separately about the three topics of how developments in AI are likely to proceed, the social dynamics around discussions about AI risk, and their own personal motivations for their career and life. Or: it became easier to take the possibility of existential risk seriously without feeling overwhelmed.
- Learned a rationality technique which they use regularly.
  - People described how their use of the technique has led to specific changes such as getting better at prioritizing which tasks to work on, understanding computer programs more deeply, or being more useful when their colleagues ask them for help.
- Improved at navigating their motivations.
  - For example, became more aware of what felt motivating, more able to sort through conflicting motivations, less inhibited by anxiety, or more viscerally motivated by the things they cared about.
- Became more involved in the EA or rationality communities.
  - For example, they became more excited about EA after talking with cool EAs at the workshop (which led to attending EA Global, which led to…), became friends with other workshop participants who are involved in EA or rationality, or found an EA-related job via people who they met at CFAR.
This way of using the profiles also has several limitations:
- The profiles contain detailed information about particular people’s lives, and our method of looking at them involved sensitive considerations of the sort that are typically discussed in places like hiring committees rather than in public. As a result, our analysis can’t be as transparent as we’d like and it is more difficult for people outside of CFAR to evaluate it or provide feedback.
- We might overestimate or underestimate the impact that a particular alum is having on the world. Risk of overestimation seems especially high if we expect the person’s impact to occur in the future. Risk of underestimation seems especially high if the person’s worldview is different from ours, in a way that is relevant to how they are attempting to have an impact.
- We might overestimate or underestimate the size of CFAR’s role in the alum’s impact. We found it easier to estimate the size of CFAR’s role when people reported career changes, and harder when they reported increased effectiveness or skill development. For example, the September 2016 CFAR for Machine Learning researchers (CML) program was primarily intended to help machine learning researchers develop skills that would lead them to be more thoughtful and epistemically careful when thinking about the effects of AI, but we have found it difficult to assess how well it achieved this aim.
- We only talked with a small fraction of alumni. Focusing only on these 22 alumni would presumably undercount CFAR’s positive effects. It could also cause us to miss potential negative effects: there may be some alums who counterfactually would have been doing high-impact work, but instead are doing something less impactful because of CFAR’s influence, and this methodology would tend to leave them out of the sample.
- This methodology is not designed to capture broad, community-wide effects which could influence people who are not CFAR alums. For example, one alum that we interviewed mentioned that, before attending CFAR, they benefited from people in the EA/rationality community encouraging them to think more strategically about their problems. If CFAR is contributing to the broader community’s culture in a way that is helpful even to people who haven’t attended a workshop, then that wouldn’t show up in these analyses or the IEI count.
- When attempting to shape the future of CFAR in response to these data, we risk overfitting to a small number of data points, or failing to adjust for changes in the world over the past few years which could affect what is most impactful for us to do.
We have been incorporating information from the impact survey and alumni profiles in our big picture thinking about CFAR’s mission, and in many of our specific plans such as what sorts of participants to seek out, what techniques to focus our development efforts on, and which skills to focus on in staff training.
A few broad trends in the data seem worth highlighting as especially relevant to CFAR’s mission:
- Special programs focused on AI safety, as well as participants who were already interested in AI safety at the time of the workshop, have been disproportionately likely to have high impact.
- Many of the profiled alums report that they have higher impact because of improvements related to navigating their motivations, thinking clearly and strategically, or other fundamental aspects of applied rationality.
- Many of the effects depend on ways in which CFAR is embedded in the broader EA and rationality communities.
CFAR’s executive director, Pete Michaud, has a post about our plans for the upcoming year which draws on this analysis.