I’m happy to report that with your help, CFAR can make 2018 by far the highest-impact year we’ve ever had.
According to our new data, a disproportionate amount of our impact comes from running special programs like the AI Summer Fellows Program, compared to mainline workshops. These special programs tend to be more expensive because they are more difficult and time-consuming to run, and because they bring in less revenue.¹
But we’re quite hopeful because we’re now on the cusp of being able to break our single biggest bottleneck to running these programs by securing a permanent workshop venue.
Click here if you want the venue details; otherwise, here’s the headline (see farther below for a justification of the following figures). We will save enough money and time by having a permanent venue that we estimate we can expand our participant throughput by around 33% and roughly double our impact per participant (to about 2.1 times), bringing our estimated overall impact in 2018 to about 2.75 times (275% of) our historical yearly average.
Ideal 2018 Schedule
Concretely, here is a rough draft of a schedule of CFAR programs that would optimize for impact according to our metrics, given acquisition of a permanent venue:
- 2 Mainline Workshops
- 1 AI Summer Fellows Program (AISFP) — for preparing participants for contributing as technical researchers on AI alignment
- 7 Workshops for AI Safety Strategy (WAISS) — for folks who want to impact AI Safety but aren’t interested in doing original technical research
- 9 Advanced Alumni Workshops (e.g. Tier 2 workshops, mentor training, or similar) — to enable taking the content further and deeper, which our data indicate strongly predicts more impact
- 3 X-Risk Modeling workshops or similar
- 1 Alumni Reunion — to promote a vibrant community that attracts and supports new, high-impact alumni
- 1 Instructor Training Series — Four linked weekend workshops, plus guided student teaching at mainline workshops, WAISS, Hamming, and others
- 3 pilot programs that may continue in future years, or have their content folded into other programs if the metrics indicate they are sufficiently high impact. For example, we previously ran CFAR for Machine Learning (CML), and believe it may be useful, but we don’t have tracking metrics for it (yet!)
- Many test sessions and experimental programs to continue the development of the art
A quick breakdown of why I believe an estimated overall impact increase of ~275% is justified:
- Our data indicate that special programs have about 3.3 times as much impact as mainline workshops.²
- Historically, only about 20% of our programs are special programs. In the ideal 2018 scenario listed above, about 89% of the programs would be special programs.³
- A rough estimate is that this mix yields about 2.07 times⁴ the impact per program of previous years (the ~207% figure).⁵
- In 2017, CFAR increased throughput by 61%, increasing from 167 participants in 2016 to 275 in 2017. In 2018 we’re poised to increase that number by another 33%, running 16 programs compared with 12 in 2017.⁶ ⁷
- 2.07 times the average impact per participant × 1.33 times the participant throughput ≈ 2.75 times the overall impact of 2017 (the 275% figure).
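For readers who want to check the arithmetic, here is a minimal Python sketch of the estimate above. It uses only the figures quoted in this post (the 3.3x multiplier, the 20% and 89% special-program shares, and 16 versus 12 programs); the function and variable names are illustrative, not part of any CFAR tooling. With these rounded inputs the sketch lands slightly above the quoted 3.02, 2.07, and 2.75, presumably because the quoted figures were computed from unrounded shares.

```python
# Figures quoted in the post; names are illustrative assumptions.
SPECIAL_MULTIPLIER = 3.3  # impact of a special program relative to a mainline workshop


def mix_multiplier(special_share: float, multiplier: float = SPECIAL_MULTIPLIER) -> float:
    """Average impact per program, in units of one mainline workshop."""
    mainline_share = 1.0 - special_share
    return mainline_share * 1.0 + special_share * multiplier


historical = mix_multiplier(0.20)   # 20% special programs historically
ideal_2018 = mix_multiplier(0.89)   # 89% special programs in the ideal schedule
per_program_ratio = ideal_2018 / historical

throughput_ratio = 16 / 12          # programs for new participants, 2018 vs. 2017
overall_ratio = per_program_ratio * throughput_ratio

print(f"historical mix:    {historical:.2f}x a mainline-only year")
print(f"ideal 2018 mix:    {ideal_2018:.2f}x a mainline-only year")
print(f"per-program ratio: {per_program_ratio:.2f}")
print(f"overall ratio:     {overall_ratio:.2f}")
```

Changing `SPECIAL_MULTIPLIER` or the shares shows how sensitive the headline figure is to each assumption.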
Ideal 2018 Budget
First, I’ll talk about the world in which we don’t get the permanent venue.
This world is strictly worse: the cost structure, payroll expense, and operations bottlenecks in 2018 would be basically identical to 2017. That means we could likely expect similar impact and throughput next year as we had this year. We made great strides this year (with throughput in particular), but I would be disappointed if we couldn’t break our main bottleneck by acquiring the venue.
Without a venue we would run fewer workshops than would be ideal because of operations constraints. The success of this fundraiser will determine the final mix of possible programs, but without a venue we would certainly run fewer special programs for both operational and budgetary reasons.
Now, the ideal world:
There is a possible version of CFAR that increases its impact almost threefold over the next year by running lots of special programs. Assuming we acquire a permanent venue, this would cost:
- $640k for payroll
- $130k for office space and supplies
- $380k for programs (this is a marginal expense, so it doesn’t include the cost of staff time)⁸
- Total: ~$1.2 million (an overall reduction in budget of about 30% from last year)
We submitted a proposal for an institutional grant of at least $400k that we’re optimistic about receiving. Assuming we do receive that, plus make around $60k in workshop profits from the two mainline workshops, we would still need $740k in individual donations to make this “optimal CFAR” a reality.
If we simply meet par relative to last year’s fundraiser, by default we’ll be about $340k short. That’s why this year’s fundraiser is so important for CFAR.
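The funding math above can be laid out as a short Python sketch. All amounts are the ones quoted in this post, in thousands of dollars; the variable names are illustrative, and `implied_par` is an inference from the quoted $340k shortfall rather than a figure stated here.

```python
# All amounts in thousands of US dollars, as quoted in the post.
costs = {
    "payroll": 640,
    "office space and supplies": 130,
    "programs": 380,
}
total_cost = sum(costs.values())   # 1150, which the post rounds to ~$1.2 million

budget = 1200                      # rounded total used for the funding math
institutional_grant = 400          # proposal submitted, not yet confirmed
mainline_profit = 60               # from the two mainline workshops

# Donations needed once the grant and workshop profits are counted.
donations_needed = budget - institutional_grant - mainline_profit

shortfall_at_par = 340             # quoted gap if donations match last year
# Inferred, not stated in the post: what "par" would have to be.
implied_par = donations_needed - shortfall_at_par

print(f"itemized costs:   ${total_cost}k")
print(f"donations needed: ${donations_needed}k")
print(f"implied par:      ${implied_par}k")
```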
The further from our target we are, the fewer special programs we would be able to run; the only way to fund special programs would be with profits from mainline workshops. In this scenario, I estimate we would run about nine mainline workshops, which would fund a much smaller number of special programs, reducing our overall 2018 impact by around 45%.⁹
Bottom line: it seems that if we raise 28% more money than we did last year, we would avoid the roughly 45% loss of impact we would face in the world in which we raised the same amount, and we would deliver about 2.75 times the impact we achieved last year.
But we need your help. Thanks to better metrics, our growing ability to run highly impactful specialized workshops, and the prospect of acquiring a permanent venue, we are unusually well-positioned this year to turn money into impact.
So: if you want to help make the ideal 2018 CFAR a reality,
Thank you for reading. Please feel free to ask me any questions you have at firstname.lastname@example.org!
All the best,
Special programs have most often needed to be free to participants to attract high-caliber participants regardless of their ability to pay ↩
This estimate is quite conservative, as it assumes at most a 100x difference in impact between the least impactful alum and the most impactful. In fact, we estimate that for some alumni the difference is several orders of magnitude larger than that. ↩
It’s worth noting that I’ve conflated proven-higher-impact programs like AISFP and WAISS with yet-unproven programs like the x-risk modelling workshop, which we assume is in a similar reference class. Our best guess is that these programs will actually be marginally lower impact than the others, perhaps by about a factor of 3 instead of a factor of 3.3. When we actually ran the numbers with these programs as a separate class, our estimated increased impact per participant went from 207% to 205%. Realistically this difference is well within the margin of error, which makes the difference negligible. So instead of carrying the more complex calculation all the way through, we stuck with the practically identical figure that we note is a little fudgey. ↩
Average impact of mainline workshops ⨉ the 80% of programs that were mainline in 2017 + average impact of special programs (3.3x that of mainline workshops) ⨉ 20% of programs that were special in 2017 = 1.46 times as much impact in 2017 as we would have had had we run just mainline workshops.
Average impact of mainline workshops ⨉ the 11% of programs that would be mainline in 2018 + average impact of special programs (3.3x that of mainline workshops) ⨉ 89% of programs that would be special in 2018 = 3.02 times as much impact in 2018 as we would have if we ran just mainline workshops.
So, in 2018 we expect to have 2.07 times (3.02 / 1.46) as much impact per workshop as in 2017 by running more special programs. ↩
The estimate here is hard to make because there are lots of factors that aren’t easy to reason about concretely. One might expect, for example, that the 7th WAISS of the year would have less impact on the margin than the first WAISS, because we would already have put the best candidates through the earlier ones.
But we doubt this will have a significant impact. For one thing, it’s not at all obvious ahead of time which WAISS candidates have the highest EV. We’re selecting from a high dimensional space when we’re looking for people who might have a non-technical impact on AI alignment, and we don’t need to be as selective on any given dimension as we are for SPARC, for example.
Also, if WAISS is like other CFAR programs historically, we’ll get substantially better at running them the more we run, increasing the marginal impact of programs independent of the mean participant EV.
So while it’s possible that by WAISS 7 we’ll have run out of the highest EV candidates, it’s also possible that calendar constraints will more evenly distribute them across time, and that CFAR getting better at running the program would offset a drop in participant quality.
Rather than being clever about it, I’m inclined to call this a wash and just say that at this scale, our special programs will have as much impact on the margin as they have had on average in the past. If you believe the potential decline in the marginal quality of the candidate pool over the year will be enough to overwhelm the other factors, then perhaps applying a factor of 0.8 or so to the impact estimates would be sensible. ↩
This only counts programs for people who have not been through any CFAR program before, so mainline workshops and AISFP count here but instructor training, for example, does not, even though instructor training very much counts for our impact metrics. ↩
The average number of participants per program is roughly the same across programs. Mainline workshops can, and often do, have more participants than special programs like AISFP or WAISS, but their size is also higher variance; the average is around 23 participants, which is similar to the largest special programs, which we more consistently fill closer to capacity.
One factor that could compromise our throughput is failure to do enough outreach, such that we fill fewer seats per program than we hope to. This is possible, and there’s precedent for it: it turned out to be a problem when we tried to run six workshops in a row earlier this year, yet filled only five of them. We hope we learned from our mistakes last time, but participant recruitment for CFAR is an Actually Hard™ problem, so it’s possible we’ll fail despite knowing of the problem ahead of time.
On the other hand, outreach for free programs like these special programs is significantly easier. ↩
This is a Fermi estimate based on 2017 costs for similar programs, less the cost of the short-term venues for those programs (but still including the amortized cost of the permanent venue), plus the additional cost of more participant scholarships made possible by the improved venue and operations. ↩
The workshop schedule ideal for maximizing impact would result in a net loss of $340k, which is why we’d like to raise that money. In the worst case, where we raise zero of that budget during this fundraiser, the breakdown of programs requires that the net be roughly $0. If we run nine mainline workshops with sufficiently limited scholarships to net $30k in profit each, then we’ll be able to afford to run AISFP once, WAISS twice, up to three small alumni workshops, and either one or two x-risk and pilot programs each. Overall we’d run about 13 fewer special programs (roughly 11 versus 24). Even accounting for the positive impact of the extra mainline workshops, we’d be looking at about 55% of the overall impact that the ideal 2018 could have had, according to our metrics. ↩
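As a rough cross-check of the 55% figure, here is a minimal Python sketch that weights each program by the multipliers quoted in this post. It assumes every special program counts equally at 3.3 times a mainline workshop, which is a simplification and not necessarily how the estimate above was produced; the names are illustrative.

```python
SPECIAL_MULTIPLIER = 3.3  # quoted impact of a special program vs. a mainline workshop


def total_impact(mainline: int, special: int, multiplier: float = SPECIAL_MULTIPLIER) -> float:
    """Total impact in units of one mainline workshop (simplifying assumption)."""
    return mainline * 1.0 + special * multiplier


ideal = total_impact(mainline=2, special=24)      # ideal 2018 schedule
fallback = total_impact(mainline=9, special=11)   # worst-case schedule described above
relative_impact = fallback / ideal

# Profit available to fund special programs in the worst case, in $k.
fallback_profit_k = 9 * 30

print(f"fallback impact relative to ideal: {relative_impact:.0%}")
print(f"mainline profit in the fallback:   ${fallback_profit_k}k")
```

Under these assumptions the fallback schedule comes out at a little over half the impact of the ideal one, in line with the estimate above.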