Explaining MOOC Completion and Retention in the Context of Student Intent

About five minutes into my conversation with Steve Kolowich, technology reporter for the Chronicle of Higher Education, I realized that I hadn't gotten the presentation right.
Steve had read my most recent paper, MOOC Completion and Retention in the Context of Student Intent, and he was trying to explain it to his audience of higher education professionals.
One of the challenges of writing academic work is communicating to multiple audiences. On the one hand, I'm trying to assure fellow researchers of the rigor of my analyses and provide them with detailed statistical evidence; on the other, I'm trying to make sure that my work is accessible to wide audiences (even if they skim some of the nerdy stuff). A short conversation with Steve convinced me that I had perhaps drifted too far in the direction of scholars. I can, however, rectify some of that here.
The paper, published in EDUCAUSE Review Online, uses data from pre-course surveys of students in nine HarvardX courses from 2013-2014. We asked students about their intentions vis-à-vis the course: whether they intended to complete the course and earn a certificate, complete part of the course, browse the course, or were unsure. I then used these data to address the question: what are MOOC completion rates among those who intended to complete?
To answer this question, I looked at each of the nine courses. For each course, I looked at how people answered the intention survey question, then calculated the percentage of students in each category in each class. Then I figured out the percentage within each stated intention group (complete, audit, browse, unsure) that earned a certificate. I then averaged these findings across the nine classes, so that classes with larger enrollments would not be weighted more highly than classes with small enrollments. For instance, across the nine courses I looked at, 58% of survey respondents intended to complete the course, and 22% of these folks actually earned a certificate. If we ran a 10th class, and you had to guess what a certification rate for intended-completers would be, in the absence of any other data, 22% would be a good choice.
One problem with this approach is that it's hard to quantify in your head. What is 22% of 58%? When we talked, Steve was interested not in course averages but in the entire group of nearly 80,000 students who took the survey, ignoring what class they were in. From this perspective, things become a little simpler. 79,525 people took the survey. Of all respondents, 56%, or 44,534, intended to complete. Why not 58%? Remember, I was using an unweighted course average, so the results are slightly different (one large course had a smaller percentage of people intending to complete). Of these 44,534 intended-completers, 19.5%, or 8,621, actually did so. Again, 19.5% is smaller than 22% because the largest of the nine classes had a lower completion rate.
Table 1: Percentage of Survey Respondents and Completion Rate by Stated Level of Intention

Stated Intention    % of Survey Respondents    # of Survey Respondents    % Complete    # Certs
All                 100                        79525                      13.3          10531
Intend-Complete     56                         44534                      19.5          8621
Intend-Audit        26                         20709                      4.9           1008
Intend-Browse       3                          2636                       3.7           96
Intend-Unknown      15                         11826                      6.9           806

From a technical perspective, I prefer the unweighted course averages because we are typically trying to predict the future at the course level (How many people in my course will complete?) rather than the individual level (What is the probability this person will complete?). But you can't do simple, aggregate number breakdowns this way, because of the nine sub-units, so Steve's suggestion is more intuitive.
Here's another problem with my analysis. Steve was looking for a natural comparison for the 19.5% number. He recognized that it should not be 5%, the average completion rate for all MOOC students, because survey respondents are different from non-respondents. His best guess was 13.3%. But that's not a good reference either, because the people with the 19.5% completion rate (intended completers) make up 56% of the people in the "13.3% group" (all survey respondents). Steve couldn't find the right reference group because I didn't provide it. Really, what you want is to compare everyone who intended to complete the course with everyone who didn't intend to complete the course. See below.
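To make the difference between the two summaries concrete, here is a minimal sketch in Python. The three courses and their counts are invented purely for illustration (they are not the HarvardX data), and the function names are my own. It shows how an unweighted course average and a pooled, student-level rate can diverge when the largest course has the lowest completion rate.

```python
# Hypothetical (course, intended_completers, certificates_earned) counts.
# These numbers are made up for illustration; they are NOT HarvardX data.
courses = [
    ("Course A", 1000, 300),
    ("Course B", 2000, 500),
    ("Course C", 7000, 1050),  # the largest course has the lowest rate
]


def unweighted_course_average(rows):
    """Mean of the per-course completion rates; every course counts equally."""
    rates = [certs / intended for _, intended, certs in rows]
    return sum(rates) / len(rates)


def pooled_rate(rows):
    """Completion rate over all students, ignoring which course they took."""
    total_intended = sum(intended for _, intended, _ in rows)
    total_certs = sum(certs for _, _, certs in rows)
    return total_certs / total_intended


print(f"Unweighted course average: {unweighted_course_average(courses):.1%}")
print(f"Pooled rate:               {pooled_rate(courses):.1%}")
```

With these made-up counts, the unweighted course average is 23.3% while the pooled rate is 18.5%, the same direction of gap as the 22% versus 19.5% in the real data.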

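Table 2: Percentage of Survey Respondents and Completion Rate by Intention to Complete

Stated Intention              % of Survey Respondents    # of Survey Respondents    % Complete    # Certs
All                           100                        79525                      13.3          10531
Intend Complete               56                         44534                      19.5          8621
Do Not Intend to Complete     44                         35171                      5.4           1910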
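
The "do not intend to complete" row in the table above can be reproduced by pooling the audit, browse, and unknown rows of Table 1. Here is a short sketch of that arithmetic in Python. The counts are copied from Table 1; the variable and function names are my own.

```python
# (survey respondents, certificates earned) per stated intention, from Table 1.
table1 = {
    "Intend-Complete": (44534, 8621),
    "Intend-Audit": (20709, 1008),
    "Intend-Browse": (2636, 96),
    "Intend-Unknown": (11826, 806),
}


def collapse(groups, names):
    """Pool several intention groups into one (respondents, certificates) pair."""
    respondents = sum(groups[name][0] for name in names)
    certs = sum(groups[name][1] for name in names)
    return respondents, certs


# Everyone who did not say they intended to complete the course.
others = [name for name in table1 if name != "Intend-Complete"]
n_not, certs_not = collapse(table1, others)
rate_not = certs_not / n_not

print(n_not, certs_not, f"{rate_not:.1%}")  # 35171 1910 5.4%
```

Collapsing the three smaller groups trades granularity for a single, natural reference point against the 19.5% figure.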

The problem was that in slicing people up into four groups, I had made the best comparison group too complex. In order to give more granular detail to scholarly colleagues and other nerds interested in the nitty-gritty, I made it hard to communicate a simple but meaningful comparison to broader audiences.
Thus, it's Steve's reporting about my article, rather than my actual article, that has what might be the best summary of findings from this research:

The overall completion rate among survey respondents was 13.3 percent.
Among those who had intended to complete the course, the rate was 19.5 percent.
Among those who had not intended to complete the course, it was 5.4 percent.

I wrote this post for two reasons. First, I wanted to publish these two tables somewhere. Second, I wanted to write a bit about these tensions of competing audiences and balancing complexity and simplicity. Basically, what I need to do is show all my papers to journalists before I publish them, and ask them to help me think through how they would make them accessible to wider audiences.