Durham Research Online

Towards Designing Profitable Courses: Predicting Student Purchasing Behaviour in MOOCs

Alshehri, Mohammad and Alamri, Ahmed and Cristea, Alexandra I. and Stewart, Craig D. (2021) 'Towards Designing Profitable Courses: Predicting Student Purchasing Behaviour in MOOCs.', International Journal of Artificial Intelligence in Education, 31 (2). pp. 215-233.


Since their ‘official’ emergence in 2012 (Gardner and Brooks 2018), massive open online courses (MOOCs) have been growing rapidly. They offer low-cost education for both students and content providers; however, currently there is a very low level of course purchasing (less than 1% of the total number of enrolled students on a given online course opt to purchase its certificate). The most recent literature on MOOCs focuses on identifying factors that contribute to student success, completion level and engagement. One of the MOOC platforms’ ultimate targets is to become self-sustaining, enabling partners to create revenues and offset operating costs. Nevertheless, analysis of learners’ purchasing behaviour on MOOCs remains limited. Thus, this study aims to predict students’ purchasing behaviour, and therefore a MOOC’s revenue, based on the rich array of activity clickstream and demographic data from learners. Specifically, we compare how several machine learning algorithms, namely RandomForest, GradientBoosting, AdaBoost and XGBoost, can predict course purchasability using a large-scale data collection of 23 runs spread over 5 courses delivered by The University of Warwick between 2013 and 2017 via FutureLearn. We further identify the common representative predictive attributes that influence a learner’s certificate purchasing decisions. Our proposed model achieved promising accuracies, between 0.82 and 0.91, using only the time spent on each step. We further reached a higher accuracy of 0.83 to 0.95 by adding learner demographics (e.g. gender, age group, level of education, and country), which showed a considerable impact on the model’s performance. The outcomes of this study are expected to help design future courses and predict the profitability of future runs; they may also help determine what personalisation features could be provided to increase MOOC revenue.
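
The kind of classifier comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the data below are synthetic, the feature layout (one column of seconds per course step), the labelling rule, and all variable names are assumptions made for illustration, and XGBoost is left out because it lives in a separate package from scikit-learn. The synthetic labels are also roughly balanced, unlike the real data, where fewer than 1% of learners purchase.

```python
import numpy as np
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the FutureLearn dataset): one row per learner,
# one hypothetical "seconds spent on step i" column per course step.
n_learners, n_steps = 600, 8
time_on_step = rng.exponential(scale=120.0, size=(n_learners, n_steps))

# Assumed labelling rule for illustration only: learners who spend more
# total time are more likely to be marked as certificate purchasers.
total = time_on_step.sum(axis=1)
p_purchase = 1.0 / (1.0 + np.exp(-(total - total.mean()) / total.std()))
purchased = (rng.random(n_learners) < p_purchase).astype(int)

# Hold out a quarter of the learners, keeping the class ratio in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    time_on_step, purchased, test_size=0.25, random_state=0, stratify=purchased
)

# Three of the four ensemble methods named in the abstract, default settings.
models = {
    "RandomForest": RandomForestClassifier(random_state=0),
    "GradientBoosting": GradientBoostingClassifier(random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
}

# Fit each model on the training split and score it on the held-out split.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    results[name] = accuracy_score(y_test, model.predict(X_test))

for name, acc in results.items():
    print(f"{name}: accuracy = {acc:.2f}")
```

Demographic attributes such as gender, age group, level of education and country could be appended as extra encoded columns of the feature matrix; how the study itself encoded them is not stated in this record.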

Item Type:Article
Full text:(VoR) Version of Record
Available under License - Creative Commons Attribution 4.0.
Publisher Web site:
Publisher statement:This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit
Date accepted:01 March 2021
Date deposited:14 April 2021
Date of first online publication:23 March 2021
Date first made open access:14 April 2021
