Student Voice

Achieving algorithmic fairness in machine learning models of student performance

By Eve Bracken-Ingram

At Student Voice, we aim to help students succeed in higher education. A key part of this goal is identifying the students most at risk of underperformance, so that preventative measures can be put in place. Machine learning can identify patterns within data and estimate the probability of certain outcomes, and may therefore provide the crucial ability to identify at-risk students early in their higher education journey.

In a recent paper, Karimi-Haghighi et al. [Source] developed a machine learning method to predict the risk of university dropout and underperformance from factors known at enrolment. Several factors identified in the literature as indicators of potential academic struggle were considered, including student demographics, high school type and location, and average admission grade. Although these indicators have been linked to higher education performance, it is important to note that underperformance may stem from a variety of personal or institutional reasons which are difficult to quantify. In addition, several subgroups are often over-represented in university dropout rates, a fact which must be carefully considered when assessing potential underperformance.

Algorithmic fairness in machine learning is extremely important. It has several, sometimes conflicting, definitions, as explored in a further Student Voice blog article [1]. In its simplest form, however, algorithmic fairness means ensuring that models do not display a discriminatory bias towards certain groups. Because machine learning models are trained on input data, bias present in that data can inadvertently be built into a model. In Karimi-Haghighi et al.'s exploration of student underperformance, fairness is measured through two error rate metrics: the generalised false positive rate (GFPR) and the generalised false negative rate (GFNR). Calibration was also monitored: a model is well calibrated when its predicted probabilities match the observed outcome rates, which allows results to be interpreted consistently across groups. Accuracy was assessed as the Area Under the ROC Curve (AUC). Models were passed through a bias mitigation procedure which aimed to equalise error rates whilst maintaining constant calibration [2].
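To make these metrics more concrete, the sketch below shows one way they might be computed per group, assuming scikit-learn is available; the function and variable names are hypothetical, not taken from the study. Following Pleiss et al. [2], the GFPR is the mean predicted score among actual negatives, and the GFNR is the mean shortfall of the score among actual positives.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def generalised_error_rates(y_true, y_prob):
    """Generalised error rates in the sense of Pleiss et al. [2]:
    GFPR = mean predicted probability over true negatives,
    GFNR = mean (1 - predicted probability) over true positives."""
    y_true, y_prob = np.asarray(y_true), np.asarray(y_prob)
    gfpr = y_prob[y_true == 0].mean()
    gfnr = (1.0 - y_prob[y_true == 1]).mean()
    return gfpr, gfnr

def per_group_report(y_true, y_prob, groups):
    """Compare fairness metrics across subgroups (e.g. gender, nationality).
    Assumes each group contains both outcome classes, so AUC is defined."""
    y_true, y_prob, groups = map(np.asarray, (y_true, y_prob, groups))
    for g in np.unique(groups):
        m = groups == g
        gfpr, gfnr = generalised_error_rates(y_true[m], y_prob[m])
        auc = roc_auc_score(y_true[m], y_prob[m])
        # Calibration check: mean predicted risk vs. observed outcome rate.
        calib_gap = y_prob[m].mean() - y_true[m].mean()
        print(f"{g}: GFPR={gfpr:.3f}  GFNR={gfnr:.3f}  "
              f"AUC={auc:.3f}  calibration gap={calib_gap:+.3f}")
```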

The model was trained and tested on a dataset of 881 computer science students, analysed per group by age, gender, nationality, academic performance, and high school type. In this dataset, foreign students were significantly more likely to underperform than their national counterparts, and students who failed a course or were required to resit an exam in first year showed a greater risk of dropout. The dataset also had a severe gender imbalance, so the SMOTE algorithm [3] was used to balance the distribution by synthesising new minority-class samples through interpolation between existing minority cases.
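As an illustration of this rebalancing step, SMOTE is implemented in the imbalanced-learn library; the snippet below applies it to toy data standing in for the student features (the data here is hypothetical, not the study's). Each synthetic sample is created by interpolating between a minority example and one of its nearest minority neighbours [3].

```python
from collections import Counter
import numpy as np
from imblearn.over_sampling import SMOTE  # pip install imbalanced-learn

# Toy imbalanced data standing in for the student features (hypothetical).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = np.array([0] * 180 + [1] * 20)  # 90/10 class imbalance

# SMOTE interpolates between a minority example and one of its
# k nearest minority neighbours to synthesise new samples [3].
X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)
print("before:", Counter(y), "after:", Counter(y_res))
```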

A Multi-Layer Perceptron with a hidden layer of 100 neurons was found to give the best results. It showed good AUC compared to existing studies, although the model was more accurate for male students and for students with lower admission grades than for their counterparts. Across groups, the models showed good equity in GFNR, meaning they were equally likely to falsely predict a negative outcome regardless of gender, age, nationality, high school or academic performance. The GFPR showed greater disparity, particularly against students with low admission grades, but fairness improved following bias mitigation, which also increased equity in GFNR and AUC across most groups.
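A minimal sketch of a comparable model, continuing from the hypothetical rebalanced data above and assuming scikit-learn: a Multi-Layer Perceptron with a single 100-neuron hidden layer, evaluated by AUC. This illustrates the architecture described, not the authors' actual pipeline or hyperparameters.

```python
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import roc_auc_score

# X_res, y_res: hypothetical rebalanced data from the SMOTE sketch above.
X_train, X_test, y_train, y_test = train_test_split(
    X_res, y_res, test_size=0.25, random_state=0, stratify=y_res)

# One hidden layer of 100 neurons, matching the reported architecture.
mlp = MLPClassifier(hidden_layer_sizes=(100,), max_iter=1000, random_state=0)
mlp.fit(X_train, y_train)

# Accuracy reported as Area Under the ROC Curve (AUC).
probs = mlp.predict_proba(X_test)[:, 1]
print("AUC:", roc_auc_score(y_test, probs))
```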

The creation of a model that can fairly predict dropout and underperformance from information known at enrolment opens up opportunities for educational improvement. Beyond providing additional support for at-risk students, further opportunities could also be offered to potentially successful students. Accurate prediction allows resources to be carefully and meaningfully allocated, and fairness ensures that certain student groups are not subject to discriminatory bias.

FAQ

Q: How does Student Voice plan to implement the findings of the Karimi-Haghighi et al. study into real-world educational settings?

A: Student Voice aims to translate the findings from the Karimi-Haghighi et al. study into practical strategies that educational institutions can use to support students effectively. This involves developing systems that can integrate the machine learning model into the existing student information systems of universities and colleges. By doing so, educational administrators and teachers can receive early alerts about students who may be at risk of underperformance or dropout. The idea is to use these insights to tailor support mechanisms, such as tutoring, counselling, and academic advising, specifically to the needs of at-risk students. The implementation would focus on collaboration with educational stakeholders to ensure that the interventions are both effective and sensitive to the diverse needs of the student body.

Q: What are the ethical considerations and potential privacy concerns associated with using machine learning to analyse students' data for predicting underperformance and dropout rates?

A: The use of machine learning to analyse students' data raises significant ethical considerations and privacy concerns. Ethically, it's crucial to ensure that the predictive models are used in a way that benefits students without stigmatising or unfairly labelling them based on their predicted risk of underperformance. Privacy concerns revolve around how students' data is collected, who has access to it, and how it is used. Student Voice advocates for transparent communication with students about how their data is being used and ensuring that data collection and analysis comply with data protection laws. Consent from students or their guardians, where applicable, is essential before their data is analysed. Additionally, maintaining the anonymity and security of student data is paramount to prevent misuse or breaches that could harm students' privacy.

Q: How does the inclusion of text analysis enhance the predictive capabilities of the machine learning models discussed by Karimi-Haghighi et al., if at all?

A: While the original study by Karimi-Haghighi et al. does not explicitly mention the use of text analysis, incorporating this method could significantly enhance the predictive capabilities of machine learning models in educational settings. Text analysis could involve examining students' written assignments, feedback from teachers, or even posts on educational forums to gain deeper insights into students' engagement, comprehension, and emotional states. This approach could identify subtle indicators of stress, confusion, or disengagement that are not evident through traditional data points like grades and attendance. By integrating text analysis, Student Voice could help educational institutions understand the nuanced challenges students face, enabling more personalised and timely interventions. This holistic approach ensures that support is not only based on academic performance but also considers the students' emotional and psychological well-being.

References

[Source] Karimi-Haghighi, M., Castillo, C., Hernández-Leo, D. and Moreno Oliver, V. (2021) Predicting Early Dropout: Calibration and Algorithmic Fairness Considerations. Companion Proceedings of the 11th International Conference on Learning Analytics & Knowledge.
DOI: 10.48550/arXiv.2103.09068

[1] Griffin, D. Definitions of Fairness in Machine Learning Explained Through Examples. Student Voice.

[2] Pleiss, G., Raghavan, M., Wu, F., Kleinberg, J. and Weinberger, K. Q. (2017) On Fairness and Calibration. Advances in Neural Information Processing Systems 30.
DOI: 10.48550/arXiv.1709.02012

[3] Chawla, N. V., Bowyer, K. W., Hall, L. O. and Kegelmeyer, W. P. (2002) SMOTE: Synthetic Minority Over-sampling Technique. Journal of Artificial Intelligence Research, 16, 321-357.
DOI: 10.48550/arXiv.1106.1813
