Durham University

Computer Science


Publication details for Professor Alexandra Cristea

Fonseca, Samuel C., Pereira, Filipe Dwan, Oliveira, Elaine H. T., Oliveira, David B. F., Carvalho, Leandro S. G. & Cristea, Alexandra I. (2020), Automatic Subject-based Contextualisation of Programming Assignment Lists, in Rafferty, Anna N., Whitehill, Jacob, Romero, Cristobal & Cavalli-Sforza, Violetta eds, Educational Data Mining 2020 (EDM). Virtual, EDM, 81-91.

As programming must be learned by doing, learners in introductory programming courses need to solve many problems, e.g., on systems such as 'Online Judges'. However, as such courses are often compulsory for non-Computer Science (non-CS) undergraduates, this may cause difficulties for learners who lack the intrinsic motivation for programming that CS students typically have. Contextualised assignment lists, with programming problems related to the students' major, could therefore enhance engagement in the learning process: students would solve programming problems related to their academic context, improving their comprehension of the applicability and importance of programming. Nonetheless, preparing these contextually personalised programming assignments for classes in different courses is laborious and would considerably increase the workload of instructors and monitors. Thus, this work aims, for the first time to the best of our knowledge, to automatically classify the programming assignments in Online Judges based on students' academic contexts, by proposing a new context taxonomy as well as a comprehensive pipeline evaluation methodology for cutting-edge, competitive Natural Language Processing (NLP) techniques. This pipeline allows for comparing state-of-the-art data augmentation techniques, classifiers, and NLP approaches. The context taxonomy contains 23 subject matters related to non-CS majors, and thus represents a challenging multi-class classification problem. We show that, even on this problem, our comprehensive pipeline evaluation methodology allows us to achieve an accuracy of 95.2%, which makes it possible to automatically create contextually personalised programming assignments for non-CS students with a minimal error rate (4.8%).