S. Martínez, M. Wimmer, J. Cabot: Efficient plagiarism detection for software modeling assignments. Computer Science Education, 2020, Vol. 30, Issue 2, pages 187-215. DOI: 10.1080/08993408.2020.1711495


Reports suggest that plagiarism is common in universities. While plagiarism detection mechanisms exist for textual artifacts, far fewer are available for non-code artifacts such as software design models, metamodels, or model transformations.

Objective:
To provide an efficient mechanism for the detection of plagiarism in repositories of Model-Driven Engineering (MDE) assignments.

Method:
Our approach is based on an adaptation of Locality Sensitive Hashing (LSH), an approximate nearest neighbor search mechanism, to the modeling technical space. We evaluate our approach on a real use case consisting of two repositories containing 10 years of student answers to MDE course assignments.
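To illustrate the general idea behind Locality Sensitive Hashing, the following is a minimal sketch of one common variant (MinHash signatures with banding). It is not the adaptation described in the paper: the flat string encoding of model elements (e.g. "Class:Person"), the parameter values, and all function names are assumptions made purely for illustration.

```python
import hashlib
from collections import defaultdict
from itertools import combinations

NUM_HASHES = 32             # signature length (illustrative value)
BANDS = 8                   # number of bands (illustrative value)
ROWS = NUM_HASHES // BANDS  # rows per band


def _hash(seed: int, token: str) -> int:
    """Deterministic 64-bit hash of a token, salted by the seed."""
    digest = hashlib.blake2b(
        token.encode("utf-8"), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_signature(features: set[str]) -> tuple[int, ...]:
    """MinHash signature of a non-empty feature set."""
    return tuple(min(_hash(seed, f) for f in features) for seed in range(NUM_HASHES))


def jaccard(a: set[str], b: set[str]) -> float:
    """Exact Jaccard similarity, used to confirm a candidate pair."""
    return len(a & b) / len(a | b)


def candidate_pairs(models: dict[str, set[str]]) -> set[tuple[str, str]]:
    """Return pairs of models whose signatures agree on at least one band."""
    buckets = defaultdict(list)
    for name, features in models.items():
        sig = minhash_signature(features)
        for band in range(BANDS):
            key = (band, sig[band * ROWS:(band + 1) * ROWS])
            buckets[key].append(name)
    pairs = set()
    for names in buckets.values():
        pairs.update(combinations(sorted(names), 2))
    return pairs


# Hypothetical submissions, each encoded as a set of model-element strings.
submissions = {
    "alice": {"Class:Person", "Attr:Person.name:String", "Class:Order"},
    "bob": {"Class:Person", "Attr:Person.name:String", "Class:Order"},
    "carol": {"Class:Sensor", "Attr:Sensor.unit:String", "Class:Reading"},
}

for a, b in sorted(candidate_pairs(submissions)):
    print(a, b, jaccard(submissions[a], submissions[b]))
```

Only models that land in the same bucket for at least one band are compared exactly, which avoids comparing every pair of submissions in a large repository. Identical feature sets always collide; near-copies collide with a probability that grows with their Jaccard similarity, so the band and row counts trade recall against the number of candidates to check.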

Findings:
We found that (i) plagiarism did indeed occur in the aforementioned course assignments, and (ii) our tool was able to detect these cases efficiently.

Implications:
Plagiarism detection must be integrated into the toolset and activities of MDE instructors in order to correctly evaluate students.
