ABSTRACTS OF ARTICLES OF THE JOURNAL "INFORMATION TECHNOLOGIES".
No. 9. Vol. 31. 2025
DOI: 10.17587/it.31.496-503
M. A. Bubnova, Leading Programmer,
National Research University Higher School of Economics, Moscow, Russian Federation
Automated System for Detecting Plagiarism in Files Containing Program Code
Received on 26.02.2025
Accepted on 06.03.2025
The article outlines the results of developing a system designed to identify plagiarism in files containing program code. The automated system integrates multiple methods of program code analysis. It serves as a decision-support tool for university instructors by enabling the detection of assignments with plagiarism levels exceeding a predefined threshold. The system is specifically designed to evaluate student submissions within the university, allowing for the comparison of works among students in the same cohort to identify groups with similar or copied content. The system supports the analysis of assignments stored both locally and in cloud storage.
Keywords: code plagiarism, text tokenization, abstract syntax tree (AST)
P. 496-503
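
The cohort-wide comparison described in the abstract can be illustrated with a minimal sketch. It is not the system presented in the article: it assumes Python submissions, a token-based comparison only (no AST analysis), Jaccard similarity over normalized token k-grams, and a hypothetical threshold of 0.8. The names `token_shingles`, `similarity`, and `suspicious_groups` are illustrative and do not come from the article.

```python
import io
import keyword
import tokenize
from itertools import combinations

# Token types that carry no information about program logic.
_SKIPPED = {tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
            tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER}


def token_shingles(source: str, k: int = 4) -> set:
    """Return the set of k-grams over a normalized token stream.

    Identifiers, numbers, and strings are replaced by placeholders so that
    renaming variables or changing literals does not hide a copy.
    """
    kinds = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in _SKIPPED:
            continue
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            kinds.append("ID")
        elif tok.type == tokenize.NUMBER:
            kinds.append("NUM")
        elif tok.type == tokenize.STRING:
            kinds.append("STR")
        else:
            kinds.append(tok.string)  # keywords, operators, punctuation
    return {tuple(kinds[i:i + k]) for i in range(len(kinds) - k + 1)}


def similarity(a: str, b: str) -> float:
    """Jaccard similarity of the two submissions' token k-gram sets."""
    sa, sb = token_shingles(a), token_shingles(b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def suspicious_groups(submissions: dict, threshold: float = 0.8) -> list:
    """Group submissions linked by pairwise similarity >= threshold.

    `submissions` maps a student identifier to source text. Groups are the
    connected components of the "similar enough" relation (union-find).
    """
    parent = {name: name for name in submissions}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in combinations(submissions, 2):
        if similarity(submissions[a], submissions[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in submissions:
        groups.setdefault(find(name), set()).add(name)
    return [members for members in groups.values() if len(members) > 1]
```

Calling `suspicious_groups` on a dictionary of one cohort's submissions returns only the groups with more than one member, which matches the decision-support role described above: the instructor reviews the flagged groups rather than every pair. Comparing all pairs is quadratic in cohort size, which is usually acceptable for a single student group.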