REDUNDANCY REDUCTION AND SENTENCE PRIORITISATION OF THE STUDENT LECTURE NOTES USING SOFT COSINE IMPLEMENTED MMR ALGORITHM

Authors

Baby, A., V, V., & Jose, J.

DOI:

https://doi.org/10.18623/rvd.v22.n4.3612

Keywords:

Maximal Marginal Relevance (MMR), Redundancy Reduction, Sentence Prioritisation, Soft Cosine, Contextual Tokenization

Abstract

Lecture notes play a pivotal role in supporting student learning. This study explores the integration of the Soft Cosine Measure (SCM) with the Maximal Marginal Relevance (MMR) algorithm to reduce redundancy and prioritize essential sentences in student lecture notes. Traditional cosine similarity often overlooks the semantic similarity between terms with different surface forms, leading to suboptimal redundancy reduction. SCM addresses this by accounting for semantic relationships between words, which improves the assessment of sentence relevance and uniqueness. In our approach, SCM is employed within the MMR framework to iteratively select sentences that maximize relevance to the main lecture content while minimizing redundancy with previously selected sentences. This hybrid method ensures that the final summary covers a broader range of the key concepts and topics discussed during lectures, providing a more comprehensive and coherent overview. Experimental evaluations on a dataset of lecture notes demonstrate that the SCM-implemented MMR algorithm significantly outperforms traditional summarization techniques in both reducing redundancy and maintaining the informativeness of summaries. The Soft Cosine Implemented MMR Algorithm (SCIMMR) achieves ROUGE scores between 0.821 and 0.901. This approach offers a robust solution for the automatic summarization of lecture notes, helping students review and study educational materials efficiently.
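
The selection loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names (soft_cosine, mmr_select), the trade-off weight lam, the four-word toy vocabulary, the hand-set term-similarity matrix S, and the use of the summed sentence vectors as a stand-in for "the main lecture content" are all assumptions made for illustration. The abstract does not specify how term similarities or the lecture-content vector are obtained.

```python
import numpy as np

def soft_cosine(a, b, S):
    """Soft cosine of bag-of-words vectors a and b under term-similarity matrix S."""
    num = a @ S @ b
    den = np.sqrt(a @ S @ a) * np.sqrt(b @ S @ b)
    return float(num / den) if den > 0 else 0.0

def mmr_select(sent_vecs, query_vec, S, k, lam=0.5):
    """Greedy MMR: return k sentence indices trading relevance against redundancy."""
    relevance = [soft_cosine(v, query_vec, S) for v in sent_vecs]
    selected, candidates = [], list(range(len(sent_vecs)))
    while candidates and len(selected) < k:
        def score(i):
            # Redundancy is the highest similarity to any already selected sentence.
            redundancy = max(
                (soft_cosine(sent_vecs[i], sent_vecs[j], S) for j in selected),
                default=0.0,
            )
            return lam * relevance[i] - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

# Hypothetical toy vocabulary: ["car", "automobile", "engine", "exam"].
S = np.eye(4)
S[0, 1] = S[1, 0] = 0.9          # assumed: "car" and "automobile" are near-synonyms

sents = np.array([
    [1.0, 0.0, 1.0, 0.0],        # "car engine"
    [0.0, 1.0, 1.0, 0.0],        # "automobile engine" (paraphrase of the first)
    [0.0, 0.0, 1.0, 1.0],        # "engine exam"
])
query = sents.sum(axis=0)        # assumed stand-in for the main lecture content

selected = mmr_select(sents, query, S, k=2)
hard_sim = soft_cosine(sents[0], sents[1], np.eye(4))   # plain cosine
soft_sim = soft_cosine(sents[0], sents[1], S)           # soft cosine
print(selected, hard_sim, soft_sim)
```

In this toy case plain cosine rates the two paraphrased sentences as only moderately similar, because "car" and "automobile" are different surface forms, while the soft cosine rates them as nearly identical. The MMR step therefore keeps one of the paraphrases and fills the second slot with the sentence that adds new content.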

Published

2025-11-21

How to Cite

Baby, A., V, V., & Jose, J. (2025). REDUNDANCY REDUCTION AND SENTENCE PRIORITISATION OF THE STUDENT LECTURE NOTES USING SOFT COSINE IMPLEMENTED MMR ALGORITHM. Veredas Do Direito, 22(4), e223612. https://doi.org/10.18623/rvd.v22.n4.3612