DESIGN AND EVALUATION OF A RETRIEVAL-AUGMENTED AI TUTOR FOR ACADEMIC ENGLISH WRITING IN SAUDI HIGHER EDUCATION
DOI:
https://doi.org/10.18623/rvd.v23.5888
Keywords:
Retrieval-Augmented Generation, Large Language Models, Academic Writing, EFL, Saudi Higher Education, Evaluation Metrics, Faithfulness, Governance, Learning Transfer
Abstract
Academic English writing plays a growing role in assessment, publishing, and employment, and it is a focus of initiatives aimed at improving national capability in Saudi Arabia [29,30]. At the same time, students increasingly use generative AI for writing, and validating its output is challenging: such tools can produce unlimited fluent text that contains false references, or take over the authoring process and mask the student's own reasoning behind polished prose. Retrieval-Augmented Generation (RAG) offers a technical remedy because it grounds generated text in passages retrieved from a source corpus, which makes outputs more readily auditable [3,4]. This paper reviews literature published between 2020 and 2025 on RAG architecture, retrieval evaluation, LLM adaptation, and academic integrity governance, with the aim of designing and evaluating a RAG-based AI tutor for academic English writing. The review follows PRISMA guidelines [1] and a design-synthesis methodology for systems [2], mapping failure modes to system controls and performance indicators. It yields (i) a reference architecture consisting of an authorized corpus, hybrid retrieval and re-ranking, schema-aware tutoring, quality gates, and monitoring, and (ii) two synthesis tables of design criteria and evaluation standards, along with deployment requirements.
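As a rough illustration of the "hybrid retrieval and re-ranking" and "quality gates" components named in the abstract, the following minimal Python sketch fuses two rankings with reciprocal rank fusion (RRF) and then applies a simple term-overlap gate. Everything here is an assumption made for illustration and is not taken from the paper: the toy `CORPUS`, the function names, the use of RRF as the fusion step, the bag-of-words stand-ins for BM25 and a dense encoder, and the `min_overlap` threshold.

```python
import math
from collections import Counter

# Hypothetical authorized corpus: passage id -> text (illustrative only).
CORPUS = {
    "p1": "A thesis statement presents the main claim of an academic essay.",
    "p2": "Paraphrase the source and cite it to avoid plagiarism.",
    "p3": "Topic sentences introduce the main idea of each paragraph.",
}

def tokenize(text):
    """Lowercase whitespace tokenizer that strips basic punctuation."""
    return [w.strip(".,?!").lower() for w in text.split() if w.strip(".,?!")]

def lexical_rank(query, corpus):
    """Rank passages by query-term occurrences (a crude stand-in for BM25)."""
    q = set(tokenize(query))
    scores = {pid: sum(1 for t in tokenize(txt) if t in q)
              for pid, txt in corpus.items()}
    return sorted(corpus, key=lambda pid: (-scores[pid], pid))

def vector_rank(query, corpus):
    """Rank by cosine similarity of bag-of-words vectors
    (a crude stand-in for a dense encoder)."""
    def cosine(a, b):
        dot = sum(a[t] * b[t] for t in a)
        norm = (math.sqrt(sum(v * v for v in a.values()))
                * math.sqrt(sum(v * v for v in b.values())))
        return dot / norm if norm else 0.0
    qv = Counter(tokenize(query))
    sims = {pid: cosine(qv, Counter(tokenize(txt)))
            for pid, txt in corpus.items()}
    return sorted(corpus, key=lambda pid: (-sims[pid], pid))

def rrf_fuse(rankings, k=60):
    """Reciprocal rank fusion: score(d) = sum of 1 / (k + rank of d)."""
    fused = Counter()
    for ranking in rankings:
        for rank, pid in enumerate(ranking, start=1):
            fused[pid] += 1.0 / (k + rank)
    # Ties are broken by passage id so the output is deterministic.
    return sorted(fused, key=lambda pid: (-fused[pid], pid))

def retrieve(query, corpus=CORPUS, top_k=2):
    """Hybrid retrieval: fuse the lexical and vector rankings, keep top_k."""
    rankings = [lexical_rank(query, corpus), vector_rank(query, corpus)]
    return rrf_fuse(rankings)[:top_k]

def quality_gate(query, passage_ids, corpus=CORPUS, min_overlap=2):
    """Pass only if the top passage shares at least min_overlap terms with
    the query. Naive: stopwords count as overlap in this sketch."""
    if not passage_ids:
        return False
    shared = set(tokenize(query)) & set(tokenize(corpus[passage_ids[0]]))
    return len(shared) >= min_overlap

if __name__ == "__main__":
    question = "What is a thesis statement?"
    hits = retrieve(question)
    if quality_gate(question, hits):
        print("Ground the tutor's answer in:", hits)
    else:
        print("No sufficiently relevant passage; decline or ask to rephrase.")
```

In a sketch like this, the gate sits between retrieval and generation, so a question with no supporting passage in the authorized corpus leads to a refusal or a clarifying prompt instead of an ungrounded answer. A production system would presumably replace the toy rankers with a real lexical index and embedding model and add a learned re-ranker.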
References
Alawwad HA, Alhothali A, Naseem U, Alkhathlan A, Jamal A. Enhancing textual textbook question answering with large language models and retrieval-augmented generation. Pattern Recognition. 2025;162:111332.
Bender EM, Gebru T, McMillan-Major A, Shmitchell S. On the dangers of stochastic parrots: can language models be too big? FAccT. 2021.
Bittle K, El-Gayar O. Generative AI and academic integrity in higher education: a systematic review. Information. 2025;16(4):296.
Bommasani R, Hudson DA, Adeli E, et al. On the opportunities and risks of foundation models. 2021.
Brown TB, Mann B, Ryder N, et al. Language models are few-shot learners. NeurIPS. 2020.
Dettmers T, Pagnoni A, Holtzman A, Zettlemoyer L. QLoRA: efficient finetuning of quantized LLMs. 2023.
European Commission. Ethical guidelines on the use of AI and data in teaching and learning for educators. 2022.
European Parliament and Council. Artificial Intelligence Act. 2024.
Guu K, Lee K, Tung Z, Pasupat P, Chang M. REALM: retrieval-augmented language model pre-training. ICML. 2020.
Hu EJ, Shen Y, Wallis P, et al. LoRA: low-rank adaptation of large language models. 2021.
ISO/IEC. ISO/IEC 27001: information security management systems—requirements. 2022.
Karpukhin V, Oguz B, Min S, et al. Dense passage retrieval for open-domain question answering. EMNLP. 2020.
Kasneci E, Sessler K, Küchemann S, et al. ChatGPT for good? Opportunities and challenges for education. Learning and Individual Differences. 2023;103:102274.
Lewis P, Perez E, Piktus A, et al. Retrieval-augmented generation for knowledge-intensive NLP tasks. NeurIPS. 2020.
Lo CK. Impact of AI writing tools on student learning: a systematic review. Computers and Education: Artificial Intelligence. 2024;5:100163.
Nguyen Thi XH, Hoang Thien HV, Vuong KN, Nguyen TT. Enhancing writing skills through AI-powered tools: perceived benefits and challenges among EFL students. Discover Education. 2025;4:472.
NIST. Privacy framework 1.0. 2020.
NIST. Secure software development framework (SSDF). 2022.
NIST. Artificial intelligence risk management framework (AI RMF 1.0). 2023.
Nogueira R, Jiang Z, Lin J. Document ranking with a pretrained sequence-to-sequence model. Findings of EMNLP. 2020.
OpenAI. GPT-4 technical report. 2023.
Ouyang L, Wu J, Jiang X, et al. Training language models to follow instructions with human feedback. NeurIPS. 2022.
Page MJ, McKenzie JE, Bossuyt PM, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ. 2021;372:n71.
Saudi Ministry of Education. Higher education and research strategy aligned to Vision 2030. 2020.
Saudi Vision 2030. Vision 2030 annual report 2023. 2024.
Snyder H. Literature review as a research methodology: an overview and guidelines. Journal of Business Research. 2020;104:333–339.
Thakur N, Reimers N, Daxenberger J, et al. BEIR: a heterogeneous benchmark for zero-shot evaluation of information retrieval models. 2021.
UNESCO. AI and education: guidance for policy-makers. 2021.
UNESCO. Guidance for generative AI in education and research. 2023.
Zhai X. ChatGPT user experience: implications for education. Computers and Education: Artificial Intelligence. 2022;3:100085.
License
I (we) submit this article, which is original, unpublished, and of my (our) own authorship, for evaluation by the Veredas do Direito Journal, and agree that the related copyrights will become the exclusive property of the Journal. Any partial or total copy in any other printed or online communication vehicle dissociated from the Veredas do Direito Journal is prohibited without prior authorization, which should be requested in writing from the Editor in Chief. I (we) also declare that there is no conflict of interest between the article's theme, the author(s), and enterprises, institutions, or individuals.
I (we) recognize that the Veredas do Direito Journal is licensed under a CREATIVE COMMONS LICENSE.
Creative Commons Attribution 3.0 License


