DESIGN AND EVALUATION OF A RETRIEVAL-AUGMENTED AI TUTOR FOR ACADEMIC ENGLISH WRITING IN SAUDI HIGHER EDUCATION

Asad Shafi

doi:10.18623/rvd.v23.5888

Authors

Asad Shafi https://orcid.org/0009-0009-9277-8330

Keywords:

Retrieval-Augmented Generation, Large Language Models, Academic Writing, EFL, Saudi Higher Education, Evaluation Metrics, Faithfulness, Governance, Learning Transfer

Abstract

Academic English writing plays a growing role in assessment, publishing, and employment in Saudi Arabia, and it has drawn increasing attention from initiatives that aim to improve national capability [29,30]. At the same time, students increasingly use generative AI for writing, and validating its output is a challenge: these tools can produce unlimited fluent text that contains false references, take over the authoring process, and mask a student's own reasoning behind polished prose. Retrieval-Augmented Generation (RAG) offers a technical remedy because it grounds generated text in knowledge retrieved from a source corpus, which makes the output more readily auditable [3,4]. This paper reviews literature published between 2020 and 2025 on RAG architecture, retrieval evaluation, LLM adaptation, and academic integrity governance, with the aim of designing and evaluating a RAG-based AI tutor for academic English writing. The review follows PRISMA guidelines [1] and a design-synthesis methodology for systems [2], in which failure modes are mapped to system controls and performance indicators. The result is (i) a reference architecture consisting of an authorized corpus, hybrid retrieval and re-ranking, schema-aware tutoring, quality gates, and monitoring, and (ii) two synthesis tables of design criteria and evaluation standards, together with deployment requirements.
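The hybrid retrieval and quality-gate components named in the abstract can be illustrated with a small sketch. It is not the paper's implementation: the abstract does not say how the lexical and dense rankings are combined, so reciprocal rank fusion is assumed here as one common option, and the function names, the document identifiers, and the constant k = 60 are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores 1 / (k + rank) in every list where it appears.
    The fusion rule and k=60 are assumptions, not taken from the paper.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id so output is stable.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


def passes_citation_gate(cited_ids, retrieved_ids):
    """Hypothetical quality gate: a draft answer passes only if it cites
    at least one source and every cited source was actually retrieved."""
    return bool(cited_ids) and set(cited_ids) <= set(retrieved_ids)


# Hypothetical rankings over an authorized corpus of writing guidance.
lexical = ["guide_hedging", "guide_citation", "rubric_coherence"]
dense = ["guide_citation", "sample_abstract", "guide_hedging"]

fused = reciprocal_rank_fusion([lexical, dense])
top_passages = fused[:2]  # passages handed to the generator

print(top_passages)
print(passes_citation_gate(["guide_citation"], top_passages))
print(passes_citation_gate(["invented_source"], top_passages))
```

Documents ranked well by both retrievers rise to the top of the fused list, and the gate rejects any draft that cites a source outside the retrieved set, which is one simple way to make fabricated references detectable.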

References

Alawwad HA, Alhothali A, Naseem U, Alkhathlan A, Jamal A. Enhancing textual textbook question answering with large language models and retrieval-augmented generation. Pattern Recognition. 2025;162:111332.

Bender EM, Gebru T, McMillan-Major A, Shmitchell S. On the dangers of stochastic parrots: can language models be too big? FAccT. 2021.

Bittle K, El-Gayar O. Generative AI and academic integrity in higher education: a systematic review. Information. 2025;16(4):296.

Bommasani R, Hudson DA, Adeli E, et al. On the opportunities and risks of foundation models. 2021.

Brown TB, Mann B, Ryder N, et al. Language models are few-shot learners. NeurIPS. 2020.

Dettmers T, Pagnoni A, Holtzman A, Zettlemoyer L. QLoRA: efficient finetuning of quantized LLMs. 2023.

European Commission. Ethical guidelines on the use of AI and data in teaching and learning for educators. 2022.

European Parliament and Council. Artificial Intelligence Act. 2024.

Guu K, Lee K, Tung Z, Pasupat P, Chang M. REALM: retrieval-augmented language model pre-training. ICML. 2020.

Hu EJ, Shen Y, Wallis P, et al. LoRA: low-rank adaptation of large language models. 2021.

ISO/IEC. ISO/IEC 27001: information security management systems—requirements. 2022.

Karpukhin V, Oguz B, Min S, et al. Dense passage retrieval for open-domain question answering. EMNLP. 2020.

Kasneci E, Sessler K, Küchemann S, et al. ChatGPT for good? Opportunities and challenges for education. Learning and Individual Differences. 2023;103:102274.

Lewis P, Perez E, Piktus A, et al. Retrieval-augmented generation for knowledge-intensive NLP tasks. NeurIPS. 2020.

Lo CK. Impact of AI writing tools on student learning: a systematic review. Computers and Education: Artificial Intelligence. 2024;5:100163.

Nguyen Thi XH, Hoang Thien HV, Vuong KN, Nguyen TT. Enhancing writing skills through AI-powered tools: perceived benefits and challenges among EFL students. Discover Education. 2025;4:472.

NIST. Privacy framework 1.0. 2020.

NIST. Secure software development framework (SSDF). 2022.

NIST. Artificial intelligence risk management framework (AI RMF 1.0). 2023.

Nogueira R, Jiang Z, Lin J. Document ranking with a pretrained sequence-to-sequence model. Findings of EMNLP. 2020.

OpenAI. GPT-4 technical report. 2023.

Ouyang L, Wu J, Jiang X, et al. Training language models to follow instructions with human feedback. NeurIPS. 2022.

Page MJ, McKenzie JE, Bossuyt PM, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ. 2021;372:n71.

Saudi Ministry of Education. Higher education and research strategy aligned to Vision 2030. 2020.

Saudi Vision 2030. Vision 2030 annual report 2023. 2024.

Snyder H. Literature review as a research methodology: an overview and guidelines. Journal of Business Research. 2020;104:333–339.

Thakur N, Reimers N, Daxenberger J, et al. BEIR: a heterogeneous benchmark for zero-shot evaluation of information retrieval models. 2021.

UNESCO. AI and education: guidance for policy-makers. 2021.

UNESCO. Guidance for generative AI in education and research. 2023.

Zhai X. ChatGPT user experience: implications for education. Computers and Education: Artificial Intelligence. 2022;3:100085.
