ARTIFICIAL INTELLIGENCE IN LINGUISTICS: MODELING UNIVERSAL PHONOLOGICAL SYSTEMS FOR SUSTAINABLE COMMUNICATION

DOI:

https://doi.org/10.18623/rvd.v23.5126

Keywords:

Linguistic Typology, Interlinguistics, Universal Phonology, Orthography, Multilingual Fairness, NLP

Abstract

Objective: This study examines how artificial intelligence (AI) can be combined with empirical linguistic data to model universal phonological and orthographic systems. The broader aim is to contribute to more sustainable and inclusive tools for cross-linguistic communication. The work addresses a central challenge in contemporary linguistics: the lack of reproducible AI-driven methods that link computational modeling with theoretical analysis and that ensure fair and accessible use of digital language technologies.

Method: A mixed-method framework was adopted that integrates corpus-driven linguistic analysis with neural-network modeling. The empirical data were drawn from two open-access resources: PHOIBLE (Phonetics Information Base and Lexicon) and the r12a database (r12a.github.io). After standardization and tokenization, the datasets were processed with Python-based AI modules to extract frequency distributions, identify clusters and detect structural patterns. The analytical workflow followed a reproducible sequence of steps informed by PRISMA principles, in support of transparency and methodological rigor.

Originality/Relevance: The paper brings together corpus linguistics, interlinguistics and artificial intelligence to propose a data-driven approach for identifying phonological and orthographic patterns shared across languages. By combining extensive linguistic datasets with computational techniques, the study demonstrates the potential of AI to support sustainable knowledge infrastructures and more inclusive forms of digital communication, both of which are becoming central to innovation and strategic growth in the humanities.

Main conclusions: The analysis revealed a relatively small set of phonemes and grapheme correspondences that recur across a wide range of the world’s languages. These results offer empirical support for developing streamlined, accessible alphabetic systems and for designing universal auxiliary language models. The study further shows that AI-supported modeling can improve linguistic inclusivity and analytical precision, especially in low-resource and multilingual settings, while still relying on the interpretive judgement of human specialists.

Theoretical/methodological contributions: The research contributes to interlinguistics by combining the concept of language universals with contemporary AI techniques. It outlines a reproducible pathway for connecting empirical linguistic data with computational tools and theoretical interpretation. In doing so, the study supports the sustainable development of language technologies and clarifies how human expertise and artificial intelligence can work together to strengthen global communication.

Practical implications: Identifying a universal phoneme core and stable sound-script correspondences can streamline multilingual analytical workflows, reduce structural biases against non-Latin scripts and lower the overall cost of integrating low-resource languages into sustainable and reproducible knowledge systems.
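
The frequency-extraction step named in the Method can be illustrated with a minimal Python sketch. It is not the study's own pipeline: the toy rows, the function names `phoneme_shares` and `core_phonemes`, and the 0.75 threshold are all assumptions made for illustration. The sketch computes, for each phoneme, the share of inventories that contain it, and then keeps the phonemes whose share reaches the threshold as a candidate "core".

```python
from collections import defaultdict

# Toy (inventory_id, phoneme) rows. Hypothetical data for illustration only;
# these values are not taken from PHOIBLE or r12a.
ROWS = [
    ("lang_a", "m"), ("lang_a", "a"), ("lang_a", "k"), ("lang_a", "i"),
    ("lang_b", "m"), ("lang_b", "a"), ("lang_b", "k"), ("lang_b", "u"),
    ("lang_c", "m"), ("lang_c", "a"), ("lang_c", "t"), ("lang_c", "i"),
    ("lang_d", "m"), ("lang_d", "a"), ("lang_d", "k"), ("lang_d", "ʃ"),
]


def phoneme_shares(rows):
    """Return {phoneme: share of inventories that contain it}."""
    # Group phonemes by inventory; a set ignores duplicate rows.
    inventories = defaultdict(set)
    for inventory_id, phoneme in rows:
        inventories[inventory_id].add(phoneme)

    # Count how many inventories contain each phoneme.
    counts = defaultdict(int)
    for segments in inventories.values():
        for phoneme in segments:
            counts[phoneme] += 1

    total = len(inventories)
    return {phoneme: n / total for phoneme, n in counts.items()}


def core_phonemes(shares, threshold=0.75):
    """Return phonemes attested in at least `threshold` of inventories.

    The threshold is an assumed, adjustable cut-off, not a value from the study.
    """
    return sorted(p for p, share in shares.items() if share >= threshold)


shares = phoneme_shares(ROWS)
core = core_phonemes(shares)
print(core)
```

With real data, the rows would come from the standardized inventory tables rather than a hard-coded list, and the choice of threshold would need to be justified against the distribution of shares.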

References

Alaqlobi, O., Alduais, A., Qasem, F., & Alasmari, M. (2024). Artificial intelligence in applied linguistics: A content analysis and future prospects. Cogent Arts & Humanities, 11(1), 2382422. https://doi.org/10.1080/23311983.2024.2382422

Anderson, C., Tresoldi, T., Greenhill, S. J., Forkel, R., Gray, R., & List, J.-M. (2023). Variation in phoneme inventories: Quantifying the problem and improving comparability. Journal of Language Evolution, 8(2), 149-168. https://doi.org/10.1093/jole/lzad011

Baudouin de Courtenay, I. A. (1963). Vspomogatel’nyi mezhdunarodnyi yazyk [International auxiliary language]. In I. A. Baudouin de Courtenay, Izbrannye trudy po obshchemu yazykoznaniyu v 2 tomakh [Selected works on general linguistics in 2 volumes] (Vol. 2, pp. 144-160). Moscow: Izd-vo AN SSSR. (In Russian)

Cheng, S., Zhu, P., Liu, J., & Wang, Z. (2024). A survey of grapheme-to-phoneme conversion methods. Applied Sciences, 14(24), 11790. https://doi.org/10.3390/app142411790

Doucette, A., O'Donnell, T. J., Sonderegger, M., & Goad, H. (2024). Investigating the universality of consonant and vowel co-occurrence restrictions. Glossa: A Journal of General Linguistics, 9(1), 1-39. https://doi.org/10.16995/glossa.9373

Groenewald, E. S., Pallavi, P., Rani, S., Singla, P., Howard, E. M., & Groenewald, C. A. (2024). Artificial intelligence in linguistics research: Applications in language acquisition and analysis. Naturalista Campano, 28(1), 1253-1262. https://www.researchgate.net/publication/379239839

Hair, J. F., & Sabol, M. (2024). Leveraging artificial intelligence (AI) in competitive intelligence (CI) research. Journal of Sustainable Competitive Intelligence, 15(00), e0469. https://doi.org/10.24883/eagleSustainable.v15i.469

Ishida, R. (Ed.). (n.d.). r12a Scripts & Writing Systems App. World Wide Web Consortium (W3C). Available at: https://r12a.github.io/scripts/switch.html

Jespersen, O. (1928). An international language. London: Allen and Unwin.

Lammers, S., & Lasch, A. (2023). Linguistic framing of artificial intelligence: What language to use when talking about artificial intelligence. Chemie Ingenieur Technik, 95(7), 1012-1017. https://doi.org/10.1002/cite.202200226

Martinet, A. (1967). Les langues dans le monde de demain. Paris: Presses Universitaires de France.

Meillet, A. (1918). Les langues dans l’Europe nouvelle. Paris: Payot.

Micallef, L. O. (2025). Lingvistika neyrosetey kak paradigma sovremennoy nauki o yazyke [Neural network linguistics as a paradigm of modern language science]. World of Science, Culture, and Education, 1(110), 467-473. https://doi.org/10.24412/1991-5497-2025-1110-467-469

Micallef, L. O., & Yasnenko, I. P. (2024). Principles of international auxiliary languages creation on the base of essential and artificial languages. Macrosociolinguistics and Minority Languages, 2(1), 50-65. https://doi.org/10.22363/2949-5997-2024-2-1-50-65

Moran, S., & McCloy, D. (Eds.). (2019). PHOIBLE: Phonetics Information Base and Lexicon. Jena: Max Planck Institute for the Science of Human History. Available at: https://phoible.org

Pussaignolli de Paula, M., Noronha, M., Garcia Valente, U., Inacio Domingues, B. R., & Jahn Souza, L. (2024). Mapping of artificial intelligence and robotics technologies applied to offshore wind energy. Journal of Sustainable Competitive Intelligence, 15(00), e0474. https://doi.org/10.24883/eagleSustainable.v15i.474

Saussure, R. de. (1918). La structure logique des mots dans les langues naturelles, considérée au point de vue de son application aux langues artificielles. Berne: Büchler.

Wu, S., Ponti, E. M., & Cotterell, R. (2021). Differentiable generative phonology [Preprint]. arXiv. https://doi.org/10.48550/arXiv.2102.05717

Yang, B. (2025). Frequency distributions and phoneme associations in PHOIBLE. Proceedings of Speech Sciences, 17(3), 23-37.

Published

2026-03-02

How to Cite

Micallef, L. (2026). ARTIFICIAL INTELLIGENCE IN LINGUISTICS: MODELING UNIVERSAL PHONOLOGICAL SYSTEMS FOR SUSTAINABLE COMMUNICATION. Veredas Do Direito, 23, e235126. https://doi.org/10.18623/rvd.v23.5126