A HYBRID INTELLIGENCE FRAMEWORK FOR AI-ASSISTED QUESTIONNAIRE VALIDATION: INTEGRATING TEXT MINING AND LARGE LANGUAGE MODELS FOR CONSTRUCT VALIDITY

Authors

  • Sam Lubbe
  • Henry Mynhardt

DOI:

https://doi.org/10.18623/rvd.v23.5234

Keywords:

Artificial Intelligence, Computational Social Science, Hybrid Intelligence, Natural Language Processing, Questionnaire Validation, Survey Methodology

Abstract

As more survey results become available to researchers, understanding grows, yet making sense of it all becomes harder. Although direct responses from individuals are valuable, many scholars overlook them and rely on less accurate measures. What is missing is a strategy that leads from raw feedback to better questionnaires. This study introduces the Hybrid Intelligence Framework, which draws on several methods: the analysis of word patterns, of topics, and of the latent meanings behind words. Questions can hide flaws such as confusing wording or excessive difficulty, and this approach brings them into view. The framework is grounded in how people cope with cognitive strain and in how messages are matched by meaning. Machines assist but never take full control; people remain central, guiding every insight alongside the code. When researchers have a principled way to use machines to read text, social science gains both clarity and trust. Because the framework guides how computational tools are used, methods become clearer and remain open to scrutiny. When people fill out surveys, their responses begin to shape the instruments meant to capture their answers, making those instruments more effective. Clarity increases when each step is transparent rather than hidden behind guesswork, and machine assistance is useful only when the process listens to real voices. Good research builds on what is said, not just on what is assumed.
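As a rough illustration of the kind of check the abstract describes (surfacing confusing wording or steep difficulty, with a person making the final call), the sketch below flags questionnaire items using simple lexical heuristics. It is not the authors' method: the function names `item_flags` and `review_queue`, the thresholds, and the heuristics themselves are assumptions chosen only for illustration, and the framework's topic and semantic components are not represented here.

```python
import re

# Hypothetical thresholds, chosen only for illustration.
MAX_WORDS = 20               # items longer than this are flagged as long
MAX_LONG_WORD_RATIO = 0.30   # share of words with 9+ letters
LONG_WORD_LENGTH = 9


def item_flags(item: str) -> list[str]:
    """Return heuristic wording warnings for one questionnaire item."""
    words = re.findall(r"[A-Za-z']+", item)
    found = []
    if len(words) > MAX_WORDS:
        found.append("long item")
    long_words = [w for w in words if len(w) >= LONG_WORD_LENGTH]
    if words and len(long_words) / len(words) > MAX_LONG_WORD_RATIO:
        found.append("dense vocabulary")
    # "and"/"or" may signal two questions packed into one item.
    if re.search(r"\b(and|or)\b", item, flags=re.IGNORECASE):
        found.append("possibly double-barreled")
    return found


def review_queue(items: list[str]) -> dict[str, list[str]]:
    """Collect flagged items for a human reviewer; nothing is changed automatically."""
    queue = {}
    for item in items:
        found = item_flags(item)
        if found:
            queue[item] = found
    return queue


if __name__ == "__main__":
    sample = [
        "I am satisfied with my job.",
        "My supervisor and my colleagues support me.",
    ]
    for item, warnings in review_queue(sample).items():
        print(item, "->", warnings)
```

The output is only a queue of candidates: in keeping with the human-centred stance of the abstract, a reviewer decides whether a flagged item actually needs rewording.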

Published

2026-05-20

How to Cite

Lubbe, S., & Mynhardt, H. (2026). A HYBRID INTELLIGENCE FRAMEWORK FOR AI-ASSISTED QUESTIONNAIRE VALIDATION: INTEGRATING TEXT MINING AND LARGE LANGUAGE MODELS FOR CONSTRUCT VALIDITY. Veredas Do Direito, 23(8), e235234. https://doi.org/10.18623/rvd.v23.5234