António Pedro Costa, University of Aveiro (Portugal)
Researcher at the Center for Research in Didactics and Technology in Trainer Training (CIDTFF), Department of Education and Psychology, University of Aveiro and collaborator at the Laboratory of Artificial Intelligence and Computer Science (LIACC), Faculty of Engineering, University of Porto.
Reviewing articles, projects, and reports, among others, is one of the most relevant activities in science. Evaluation acts as peer approval, allowing research projects and reports to be improved, altered, and deepened. Science evolves through this dynamic and these interactions, which need to be constant. Could there be another way of doing science? On the other hand, the enormous number of scientific publications and international projects demands availability from the scientific community, which often translates into extra time and work for teachers and researchers. For several years, the COST New Frontiers of Peer Review action discussed various dimensions of paper reviewing. Artificial intelligence was explored through a simulator that made it possible to measure reviewers’ cognitive load (mental workload) during the article evaluation process. However, it was not anticipated that a machine could replace the reviewer. Are we close to being able to do without evaluators, or will we start a hybrid process in which, for example, the list of members of an event’s Scientific Committee shows Reviewer 1, Reviewer 2, and ChatGPT?
ChatGPT (Chat Generative Pre-trained Transformer) is an artificial intelligence chatbot developed by OpenAI that specialises in dialogue. The aim of this text is to prompt reflection on the potential impact of artificial intelligence, and more specifically ChatGPT, on scientific work, namely the evaluation of articles. To do this, I share an abstract of which I am a co-author. This abstract was submitted to the 7th World Conference on Qualitative Research and received three evaluations (double-blind review). A fourth evaluation, carried out by ChatGPT, is presented here.
What does a “CAQDAS expect” from a user when building the Methodological Design? Introduction: The tools, like the Computer-Assisted Qualitative Data Analysis Software (CAQDAS), this impact research work, particular in certain areas, like Qualitative Research in Social Sciences. These tools allow users to either obtain analysis services or even go deeper into the methodologies themselves, making sense of the features shown and their outcomes. Technological tools to support the analysis of qualitative data are increasing exponentially. Additionally, these tools allow us to organise and edit sources, improve coding (inductive and deductive) and explore data triangulation in a previously impossible way. CAQDAS can be defined as computer tools that help researchers develop and improve the quality of qualitative studies. Goals and Methods: The study will be conducted by a Living Systematic Review (Tailored PRISMA 2020 protocol). The literature search was conducted in Scopus and Web of Science databases. We use the keywords “QDAS” OR “CAQDAS” and the most relevant CAQDAS (“ATLAS.ti” OR “Dedoose” OR “MAXQDA” OR “Nvivo” OR “QDA Miner” OR “webQDA”). We limited the search to the 2022 year, only papers published in the Social Sciences area and articles that use “Qualitative Research” keywords. With this, we analyse the methodological design of papers to understand the influence of a CAQDAS in the procedures and processes of qualitative studies. Results: The results indicate that improving the methodological design of the articles analysed is necessary, giving greater focus to the procedures followed. The Living Systematic Review and Content Analysis technique permit monitoring of the features of CAQDAS in a double sense: to seek to provide features to satisfy users’ needs but also to look for what a CAQDAS (developers) expect from a user.
The first question put to ChatGPT was: what would you improve in the abstract?
ChatGPT: “I would suggest that the abstract include more information about the specific results of the systematic review and their implications for research in the social sciences. Additionally, including a brief context about CAQDAS and its importance in the analysis of qualitative data could be helpful for readers who are not familiar with the subject.”
Anything else?
ChatGPT: “Yes, I suggest that details about the inclusion and exclusion criteria used in the literature search and the final sample of articles included in the systematic review be added. This would allow readers to assess the results’ generalizability and limitations. Additionally, it would be helpful to mention any biases or limitations identified in the review and their potential impact on the results.”
What is the main limitation?
ChatGPT: “The main limitation mentioned in the abstract is that the systematic review was limited to articles published in 2022 and in the social sciences field that use keywords related to CAQDAS. This may restrict the included studies’ sample and affect the review results’ generalizability. Additionally, the limitation in terms of time may mean that some relevant and recent research was not included in the review.”
Anything else?
ChatGPT: “Furthermore, a critical evaluation of the methods used in the studies included in the systematic review and their consistency with good practices in qualitative research could be valuable. This would allow readers to assess the quality of the included studies and, consequently, the robustness of the review results. Additionally, comparing the review results with other relevant studies could provide more perspective and context on their relevance and importance.”
Can you recommend some literature?
ChatGPT: “I recommend that the author consider reading the following articles to improve the abstract:
- De Leeuw, E. D. (2008). “Handbook of Qualitative Research in Education.” San Diego, CA: Academic Press.
- Flick, U. (2018). “An Introduction to Qualitative Research.” Sage Publications Ltd.
- Creswell, J. W. (2014). “Research design: qualitative, quantitative, and mixed methods approaches.” Sage publications.
These books provide an in-depth introduction to qualitative research. They include information on the use of software for qualitative data analysis, the importance of having a solid methodological design, and critical evaluation of the quality of studies included in systematic reviews.”
You wrote about generalising, but is it relevant since it’s a qualitative study?
ChatGPT: “It is not common to generalise the results of qualitative studies to the general population, as qualitative research aims to understand the complexity and diversity of individual perspectives and experiences. Instead of generalising, qualitative research seeks to capture the richness and depth of data, often through detailed interviews, observations, and other subjective data collection methods. The goal is to describe and understand the perspectives and experiences of the individuals involved in the research, rather than applying general results to a wider population.”
So, do I need to worry about having an extensive sample to improve this abstract?
ChatGPT: “Correct. Having an extensive sample for this abstract is optional. The study aims to systematically review the literature to understand the influence of qualitative data analysis software (CAQDAS) on qualitative research methodologies in the social sciences. The sample of the systematic review will be limited to articles published in the social sciences field and using the keywords “Qualitative Research”. The systematic review will allow monitoring of the characteristics of CAQDAS and analysis of the methodological design of the articles to understand the influence of software on qualitative research.”
To justify this evaluation on your part, I need some references about ChatGPT.
ChatGPT: “OpenAI, a research-based artificial intelligence organisation, trained the ChatGPT language model. There is no single reference about ChatGPT, but several articles and studies have been published on Transformer-based language models, which are the underlying architecture of ChatGPT. Some examples include:
- Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I. (2017). Attention is All You Need. In Advances in Neural Information Processing Systems (pp. 5998-6008).
- Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
These articles provide an overview of the Transformer architecture and how it is used for natural language processing tasks. In addition, several studies evaluate the effectiveness of ChatGPT and other Transformer-based language models in solving specific tasks, such as question answering and text generation.”
Should I keep your evaluation anonymous?
ChatGPT: “As an artificial intelligence created by OpenAI, I do not have my own identity or personal preferences. Therefore, my evaluation is always anonymous.”
I propose you join the Scientific Committee of the World Conference on Qualitative Research.
ChatGPT: “As an artificial intelligence, I cannot be a member of the Scientific Committee of the World Conference on Qualitative Research. However, I can provide information and answer questions about qualitative research and other related topics.”
So, how can I ethically safeguard the evaluation you just made of this abstract?
ChatGPT: “To safeguard my evaluation, you must cite it correctly as ‘evaluation performed by ChatGPT, a language model trained by OpenAI’ and mention that it is an evaluation performed by a language model, not by a human expert. Additionally, you must respect OpenAI’s privacy and copyright policies.”
This dialogue could continue for hours. The goal of this text is not to draw conclusions, offer considerations, or pass any “judgment” but to encourage the reader to reflect. A few years ago, Springer and IEEE had to remove 120 “scientific” articles generated by Lero Lero. For those who are not familiar with it, Lero Lero is a content generator with artificial intelligence. ChatGPT was made available in late 2022. We do not yet seem ready to deal with this type of “black box”, in which most results are not predictable. At the same time, these algorithms push us out of our comfort zone and force us to think “outside the box.” The next few months promise to be very stimulating. Finally, could this text have been generated by ChatGPT?