Writing, creativity, and artificial intelligence. ChatGPT in the university context


Abstract

The main objective of the research is to study the creative potential of Artificial Intelligence (AI) for writing skills in an educational context. The research aims to provide evidence on the use of AI and contribute to its integration in the classroom as a support for the teaching-learning process. Two types of research designs were established: a descriptive and comparative non-experimental quantitative research, and a quasi-experimental pretest-posttest study. The sample consisted of 20 AI systems and 193 university students who were given Games 2 and 3 of the Spanish PIC-A test (“Creative Imagination Test for Adults”). The students repeated the games, assisted by ChatGPT, to compare the possible improvement of their productions. The findings reveal statistically significant differences between the AIs and the students in the indicators of fluency, flexibility, and narrative originality in Game 2. Furthermore, significant differences are found between students’ pre-test and post-test scores in fluency, flexibility, and narrative originality in Game 2 and in fluency in Game 3. Finally, the assistance provided by AI in writing tasks and verbal creativity is highlighted, and this should be considered in language teaching; in any case, AI cannot replace human intelligence and creativity.

Keywords

Artificial intelligence, writing, language teaching, verbal creativity, ChatGPT, Large Language Models

Palabras clave

Inteligencia artificial, escritura, enseñanza de lenguas, creatividad verbal, ChatGPT, modelos de lenguaje extensivos

Resumen

La investigación persigue estudiar las posibilidades creativas de los sistemas de Inteligencia Artificial (IA) para el desarrollo de la escritura en el contexto educativo. Se persigue aportar evidencias en el uso de la IA y contribuir al conocimiento de su integración en las aulas como apoyo al proceso de enseñanza-aprendizaje. Se establecen dos tipos de diseño: una investigación de corte cuantitativo no experimental de tipo descriptivo y comparativo, y un estudio cuasi-experimental de tipo pretest-postest. La muestra estuvo compuesta por 20 sistemas de IA y 193 estudiantes universitarios, a los cuales se les aplicaron los juegos 2 y 3 del test español PIC-A («Prueba de Imaginación Creativa para Adultos»). El alumnado repitió los juegos con ayuda de ChatGPT, con el fin de comparar la posible mejora de sus producciones. Los resultados destacan la existencia de diferencias estadísticamente significativas entre las IA y el alumnado en los indicadores de fluidez, flexibilidad y originalidad narrativa del juego 2. Además, se encuentran diferencias significativas entre las puntuaciones del pretest y postest del alumnado en fluidez, flexibilidad y originalidad narrativa del juego 2, así como en fluidez del juego 3. Finalmente, se pone de manifiesto la ayuda que la IA proporciona en tareas de escritura y creatividad verbal, lo que debería ser tenido en cuenta en la enseñanza de lenguas; en cualquier caso, la IA no puede reemplazar a la inteligencia y la creatividad humana.


Introduction

The rapid development of Artificial Intelligence (AI) is a reality that involves multiple opportunities, risks, and challenges in the field of education, which have so far outpaced policies and legislative frameworks. In this regard, UNESCO (2019) committed to harnessing the potential of AI technologies in order to advance towards Sustainable Development Goal 4 (ensure inclusive and equitable quality education and promote lifelong learning opportunities for all) and achieve the Education 2030 Agenda. AI originally emerged as a tool to simulate and mechanise human thought processes (Turing, 1950); today it is regarded as a basic grammar of our century, and developing citizens’ AI literacy and competencies has become an educational aim (UNESCO, 2021). Moreover, UNESCO aims to achieve a human-centred approach, based on principles of inclusion and equity, in order not to widen technological gaps and to ensure “AI for all” in terms of innovation and knowledge. In this sense, one of the most important challenges is to ensure that AI is designed and used in an ethical and responsible way, in order to avoid the misuse of technology or the widening of existing inequalities in society (UNESCO, 2022a).

Recently, the OECD (2021) has examined the ways in which smart technologies are changing classroom education and the management of educational organisations and systems. There are already various ways in which AI can be involved in teaching, from OER (Open Educational Resources) content recommendation, student emotion detection, intelligent tutoring systems, and AI-powered teaching assistants to automatic exam marking and automatic forum monitoring (Flores-Vivar & García-Peñalvo, 2023).

Several AI systems are used to generate text automatically, based on “Large Language Models” (LLMs). Their origin dates back to 2017, when the Transformer architecture (Vaswani et al., 2017) was presented: neural networks that learn through attention mechanisms. Experiments on two machine translation tasks showed that these models achieve high output quality while requiring less training time. Because this architecture can learn context from sequential data, experts consider it the beginning of LLMs. Subsequently, in 2018, Google launched a research project based on Natural Language Processing (NLP), a technology that seeks interaction between human language and computers, bringing together the disciplines of Applied Linguistics, Computer Science, and AI. Google presented BERT (“Bidirectional Encoder Representations from Transformers”), a state-of-the-art NLP system that would support its well-known search engine. Thus, LLMs are deep learning algorithms that can recognise, summarise, translate, predict, and generate text or other content based on knowledge acquired from massive data sets. This learning is unsupervised, as a given amount of data is fed to the model without explicit instructions on what to do with it. Among the various applications of these large language models (Lee, 2023) are ChatGPT and other systems, which have been trained to answer questions or follow specific writing instructions.

These AI systems can write in a particular tone (humorous, familiar, professional, witty, friendly), rewrite or paraphrase sections of a given text, write after a title, or write in Shakespearean style. They represent an important challenge for language teaching and, in particular, for work on written expression and the development of creative writing. Creativity is a capacity of the human mind (Csikszentmihalyi, 1998; Guilford, 1950; Sternberg, 1999), but, as Boden (2004) pointed out, computers and creativity can be interesting partners in two ways: for understanding human creativity and for producing computational creativity. This author explains creativity from a scientific approach that uses computational concepts from the field of AI, concepts that allow us to create and test hypotheses about the structures and processes that may be involved in human thought.

With respect to computational creativity, it should be noted that humans play a fundamental role in programming, choosing models, and fine-tuning AI systems. However, if we consider Boden’s (2004) three types of creativity (combinatorial, exploratory, and transformational), computers can generate ideas that at least appear to be creative. Moreover, he argues that computers can produce new ideas and help people to do so; both their failures and their successes make it possible to think more clearly about the creative power of human beings. Miller (2019) also questions whether AI can be creative, based on a review of various applications in the fields of visual arts, music, text, and musical theatre. He identifies the essential factors for the creative process, from the need for introspection to the ability to discover the key problem, and concludes that computers can be as creative as humans. On the other hand, Ward (2020) suggests that computational creativity is not (and does not need to be) equivalent to human creativity but could be different and even contribute to new processes and outcomes that could be considered creative; machines do not need to be more like humans, but humans could recognise the creative capabilities inherent in the machine. This is therefore the interesting direction that the new era of AI could take for the development of teaching and learning processes.

In this sense, language teaching should not avoid AI systems, considering the growing expansion of digital environments for writing and the significant presence of such environments in educational contexts. Writing, AI, and creativity become fundamental aims for new models of language teaching. Moreover, creativity cannot be separated from culture and the use or learning of a language (Argondizzo, 2012). Along these lines, creative thinking has taken on an important educational dimension, as it has been included as a new assessable competence in PISA (OECD, 2019); this programme assesses the area of creative expression (where the domain of both written and visual expression is located), in addition to the area of knowledge creation and creative problem-solving. Far from being confined to artistic and inventive practices, creativity is an effective competence for problem-solving in all types of challenging situations (Rodrigo-Martín et al., 2022), which is why it is advisable to develop it in the educational environment. Moreover, creative thinking is identified as one of the prerequisite competencies which should be promoted by Member States for AI education, according to UNESCO (2022a).

Considering all the above, chatbots and the communicative interaction enabled by AI systems pose a challenge to the teaching practice of language teachers. Several studies have shown that the integration of AI improves the quality and effectiveness of foreign language teaching, as it favours an individualised and cooperative learning style (Sun et al., 2021; Yanhua, 2020). In this sense, given the variety of tools, programmes, and resources available to learners, it is crucial to study their possibilities, limitations, and ways of use, both for the benefit of adequate acquisition of the necessary competencies and for ethical use. Moreover, in the educational environment, AI systems must be subject to strict requirements when monitoring learners, assessing their skills, or predicting their behaviour. AI must support the learning process without reducing cognitive abilities. Furthermore, the information collected from learners in their interactions with AI systems shall not be subject to illicit use, misappropriation, or criminal exploitation (UNESCO, 2022a).

Several initiatives have been addressing its potential educational use. For example, UNESCO (n.d.) is developing the portal “Teaching AI for K-12” (students aged 5-18), which includes resources to be used by teachers. In addition, in order to ensure that AI writing support tools use correct Spanish, the Royal Spanish Academy (RAE, n.d.) has led the project “Lengua Española e Inteligencia Artificial” (LEIA in Spanish) [Spanish Language and Artificial Intelligence]. It has also developed the MarIA project (Gutiérrez-Fandiño et al., 2022), a family of large language models in Spanish, made available to industry and research, built on the RoBERTa-base, RoBERTa-large, GPT2, and GPT2-large models. MarIA was created to address the language imbalance in GPT, as most LLMs have been built in English. In addition, AI is already being used as a co-worker in some companies, according to Montero (2023), who recently pointed out that OpenAI’s chat has been included in a design project he is working on. In education, AI is also beginning to be experimented with: Hendrickson, a lecturer in Digital Media at the University of Leeds, has her students use ChatGPT and other AI language models in their writing assignments to improve their academic writing, while also testing how the technology works, enabling students to critically reflect on and question this text-generating resource (Renbarger, 2023). Similarly, the present research aims to approach the use of AI in teaching classrooms, as a companion and support, in order to integrate its use and check what kind of support it can provide. In the absence of solid and proven evidence on the effectiveness and impact of AI on students’ academic achievement, research is needed in this area.
Research needs to be carried out in a future that is yet to be defined, as AI is a very powerful technology and the challenge is to “discover ways to use it with meaningfulness and awareness” (Selwyn et al., 2022: 143).

Therefore, the general objective of the present research is aimed at studying the creative possibilities of AI systems for the development of writing in the educational context. This general objective is articulated in the following specific objectives:

  • To determine the level of the creativity indicators (fluency, flexibility, and narrative originality) of AI systems and students, based on the application of the Creative Imagination Test for Adults (PIC-A) by Artola et al. (2012).

  • To compare the creativity of the AI systems with that of the students, based on the indicators of fluency, flexibility, and narrative originality.

  • To compare the scores of the creativity indicators obtained by the students in the two phases of application of the tests, the first without any type of aid and the second using an AI system (ChatGPT).

Materials and methods

Design

Aiming at addressing the objectives formulated, two types of research design were established. Firstly, a descriptive and comparative non-experimental or ex post-facto quantitative research study was carried out, in which the level of creativity achieved by both the AI systems and the students was analysed, and the extent of the relationship between the creativity of the AIs and the students was studied.

Secondly, a quasi-experimental pretest-posttest study was carried out, where the creativity indicators obtained by the students, both before and after a learning intervention with the help of ChatGPT, were evaluated and compared.

Sample and participants

The sample consisted of 20 Large Language Model (LLM) systems with OpenAI/GPT-3 technology (except Dupla.ai, a Chrome extension), which have the function of automatically generating text from a given instruction (Table 1). In the selection of AI systems, the following inclusion criteria were taken into account: open access (free of charge or “Freemium”) and online execution without prior download. These criteria are explained by the intended use of AI in an educational context.

Similarly, 193 students (153 women and 40 men) aged 18 to 50 years old, enrolled in the Degree in Primary Education and the Master’s Degree in Research and Innovation in Early Childhood Education and Primary Education at the University of Murcia, the Degree in Fine Arts at the University of Salamanca and the double Degree in Audio-visual Communication and Journalism at the Miguel Hernández University of Elche participated in the study. A non-probabilistic, purposive, convenience sampling procedure was used.

https://typeset-prod-media-server.s3.amazonaws.com/article_uploads/7e1db807-81a3-4fd8-99a0-e29008ccc29a/image/3e8ca471-e51e-4f12-92b5-d98021b6609e-ueng-04-01.png

Tools

The instrument used was the Test of Creative Imagination for Adults (PIC-A) by Artola et al. (2012). Of the four games that make up the test, the second and third were used, as these are the ones that focus exclusively on verbal text. The instructions for the games are as follows:

  • Game 2. Make a list of all the things a rubber tube could be used for. Think of interesting and original things. Write down all the uses you could have for it, even if they are imagined. You can use any number and size of tubes you want. Example: “As a water pipe”.

  • Game 3. Imagine and answer what you think would happen if this sentence were true: What would happen if people never stopped growing? Example: “Stretchy clothes would sell much more”.

The indicators of creativity, assessed in both the second and third games, are as follows:

  • Fluency: the ability to produce a large number of ideas. A high fluency score corresponds to being able to make a large number of associations with a stimulus.

  • Flexibility: the ability to produce a wide variety of responses, related to different domains. A high flexibility score relates to the ability to search for solutions using a variety of alternatives and the ability to change perspective.

  • Narrative originality: the ability to produce ideas that are far from the obvious or established. A response is considered original when its frequency of occurrence is very low.

Finally, reliability was calculated using Cronbach’s alpha, which indicated that both game 2 (α=.803) and game 3 (α=.853) showed high internal consistency.
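For readers who wish to reproduce this reliability check, Cronbach’s alpha can be computed directly from a respondents-by-items score matrix; the sketch below uses illustrative scores, not the study’s data:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for a (respondents x items) score matrix."""
    k = items.shape[1]                          # number of items
    item_vars = items.var(axis=0, ddof=1)       # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)   # variance of the summed scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Illustrative data: 6 respondents x 3 items (not the study's scores)
scores = np.array([
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 5],
    [2, 2, 3],
    [4, 4, 5],
    [1, 2, 1],
])
print(round(cronbach_alpha(scores), 3))
```

Values above .80, as reported for both games, are conventionally read as high internal consistency.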

Procedure

Firstly, each of the AI systems was asked to perform the two games of the PIC-A. The instruction for each game was typed into the chats and text boxes of the systems and the response provided was collected. Given the speed with which the systems generate text, instead of using the time limit set by the instrument, the information was requested on three consecutive occasions, sending the same instruction each time. Subsequently, the automatically generated text was corrected, based on the indicators of fluency, flexibility, and narrative originality, according to the correction notebook established in the instrument (Artola et al., 2012). It should be noted that, since these systems learn from previous interactions, the two PIC-A games were presented as the first action after accessing each system, to ensure that no prior circumstance could have influenced the response.

As far as the research with the students is concerned, one-hour sessions were used with each of the groups. Firstly, in the pre-test phase, the students played the two creativity games, following the instructions and duration provided in the manual (10 minutes each). Secondly, the students registered in ChatGPT (the AI selected for its significant diffusion compared to others designed previously, its media impact, and the current debate on its use in the educational field), its functioning was explained, and they carried out test queries. Thirdly, the students performed the games again (post-test phase), this time with the help of the answers provided by ChatGPT; it was explained to them that they had to improve the content they had written when they took the test for the first time, so it was not a question of taking the tests again without taking into account what they had done previously. The learners could ask the chat whatever they considered necessary in order to improve their text, either by entering the test instruction literally or by asking other questions aimed at improving their written productions.

Participants were informed about the confidentiality of the data and research objectives, following the ethical standards indicated in The Code of Good Practice on Research of the University of Murcia (2022), and any doubts that arose at the time of application were addressed.

Data analysis

The indicator scores for creativity were calculated according to the correction instructions and category system of Artola et al. (2012). The direct scores are calculated from the following indications: a) Fluency is obtained by counting the total number of different answers given; b) Flexibility refers to the number of different categories (according to the established scale) in which at least one answer is classified; c) Narrative originality is the result of multiplying the number of answers in each category by the coefficient given to that category and their total sum. The determination of high or low creativity is made on the basis of the table of averages of the indicators in the different games, provided by the instrument manual (Artola et al., 2012: 86).
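These correction rules can be sketched in code; the responses, category labels, and coefficients below are illustrative placeholders, not the PIC-A correction notebook:

```python
from collections import Counter

# Each response is tagged with its category; each category carries an
# originality coefficient (all values here are illustrative, not PIC-A's).
responses = ["water pipe", "skipping rope", "sculpture", "periscope", "garden hose"]
categories = {"water pipe": "plumbing", "garden hose": "plumbing",
              "skipping rope": "play", "sculpture": "art", "periscope": "optics"}
coefficients = {"plumbing": 0, "play": 1, "art": 2, "optics": 3}

fluency = len(responses)                 # a) total number of different answers
used = [categories[r] for r in responses]
flexibility = len(set(used))             # b) number of distinct categories used

# c) Narrative originality: answers per category times that category's
# coefficient, summed over categories.
counts = Counter(used)
originality = sum(n * coefficients[c] for c, n in counts.items())

print(fluency, flexibility, originality)  # prints: 5 4 6
```

The final high/low determination then compares these direct scores against the averages table in the manual (Artola et al., 2012: 86).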

Once the correction had been made, the descriptive statistics of the different variables were analysed in order to find out the level of creativity of the AI systems and of the students, according to the first specific objective. The Kolmogorov-Smirnov normality test was then applied to determine the relevance of the use of parametric tests. This test showed that fluency (p<.05), flexibility (p<.05), and narrative originality (p<.05) did not follow a normal distribution. Thus, in order to compare the creativity of the AI systems and the students, as pursued by the second specific objective, the Mann-Whitney U test was performed. Finally, to answer the third specific objective, the Wilcoxon test was performed.
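This sequence of tests can be reproduced with SciPy; the arrays below are synthetic placeholders standing in for the study’s scores, and the effect-size step follows the common r = Z/√N convention rather than a procedure stated in the text:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Synthetic placeholder scores (the study's data are not reproduced here)
ai_fluency = rng.integers(5, 40, size=20).astype(float)     # 20 AI systems
student_pre = rng.integers(3, 25, size=193).astype(float)   # 193 students, pretest
student_post = student_pre + rng.integers(0, 10, size=193)  # posttest

# Normality check: Kolmogorov-Smirnov against a normal fitted to the sample
ks = stats.kstest(student_pre, "norm",
                  args=(student_pre.mean(), student_pre.std(ddof=1)))

# Independent groups (AI vs. students): Mann-Whitney U test
mw = stats.mannwhitneyu(ai_fluency, student_pre, alternative="two-sided")

# Paired pretest-posttest comparison: Wilcoxon signed-rank test
wx = stats.wilcoxon(student_pre, student_post)

# Effect size r = |Z| / sqrt(N), recovering Z from the two-sided p-value
z = stats.norm.ppf(wx.pvalue / 2)
r = abs(z) / np.sqrt(len(student_pre))
print(ks.pvalue, mw.pvalue, wx.pvalue, round(r, 2))
```

Non-parametric tests were chosen because the normality check failed (p<.05) for all three indicators.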

Results

The results are presented below according to the specific objectives formulated. The first specific objective focused on finding out the level of the creativity indicators of the AI systems and the students. A descriptive statistical analysis was carried out of the scores obtained after correcting the games played by the AI systems and the students in terms of fluency, flexibility, and narrative originality. Firstly, with respect to the AI systems (Table 2), it was observed that the means obtained in game 2 are higher than the means indicated in the instrument manual for the indicators of fluency (M=15.54), flexibility (M=9.75) and narrative originality (M=11.15). However, in game 3, the means are lower than those given in the manual for fluency (M=13.78), flexibility (M=8.75) and narrative originality (M=8.12) (Artola et al., 2012: 86).

In game 2, none of the AIs falls below the averages for fluency and narrative originality indicated in the manual. In fluency, ASKtoAI (DS, direct score: 81), Youchat (DS: 65), Writesonic (DS: 41) and Dupla (DS: 40) stand out. In narrative originality, ASKtoAI (DS: 87), ChatGPT (DS: 85), Copy.AI (DS: 60), Unbounce (DS: 49) and Anyword (DS: 47) stand out. In flexibility, not all AIs exceed the manual average, although Dupla (DS: 29), Canva's Magic Write (DS: 19) and Anyword (DS: 18) should be mentioned.

In game 3, the AIs with high scores in fluency are Neuroflash (DS: 30), ASKtoAI (DS: 20) and Jasper (DS: 20). In flexibility, Neuroflash (DS: 20), Writesonic (DS: 11), ChatGPT (DS: 10), ASKtoAI (DS: 10) and Nichesss (DS: 10) stand out. In narrative originality, ChatGPT (DS: 25), ASKtoAI (DS: 22) and Neuroflash (DS: 18) scored highest.

https://typeset-prod-media-server.s3.amazonaws.com/article_uploads/7e1db807-81a3-4fd8-99a0-e29008ccc29a/image/cf0ea707-5527-414a-b3e6-556c83febe6e-ueng-04-02.png

Secondly, after performing the descriptive statistical analysis of the students’ scores (Table 3), it was discovered that the sample of the present study presented values below the means of fluency and flexibility of the two games, as well as below the means of narrative originality of game 2, as indicated in the manual of the Creative Imagination Test for Adults (Artola et al., 2012: 86).

https://typeset-prod-media-server.s3.amazonaws.com/article_uploads/7e1db807-81a3-4fd8-99a0-e29008ccc29a/image/97865409-a4a3-4ed0-acec-844fbd4f76cd-ueng-04-03.png

However, the mean of narrative originality of game 3 was above the mean indicated by the instrument referred to. If we look at the scores according to gender, women obtained a better mean in the narrative originality of both games, compared to that presented in the instrument's manual. In general, women showed better mean scores than men in all the creativity indicators in the sample of the present study.

For the second specific objective, in order to compare the creativity of the AI systems and the students, the Mann-Whitney U test was performed, which shows that there are statistically significant differences in the indicators of fluency, flexibility and originality in game 2 (Table 4).

https://typeset-prod-media-server.s3.amazonaws.com/article_uploads/7e1db807-81a3-4fd8-99a0-e29008ccc29a/image/a9ac694f-2897-4c18-967e-a0fdae1ea928-ueng-04-04.png

In contrast, no statistically significant differences were found in the indicators of game 3, indicating similar levels between the students (human intelligence) and the AI. However, it should be noted that students show a higher mean on the narrative originality indicator in game 3. Also, fluency and narrative originality in game 2 have moderate effect sizes, while the remaining indicators in games 2 and 3 have low effect sizes.

Following this, to address the third specific objective, aimed at comparing the scores of the creativity indicators obtained by the students in the two phases of application of the tests, the Wilcoxon test was performed, which revealed the existence of statistically significant differences between the pre-test and post-test scores in fluency, flexibility, and originality in game 2, as well as in fluency in game 3 (Table 5). On the other hand, fluency, and narrative originality in game 2 show a moderate effect size, while both the flexibility indicator in game 2 and the three indicators in game 3 show a low effect size.

https://typeset-prod-media-server.s3.amazonaws.com/article_uploads/7e1db807-81a3-4fd8-99a0-e29008ccc29a/image/c4c3fb4f-32cf-4961-88ae-43f21309bc12-ueng-04-05.png

Discussion and conclusions

The present research aimed to approach the study of the creative possibilities of AI systems for the development of writing in the educational context. We have managed to provide didactic evidence on the use of AI and to contribute to the knowledge of its integration in classrooms as educational support. The role that AI systems can assume in the new technological framework, where the writing-creativity-AI trinomial can begin to be contemplated in the new models of language teaching, has been verified.

Firstly, it is noticeable that the means obtained by the AIs in the three indicators of game 2 are higher than the means indicated in the instrument manual, while, on the contrary, the means achieved in the indicators of game 3 are lower. This situation is related to the type of instruction requested in each of the games and highlights the difference between AI systems and humans. While game 2 explores the possible uses of a real object, the statement in game 3 starts from an implausible situation and aims to assess a fanciful aspect of imagination, a form of thinking that is fundamental to creative behaviour. Uncreative subjects are unable to conceive of such possibilities and may struggle with this game (Artola et al., 2012), as happened to the AI systems. Moreover, the game requires insight into experience: while certain consequences are easy to discover, others require a deeper exploration of the subject matter and its problems (Guilford, 1950). Csikszentmihalyi (1998) already argued that creativity does not happen in an isolated mind, but in the interaction of the person with a socio-cultural context, with established symbolic rules and a field of experts who recognise innovation. Also, taking into account that fantasy (compared to invention, creativity, and imagination) is the freest of these activities, creativity being the sum of fantasy and invention (Munari, 2018), it is observed that AI responds more effectively to invention, seeking practical solutions. In relation to these results, it is striking that there is an AI (pymetrics.ai), as described by Sadin (2018), dedicated to selecting personnel for companies based on their creativity, adaptability, and flexibility. In this vein, Miller (2019) argues that to be truly creative, machines will need to enter the world.
In relation to the students, as has been noted in other studies (Donolo & Elisondo, 2007), women scored higher than men in the sample in all creativity indicators, as well as above the averages in the instrument manual in narrative originality in both games. In general, it can be stated that gender differences in creativity depend on social and cultural factors (Aleksić et al., 2016; Hora et al., 2022). In addition, after comparing the creativity of the AI systems with that of the students, the AI scores on the indicators in game 2 significantly outperformed the students’ scores, while in game 3 the creativity indicators showed similar levels in both groups. This result is consistent with the above, as AI scores were higher in game 2. AIs handle a large amount of information, more than a person can, and, as Goleman et al. (2016) stated, having accurate and abundant data is essential in the creative process, since the greater the knowledge of the details of a problem, the greater the probability of finding a solution. With respect to game 3, the students outperform the AI in narrative originality, although not significantly. Humans do not show the rigidity that the AI systems exhibited, which were unable to distance themselves from the statement in game 3 and find answers that could fit into surprising, new, non-obvious, and valuable categories (Boden, 2004; 2018).

Furthermore, a significant improvement in the students’ written productions after the use of ChatGPT has been found, specifically in all the indicators of game 2 and in fluency in game 3. This highlights the potential of human-AI collaboration in the writing process, as “AI is a means, not an end” (Breton, 2021): the tool worked as an assistant to the students in game 2, in line with the productive possibilities of AI technology, whereas flexibility and originality in game 3 did not improve, since, as already shown, the AIs were unable to imagine the effects of a fantasy situation. In addition to the explanations already offered, this result may also be due to the so-called “framing problem”: systems programmed for a specific purpose perform well at it but cannot successfully perform another activity for which they are not programmed (Bonami et al., 2020).

In light of these findings, writing with computational assistance suggests a rethinking of how writing and creativity are conceived. There are questions and concerns about academic honesty and plagiarism, as reflected in recent scientific (Else, 2023; O’Connor and ChatGPT, 2023) and mainstream (Sánchez, 2023; Vázquez, 2023) publications; indeed, in parallel to the growth of AIs, methods are being developed to detect texts written by them, such as “Classifier” (OpenAI, 2023). In any case, banning AI systems in education (Peirón, 2023) would be a losing battle, so instead of pretending that AI does not exist, it is time to train students to work with it, just as tools such as ChatGPT are already being used in science, with several researchers confirming their use (Hutson, 2022). Teachers should therefore reflect on the skills they teach and on how AI could help students generate ideas and develop their creativity when dealing with writing tasks.

On the other hand, having shown the AI’s assistance in terms of verbal creativity, it must be pointed out that AI cannot replace human intelligence and creativity. Montero (2023) pointed out that AI lacks judgement and stated that creativity is not having ideas but knowing when not to have more. The machine lacks context and does not know what is worth valuing. Therefore, the active participation of students in their learning process should be fostered, and they should be encouraged to act decisively, judiciously, and responsibly. Collaborative human-machine intelligence is required, as highlighted by the Instituto Nacional de Tecnologías Educativas y de Formación del Profesorado [National Institute of Educational Technologies and Teacher Training] (INTEF, 2022), noting the complex relationship between people and AI, as there are specific functions that can only be performed by humans and for which they must be trained (Holmes et al., 2019). Along these lines, Fyfe’s study (2022), like the present research, asked university students to write an essay using AI. Initially, applying AI to writing may have seemed like a shortcut, but the artificially generated text was difficult to control, deviated from the topic, and had to be revised across its different samples, which did not allow for the automatic integration of information in an extensive genre such as the essay.

On the basis of this study, further research in the field of writing is of interest, extending the study of creativity with the help of AI by means of narrative composition tasks; along these lines, books on the use of ChatGPT to write and structure long texts are beginning to appear on the market (Gade, 2023). In addition, it would be useful to broaden the human sample of participants to include other degrees or professional areas. Similarly, future research could delve deeper into the different domains of creativity, in line with the rapid advancement of AI in the creation of a wide variety of products. Thus, the present research can be complemented with other studies focused on the creation of images or musical sequences using AI. In short, there is a need for more in-depth studies on the use and exploitation of AI and, specifically, of ChatGPT in an educational context, as textbook publishers themselves are beginning to incorporate these resources into their content platforms, as is the case of Edelvives’ intelligent assistant that integrates ChatGPT (Edelvives, 2023). Lastly, the use of AI should be aimed at empowering teachers and not replacing or displacing them (UNESCO, 2022b). An AI literacy plan is required to train teachers in both technical skills and ethical-philosophical debates (Flores-Vivar & García-Peñalvo, 2023; UNESCO, 2022a). Similarly, changes in educational practice in the coming years will be determined by AI developments and will need to rely on research in full collaboration with teachers, educational leaders, and learners to ensure that appropriate educational policies are put in place (OECD, 2021).