Overview
For better or worse, AI chatbots like ChatGPT are poised to change the ways we work and communicate. Exactly how they will do so, and how effectively, remain open questions. The answers will become clearer as ChatGPT evolves alongside research into its effects.
Researchers are already putting ChatGPT and other AI software to the test. Their studies are beginning to shed light on the promises and pitfalls of using AI chatbots for work, school, and life, with outcomes like productivity, writing quality, and creativity attracting particular attention.
In this article, we summarize the existing research on generative AI software, with a particular emphasis on ChatGPT. We focus on controlled trials (i.e., experiments), studies with large samples, and other rigorously designed studies.
ChatGPT Usage
According to a study by Pew Research Center (2023), only 24% of Americans said they had ever used ChatGPT as of August 28, 2023. The share rises to 41% among young people aged 18-29 and to about a third among people with college degrees.
Greater use by younger people likely reflects usage for school work (both legitimate and illegitimate). It may also signal a long-term trend toward adoption of AI tools like ChatGPT, given young people’s tendency to adopt new technologies and keep using them. Greater use by those with college degrees suggests greater use in white-collar jobs. As explained in the next section, these workplaces stand to benefit the most from ChatGPT.
Productivity & Quality
There's reason to be optimistic about ChatGPT in the workplace. A paper by researchers at MIT (Noy & Zhang, 2023) found that using ChatGPT across a variety of occupation-specific tasks reduced time taken by 0.8 standard deviations (a large effect) and increased quality by 0.4 standard deviations (a moderate effect). Much of this effect came from ChatGPT reducing the need for rough drafting, which left more time for idea generation and revising.
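For readers unfamiliar with effects expressed in standard deviations, the sketch below shows one common way to compute such a figure, Cohen's d, which divides the difference between two group means by their pooled standard deviation. It is an illustration only: the function name `cohens_d` and the task times are hypothetical, and this is not necessarily the exact calculation used in the MIT study.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(treatment, control):
    """Standardized mean difference using the pooled sample standard deviation."""
    n1, n2 = len(treatment), len(control)
    s1, s2 = stdev(treatment), stdev(control)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_sd = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (mean(treatment) - mean(control)) / pooled_sd


# Hypothetical task times in minutes (made up for illustration, not study data).
control_minutes = [30, 27, 33, 25, 35]
chatgpt_minutes = [26, 23, 29, 21, 31]

d = cohens_d(chatgpt_minutes, control_minutes)
print(round(d, 2))  # -0.97, i.e. the ChatGPT group was about one SD faster
```

By Cohen's conventional benchmarks, values near 0.2, 0.5, and 0.8 are usually read as small, medium, and large effects, respectively. A negative value here simply means the assisted group took less time.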
ChatGPT seems to help lower-skilled workers most. While this helps reduce productivity inequalities in the workplace, it may also increase competition in the labor force for jobs that require strong writing skills. This increased competition could come from a larger pool of otherwise qualified applicants with weaker writing skills, as well as from a direct decrease in jobs that ChatGPT makes obsolete. Nevertheless, workers in the MIT study reported increased job satisfaction from using ChatGPT.
Creativity
Creative writing also seems to get a boost from using AI, albeit with a few caveats. In a pre-registered experiment with 293 people, short story writers who used a generative AI tool to assist with the creative writing process earned significantly higher creativity scores from independent evaluators than a human-only control group (Doshi & Hauser, 2023). As with the MIT study's results for productivity and quality, lower-skilled (i.e., less creative) participants benefited most from using a generative AI tool.
However, although the stories were perceived to be more creative when AI was used, textual analysis showed that they were actually more similar to each other. The findings echo results from a study of crowdsourced sustainability ideas (Boussioux et al., 2023). Whereas AI-generated ideas were judged to be more financially and environmentally valuable, human-generated ideas were judged to be more novel. Similarly, in a peer-reviewed study using a divergent thinking task, AI-generated ideas outperformed human ideas on average, but the best human ideas were as good as or better than the AI's (Koivisto & Grassini, 2023).
Thus, AI may increase creativity in the short term but reduce it in the long term. Such an outcome could be exacerbated by overreliance on AI. It's also an open question how algorithmic updates could influence such effects.
Ethical Concerns
Research abounds on the ethical concerns and outcomes of using generative AI tools like ChatGPT. For example, there is no shortage of research highlighting the very real problem of cheating and plagiarism with ChatGPT and similar tools. However, much of this research has been carried out by the organizations being disrupted by ChatGPT (e.g., tutoring organizations and AI detection firms). Given these conflicts of interest, one must consider the sources of such research, not just the research designs and findings.
Experts tend to agree that ChatGPT does enable corner cutting, from student essays to professors' publications. Research is mixed on our ability to spot such AI-generated works. For example, in a 2021 study, participants were given incentives to detect AI-generated text (i.e., a Turing Test). They succeeded in detecting AI-generated poems only when those poems were randomly selected. They failed when another human had chosen the best AI-generated poem from a series of them (Köbis & Mossink, 2021).
Another major issue with ChatGPT is its tendency to "hallucinate," or make up information or sources. In one study, scientists took 50 real abstracts from scientific journals and had ChatGPT produce a different set of 50 abstracts based on real journals and article titles. Human reviewers were able to identify 68% of the fake abstracts. However, they also falsely identified 28% of the original abstracts as AI-generated (Gao et al., 2023).
Thus, an additional problem arises. Once AI-generated text and images proliferate, we may even start to view human-generated content with suspicion. Fortunately, Gao and colleagues also found that AI detection tools were able to distinguish almost all of the AI-generated abstracts from the real ones. AI detection tools are becoming more common, but they will likely remain locked in a constant battle with the AI algorithms themselves as those algorithms continue to be updated.
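To put the human reviewers' figures in perspective, the sketch below combines the 68% detection rate and the 28% false-alarm rate reported above into two summary numbers: overall accuracy and precision (the share of "AI-generated" verdicts that are correct). It assumes an even mix of real and generated abstracts, matching the 50/50 design described above. The function name `detection_metrics` and its parameters are hypothetical and are not taken from the study.

```python
def detection_metrics(hit_rate, false_alarm_rate, share_generated=0.5):
    """Summarize reviewer performance from two rates.

    hit_rate: share of AI-generated abstracts correctly flagged.
    false_alarm_rate: share of real abstracts wrongly flagged.
    share_generated: assumed share of AI-generated abstracts in the mix.
    """
    share_real = 1 - share_generated
    true_pos = hit_rate * share_generated
    false_pos = false_alarm_rate * share_real
    true_neg = (1 - false_alarm_rate) * share_real
    flagged = true_pos + false_pos
    return {
        "accuracy": true_pos + true_neg,
        # Precision is undefined when nothing is flagged at all.
        "precision": true_pos / flagged if flagged > 0 else None,
    }


# Rates reported for human reviewers above; the 50/50 mix is an assumption.
metrics = detection_metrics(hit_rate=0.68, false_alarm_rate=0.28)
print({name: round(value, 2) for name, value in metrics.items()})
```

Under these assumptions, reviewers would be right about 70% of the time, and roughly 7 in 10 of the abstracts they flag as AI-generated would actually be AI-generated. Changing `share_generated` shows how precision falls when generated text is rarer than real text.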
References
Boussioux, L., Lane, J. N., Zhang, M., Jacimovic, V., & Lakhani, K. R. (2023). The Crowdless Future? How Generative AI Is Shaping the Future of Human Crowdsourcing. Working Paper.
Doshi, A. R., & Hauser, O. (2023). Generative Artificial Intelligence Enhances Creativity but Reduces the Diversity of Novel Content. (August 8, 2023). Working Paper.
Gao, C. A., Howard, F. M., Markov, N. S., et al. (2023). Comparing scientific abstracts generated by ChatGPT to real abstracts with detectors and blinded human reviewers. npj Digital Medicine, 6, 75.
Köbis, N., & Mossink, L. D. (2021). Artificial intelligence versus Maya Angelou: Experimental evidence that people cannot differentiate AI-generated from human-written poetry. Computers in Human Behavior, 114, 106553.
Koivisto, M., & Grassini, S. (2023). Best humans still outperform artificial intelligence in a creative divergent thinking task. Scientific Reports, 13, 13601.
Noy, S., & Zhang, W. (2023). Experimental evidence on the productivity effects of generative artificial intelligence. (March 2, 2023). Working Paper.
Pew Research Center. (2023). Most Americans haven’t used ChatGPT; few think it will have a major impact on their job. (August 28, 2023).