Generative AI for library and information professionals (draft)
Produced by the IFLA AI SIG, https://www.ifla.org/units/ai/
In compiling this resource we seek to provide a useful, non-technical guide for library and information professionals. We try to point to authoritative sources that are open to all.
If you have any comments or suggestions for improvement please email them to a.m.cox@sheffield.ac.uk
1. Introduction to generative AI
Generative AI refers to systems that can produce new text, images or other media.
These can be distinguished from descriptive AI, which focuses on improving access to content such as text, images, audio and video by identifying features in it to enhance search.
Resources:
Ortiz, S. (2023). What is generative AI and why is it so popular? Here’s everything you need to know. ZDNet, https://www.zdnet.com/article/what-is-generative-ai-and-why-is-it-so-popular-heres-everything-you-need-to-know/
UNESCO quick start guide to ChatGPT in higher education, https://www.iesalc.unesco.org/wp-content/uploads/2023/04/ChatGPT-and-Artificial-Intelligence-in-higher-education-Quick-Start-guide_EN_FINAL.pdf
1.1 Examples of generative AI
GPT models have been around for several years. The launch of ChatGPT by OpenAI propelled this form of AI into the headlines, probably because its chat interface made GPT so easy to use.
- ChatGPT, https://openai.com/blog/chatgpt
- New Bing, https://www.bing.com/new
- Bard, https://bard.google.com/
- HuggingChat, https://huggingface.co/chat/
Image generators:
- DALL-E, https://openai.com/dall-e-2
- Midjourney, https://www.midjourney.com/
- Niji Journey, https://nijijourney.com/en/
- Stable Diffusion, https://stability.ai/stablediffusion
Other AI:
- Hugging Face models, https://huggingface.co/models
- AI-based tools are proliferating, including in the “research tool” space, eg Scholarcy, Research Buddy, OpenRead
- Futurepedia, https://www.futurepedia.io/
- There’s an AI for that, https://theresanaiforthat.com/
1.2 How generative AI works
Resources:
A guide to Generative AI terminology: https://blog.scottlogic.com/2023/06/01/generative-terminology.html
2. Ethical and information issues
Generative AI such as ChatGPT has potential benefits (see uses below), but the ethics of AI should be considered before use.
The following have been raised as issues with some versions of generative AI, such as ChatGPT:
- Makes biased statements because of biases in the training data and how the training data was curated, eg GPT has been shown to be biased about gender, race etc (Webb, 2023).
- “Hallucinates” inaccurate information, feeding the flow of misinformation.
- Fails to acknowledge sources and often fabricates citations.
- Is not itself citable because it does not currently generate a consistent answer.
- Will accelerate the content creation explosion – leading to even more challenges of information overload.
- Lacks explainability because it is far from open about what data it is based on or how it works (Burruss, 2020).
- Puts privacy at risk if you share your data with it: many companies are blocking its use for fear of data loss. Students at many institutions are being advised not to put any personal data into bots.
- Violates copyright by using text and data from the open Internet as training data without permission and creates content heavily based on mined content (Mahari, Fjeld and Epstein, 2023).
- Threatens jobs, eg of journalists, editors and those in marketing. Axel Springer has already announced that they will replace some journalists with bots.
- Is available to people with money to subscribe, so disadvantages those who cannot pay.
- Was developed by exploiting very low paid Kenyan workers to detoxify content (Perrigo, 2023).
- May not be environmentally sustainable: GPT models are known to use a lot of computing power (Ludvigsen, 2022).
- Reveals the disruptive power in the hands of big tech companies.
The relative importance of these factors may vary by context, eg between higher education (where the impact on academic integrity is central to debate) and corporate research (where the inaccuracy of information is critical). In some contexts it may be possible to ban some forms of generative AI or to procure a localised system. For example, it is possible to run some open source AI models locally via Python or R without uploading private documents to the cloud.
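For readers who work with technical colleagues, the following is a minimal sketch of what running an open model locally in Python might look like. It assumes the open source Hugging Face transformers library (with a backend such as PyTorch) is installed; the model name "distilgpt2" and the helper names are illustrative choices, not recommendations. The model files are downloaded once from the Hugging Face hub, after which the document text is processed on the local machine and is not sent to a cloud chatbot.

```python
from typing import Callable, List


def split_into_chunks(text: str, max_words: int = 200) -> List[str]:
    """Split a long document into word-limited chunks that a small model can handle."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def load_local_generator(model_name: str = "distilgpt2") -> Callable[[str], str]:
    """Return a function that runs an open model on this machine.

    Assumption: the transformers library and a backend such as PyTorch are
    installed. "distilgpt2" is only an illustrative small model; the first
    call downloads its files, later calls reuse the local copy.
    """
    from transformers import pipeline  # imported lazily so the helpers work without it

    generator = pipeline("text-generation", model=model_name)

    def generate(prompt: str) -> str:
        # The pipeline returns a list of dicts, each with a "generated_text" key.
        result = generator(prompt, max_new_tokens=60)
        return result[0]["generated_text"]

    return generate


def process_privately(text: str, generate: Callable[[str], str],
                      max_words: int = 200) -> List[str]:
    """Run the local model over each chunk; the text itself stays on this machine."""
    return [generate(chunk) for chunk in split_into_chunks(text, max_words)]
```

With the libraries installed, a call such as process_privately(document_text, load_local_generator()) would return one generated passage per chunk. A model this small produces poor quality text; the point of the sketch is only that the private document is handled locally.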
Fundamentally, although generative artificial intelligence has enormous potential for innovation and undeniably has significantly more knowledge than any individual human, it lacks the ability to reason, consciousness and some of the most advanced human qualities.
Concerns raised by ChatGPT, among other factors, have re-energised plans to regulate AI, notably the planned EU AI Act (the Artificial Intelligence Act). The European Parliament reports (https://www.europarl.europa.eu/news/en/headlines/society/20230601STO93804/eu-ai-act-first-regulation-on-artificial-intelligence) that “Generative AI, like ChatGPT, would have to comply with transparency requirements:
- Disclosing that the content was generated by AI
- Designing the model to prevent it from generating illegal content
- Publishing summaries of copyrighted data used for training”
Bommasani et al. (2023) assess whether foundation model providers comply with the draft EU AI Act.
Resources:
AIAAIC (2023). ChatGPT chatbot. https://www.aiaaic.org/aiaaic-repository/ai-and-algorithmic-incidents-and-controversies/chatgpt-chatbot
AIID (2023). AI Incident Database. https://incidentdatabase.ai/?lang=en
Bender, E. M., Gebru, T., McMillan-Major, A., & Shmitchell, S. (2021). On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (pp. 610-623). https://doi.org/10.1145/3442188.3445922
Bommasani, R., Klyman, K., Zhang, D. and Liang, P. (2023). Do Foundation Model Providers Comply with the Draft EU AI Act? https://crfm.stanford.edu/2023/06/15/eu-ai-act.html
Burruss, M. (2020). The (Un)ethical Story of GPT-3: OpenAI’s Million Dollar Model. Last updated 27 July 2020. https://matthewpburruss.com/post/the-unethical-story-of-gpt-3-openais-million-dollar-model/
Floridi, L. (2023). AI as Agency Without Intelligence: On ChatGPT, Large Language Models, and Other Generative Models. Philosophy and Technology. Available at http://dx.doi.org/10.2139/ssrn.4358789
Ludvigsen, K. (2022). The carbon footprint of ChatGPT. Last updated 21 December 2022. https://towardsdatascience.com/the-carbon-footprint-of-chatgpt-66932314627d
Mahari, R., Fjeld, J. and Epstein, Z. (2023). Generative AI is a minefield for copyright law. The Conversation, https://theconversation.com/generative-ai-is-a-minefield-for-copyright-law-207473
Perrigo, B. (2023). Exclusive: OpenAI Used Kenyan Workers on Less Than $2 Per Hour to Make ChatGPT Less Toxic. Time, 18 January 2023. https://time.com/6247678/openai-chatgpt-kenya-workers/
Webb, M. (2023). Exploring the potential for bias in ChatGPT. Jisc blog, https://nationalcentreforai.jiscinvolve.org/wp/2023/01/26/exploring-the-potential-for-bias-in-chatgpt/
3. Uses
3.1 Uses for information professionals
ChatGPT, in its current versions, can be used for:
- Summarisation of texts, eg lay summaries of academic papers
- Generating draft metadata to describe material
- General uses such as drafting documents and communications, eg policy documents, targeted marketing
3.2 Guiding end users to safe uses
This section summarises how an information-literate user should be trained to approach generative AI tools: to evaluate consciously how to use them effectively and to understand critically the wider context of how platforms shape information experience. It is presented as a set of prompts.
- Learn how to use it effectively, by experiment and reading reviews
  - How should we conceive of this tool, eg as a clever writing assistant or a single point of truth?
  - Understand what tasks it might help with, eg brainstorming, drafting, editing, writing in different styles, summarisation
  - Is it trustworthy as a source of information: is the information supplied accurate and are sources given?
  - Are there systematic inaccuracies in the material it produces, ie biases?
  - How can questions be formulated to get the best answer, eg
    - Define style/audience
    - Repeat the request and synthesise answers
    - Ask for sources that can be checked
  - Are there alternatives that might be better for certain tasks?
  - Keep on learning: the tools are evolving rapidly
- Use it to improve how you learn and be reflective about how you are using it
  - Is it helping to improve your learning or just making things too easy?
  - How does using the tool make you feel?
  - Is it making you feel less connected to people?
- Protect your own privacy
  - What types of information is it safe to share with it?
- Ask who owns, develops and profits from it and the wider related system of information discovery on the platforms you use
  - Is it owned by commercially driven corporations so that use feeds their power?
  - Is it open about how it works?
  - Is recommendation actually narrowing access to information as a form of filter bubble?
  - Was it created exploitatively, eg by using low-paid labour or by mining information on the open web without permission?
  - Does it have a negative environmental impact?
  - Does everyone have equal access or is using it gaining an unfair advantage?
- Use it ethically, acknowledging how it is used:
  - Are you permitted to use it in this context, eg under what conditions (if any) does your institution permit its use?
  - Acknowledge its use appropriately for the context, eg there is an APA citation guide at https://apastyle.apa.org/blog/how-to-cite-chatgpt
Resources:
Example of a LibGuide on AI from the University of Bolton, https://libguides.bolton.ac.uk/ai
3.3 Guiding researcher end users
It remains unclear which uses of generative AI in research will come to be accepted as legitimate. There are many open questions about how generative AI could be used in science (Birhane et al. 2023).
Questions that will be important to researchers include:
- Which uses of AI in the research process are permitted, eg for transcription, simulation of data or writing papers?
- Which journals/publishers allow which sorts of uses?
Resources:
Birhane, A., Kasirzadeh, A., Leslie, D., & Wachter, S. (2023). Science in the age of large language models. Nature Reviews Physics, 1-4.
4. Wider AI resources
For our earlier listing of AI resources see “23 resources to get up to speed on AI in 2023”, https://www.ifla.org/23-resources-to-get-up-to-speed-on-ai-in-2023/
Some additional resources:
- Distributed AI Research Institute, https://www.dair-institute.org/about
- Global Partnership on Artificial Intelligence (GPAI), https://gpai.ai/
- International Research Centre on Artificial Intelligence (IRCAI) under the auspices of UNESCO, https://ircai.org/
- The Alan Turing Institute, https://www.turing.ac.uk/
About this document
Created by IFLA AI SIG
Version 30 06 2023