AVMS Webinar: Enhancing Access to Audiovisual Resources with the AI Model Whisper
This webinar provides an in-depth exploration of OpenAI's Whisper, an advanced AI-powered speech recognition model that can transcribe speech in 97 languages and translate it into English. Whisper's capabilities stem from its training on a diverse and extensive dataset of 680,000 hours of audio recordings. Its robustness comes from a deep neural network architecture that handles complex speech patterns and contextual nuances with a high degree of accuracy. Notably, Whisper is released under the open-source MIT license, allowing its integration into other services.
We will demonstrate Whisper's practical applications in digital library services, focusing on the TIB AV-Portal (av.tib.eu) and the Serbian portal zavicajna.digitalna.rs. We will show how Whisper improves the searchability and findability of content, strengthens the linking of named entities, generates video subtitles, and supports multilingual understanding and searching.
However, Whisper is not without its challenges. We will discuss 'hallucinations', where the model generates text that was never spoken in the audio, often as nonsensical repetitive loops. We will also address the problem that less represented languages in the training data, such as Serbian, often yield suboptimal results. Strategies to mitigate these issues will be explored, such as filtering out transcripts with excessive repetitive loops and fine-tuning the model to enhance the accuracy of Serbian texts.
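One simple way to detect hallucination loops of the kind described above is to check how often the same segment text recurs in a transcript. The sketch below is an illustrative heuristic, not the filter actually used by the portals; the function name and the `max_ratio` threshold are our own choices.

```python
from collections import Counter

def has_repetitive_loop(segment_texts, max_ratio=0.3):
    """Flag a transcript whose segments repeat excessively,
    a typical symptom of a Whisper hallucination loop.

    `segment_texts` is a list of segment strings (e.g. the 'text'
    fields of Whisper's output segments). The 0.3 threshold is an
    illustrative value; a real pipeline would tune it on its data.
    """
    texts = [t.strip().lower() for t in segment_texts if t.strip()]
    if len(texts) < 4:
        # Too short to judge; keep the transcript.
        return False
    # Share of the transcript taken up by the single most common line.
    most_common_count = Counter(texts).most_common(1)[0][1]
    return most_common_count / len(texts) > max_ratio
```

A transcript dominated by one repeated phrase (a classic Whisper failure mode on silence or music) would be flagged for review or exclusion, while ordinary varied speech passes through.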
Concluding the session, the webinar will introduce Subtitle Edit, an open-source video subtitle editor. This tool can create transcripts with both official and fine-tuned Whisper models, even on low-resource computers. Participants will learn how to use Whisper for speech recognition and then import the resulting texts into Subtitle Edit for further refinement and processing.
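The hand-off from Whisper to Subtitle Edit typically goes through a subtitle file such as SRT, which Subtitle Edit opens directly. As a minimal sketch, the converter below turns Whisper-style segments (dictionaries with `start`, `end`, and `text` keys, as returned by Whisper's `transcribe` output) into SRT text; the function names are our own, and this is one possible bridge, not the webinar's prescribed tooling.

```python
def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def segments_to_srt(segments):
    """Render Whisper-style segments as an SRT document that
    Subtitle Edit can open for refinement."""
    blocks = []
    for i, seg in enumerate(segments, start=1):
        blocks.append(
            f"{i}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    return "\n".join(blocks)
```

Saving the returned string to a `.srt` file makes the transcript ready for correction and timing adjustments in Subtitle Edit.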