Amplify: audio transcription
Learn more about the Amplify transcription platform, how you can get involved, and the resources and factors to consider to make Amplify successful for your library.
Table of contents
INTRODUCTION
- What is Amplify?
- Why is audio transcription important?
- How can my library get involved?
- How can I make Amplify successful for my library?
MANAGING COLLECTIONS
- What audio collections are best suited to Amplify?
- File requirements
- Copyright, privacy and censorship
ENGAGING COMMUNITY PARTICIPATION
- Promotional and marketing techniques
- Volunteer management
- Collaboration with community groups/partners
COMMUNITY OF PRACTICE
- Benefits & expectations
- Peer to peer support
- Sharing local experiences and strategies
EXIT STRATEGY AND LMS BEST PRACTICE
- Getting your material off Amplify
- Finding a permanent "home" for access to your transcripts
Introduction
What is Amplify?
Amplify is a crowdsourcing platform where libraries can temporarily place digital audio materials from their collections paired with machine-generated transcripts to engage people in the community to help correct any errors in the transcripts. Once completed, your audio materials and newly corrected transcripts can be moved off Amplify to a permanent location of your choice, such as your library management system or library website.
The aim of Amplify is to promote access to rich cultural collections from across New South Wales, as well as to provide an engaging and user-friendly tool to create and correct transcripts at low cost. Amplify not only provides online access to previously inaccessible material, but also creates an opportunity for libraries to engage with communities of interest in correcting any errors or gaps in the machine-generated material. It is a great way for researchers, volunteers and the general public to get ‘hands-on’ with cultural heritage and listen in depth to the stories of NSW and Australian history.
Why is audio transcription important?
If you have audio material in your library's collection, it is very likely that there are users of your library service who want to access it - ideally online! As well as digitising your audio collections for the purpose of preservation and access, providing text-based transcripts is an important way to enrich the value and usefulness of your local studies material.
Improving the accessibility of your collections
Providing access to transcripts for your audio material ensures that your collections can be accessed by all members of your community. All government institutions should be committed to making any content they publish online accessible to all users, which includes members of your community who may access material in alternative ways, such as with the use of screen readers or other assistive technologies. The Web Content Accessibility Guidelines (WCAG) set out recommendations and criteria that Australian government institutions, including libraries, are legislated to follow under the Disability Discrimination Act 1992 (Cth). The guidelines make clear that all multimedia content, including audio and video material, should be published online with an accompanying 'text-based alternative' in order to meet baseline accessibility requirements.
Improving your research experience
As well as meeting accessibility requirements, providing text-based transcripts of your audio collections will ensure an easier and more productive research experience for users of your collections. Many researchers prefer to browse transcripts rather than listen to hours of audio material. Transcripts make it much easier to browse key themes and topics within a recording, as well as search for and identify named places and people. Transcripts can also make your audio collections more discoverable by being indexed by search engines such as your library catalogue or Google, making it easier for users to find relevant material.
How can my library get involved?
After a successful pilot project in partnership with selected public libraries in 2018, the State Library is now offering the use of Amplify to eligible NSW public libraries.
By joining the Amplify platform, your library will be set up with its own presence to host your digitised oral history collections, making them available for listening and transcription to the public. As well as complimentary training, you will also receive membership to the Amplify Community of Practice, an online hub providing support, platform updates, resources, best practice advice and ongoing peer-to-peer communication in the use of Amplify. Learn more about how your library can access Amplify for your own oral history collections.
How can I make Amplify successful for my library?
Amplify provides a collaborative and cost-effective way to create transcripts for your audio collections. Using a third-party machine-learning service, Amplify creates computer-generated transcripts for every audio file you upload. These computer-generated transcripts often include errors, so Amplify acts as a crowdsourcing platform where you can engage your local community to assist in the review and correction of your transcripts.
While much of the platform is technology driven, Amplify is not a platform where you can 'set and forget' your audio collections. Rather, Amplify is a temporary tool for transcription and community engagement. There are several factors to consider to ensure Amplify is a success for your library.
Digitised collections
You must have digital - either digitised or born digital - audio files prepared prior to commencing your use of Amplify. Ideally, you will also have access to digital image collections to accompany your audio material.
Staff resourcing
Like any other digital content platform, Amplify requires ongoing resourcing. Tasks will include preparing, managing and publishing content, community engagement, transcription moderation and participation in the Amplify Community of Practice. Achieving corrected and completed transcripts for your audio collections will not be possible without dedicated and sustained resourcing from your team.
Community engagement
Engaging with your stakeholders is a critical component to making Amplify successful for your library. Your local community is the most likely to be invested in your audio material, and therefore the most likely to participate in the review and correction of transcripts. Sustaining this community interest will be imperative to reaching your goal of completed transcripts within a reasonable timeframe. See the Engaging Community Participation section for more details about potential community engagement strategies.
Managing collections
What audio collections are best suited to Amplify?
Amplify is designed to cater to oral history recordings with any number of people speaking clearly. The higher the quality of the recording (minimal background noise, good-quality microphones, and clear, audible enunciation by speakers), the better the machine-generated transcript is likely to be. A poorer-quality recording from your collection can still be used on Amplify, but be aware that it will produce a poorer-quality machine-generated transcript, which will require more correction and therefore take longer to complete.
The best results generally come from recordings of interviews and conversations with only two people. Group discussions can also be used, but it is often hard to clearly identify individual speakers within a recording, and multiple voices often result in a less accurate machine-generated transcript.
Amplify cannot be used for any songs or other musical recordings. Poetry readings are acceptable.
You can learn more about the types of collections and content that are most suitable to Amplify by reading our case studies.
Languages other than English
Amplify has limited capacity to transcribe recordings in languages other than English, due to limitations of the third-party machine transcription service used.
Recordings in the following languages can be uploaded: French, German, Italian, Dutch, Spanish and Portuguese. Please note that transcripts of recordings in these languages will not be translated into English, so the transcripts on Amplify must be in the same language as the recording itself. Audio collections of this nature could provide a great engagement opportunity with groups in your community who speak these languages.
For recordings that mix languages, for example English and German spoken within the one interview, the transcript will only be in English, meaning the machine service will have attempted to transcribe anything spoken in German as English. These types of transcripts will definitely contain errors that need careful attention to correct.
File requirements
Audio files
Audio file types acceptable for uploading to Amplify are mp3, mp4 or m4a formats. If your digital audio files are in WAV format for the purpose of preservation, you will need to create an access derivative from these master files in one of the three acceptable formats noted above.
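As a rough illustration of the format rule above, the following Python sketch flags files that need an access derivative and builds a conversion command for them. It is not part of Amplify itself: the function names are hypothetical, and the use of the ffmpeg command-line tool (with its `-i` and `-b:a` options) is an assumption about what you have available locally. The command is only assembled, not run.

```python
from pathlib import Path

# Upload formats listed in this guide.
ACCEPTED_AUDIO = {".mp3", ".mp4", ".m4a"}


def needs_derivative(filename: str) -> bool:
    """Return True when the file is not in an accepted upload format."""
    return Path(filename).suffix.lower() not in ACCEPTED_AUDIO


def ffmpeg_command(master: str, bitrate_kbps: int = 96) -> list[str]:
    """Build (but do not run) an ffmpeg command that writes an mp3
    access derivative alongside the master file.

    Assumption: ffmpeg is installed; any other conversion tool would do.
    """
    access = str(Path(master).with_suffix(".mp3"))
    return ["ffmpeg", "-i", master, "-b:a", f"{bitrate_kbps}k", access]


if needs_derivative("smith_master.wav"):
    print(" ".join(ffmpeg_command("smith_master.wav")))
```

The master file is left untouched, so your preservation copy stays in WAV format while the derivative is what you upload.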
The bitrate of your audio files is important because this will impact the quality of your machine-generated transcript. The recommended bitrate for all audio files is 96kbps (kilobits per second). Anything lower than this may decrease the accuracy of your transcript, and anything higher than 96kbps will have minimal impact on improving your transcript quality, but will create much larger file sizes which will use up more of the allocated storage space unnecessarily.
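To see why higher bitrates use up storage quickly, file size can be estimated as bitrate multiplied by duration. The short Python sketch below works through that arithmetic for a one-hour recording. The function name is hypothetical, and the figures are approximations that ignore container overhead and variable-bitrate encoding.

```python
def estimated_size_mb(duration_minutes: float, bitrate_kbps: float) -> float:
    """Approximate audio file size in megabytes (1 MB = 1,000,000 bytes)."""
    bits = bitrate_kbps * 1000 * duration_minutes * 60
    return bits / 8 / 1_000_000


# Compare the recommended bitrate with two higher ones for a 60-minute interview.
for kbps in (96, 192, 320):
    print(f"{kbps} kbps, 60 min: about {estimated_size_mb(60, kbps):.1f} MB")
```

At 96 kbps a one-hour recording comes to roughly 43 MB, while the same recording at 320 kbps is roughly 144 MB, more than three times the storage.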
Image files
Image files paired with your audio collections can be uploaded in JPEG or PNG format. Images should be at a minimum resolution of 72dpi (dots per inch) and have a minimum width of 1000 pixels, ensuring that images will be high enough quality at any screen size.
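The image requirements above can be expressed as a simple pre-upload check. The Python sketch below is illustrative only: the function name is hypothetical, and it assumes you already know each image's pixel width and resolution (for example, from your image editor or file properties) rather than reading them from the file.

```python
from pathlib import Path

# Requirements taken from this guide.
ACCEPTED_IMAGES = {".jpg", ".jpeg", ".png"}
MIN_WIDTH_PX = 1000
MIN_DPI = 72


def image_problems(filename: str, width_px: int, dpi: int) -> list[str]:
    """Return a list of reasons the image does not meet the requirements.

    An empty list means the image passes all three checks.
    """
    problems = []
    if Path(filename).suffix.lower() not in ACCEPTED_IMAGES:
        problems.append("format must be JPEG or PNG")
    if width_px < MIN_WIDTH_PX:
        problems.append(f"width {width_px}px is below {MIN_WIDTH_PX}px")
    if dpi < MIN_DPI:
        problems.append(f"resolution {dpi}dpi is below {MIN_DPI}dpi")
    return problems


print(image_problems("portrait.png", 1600, 72))
print(image_problems("scan.tiff", 800, 72))
```

Returning a list of reasons, rather than a simple yes or no, makes it easier to tell staff exactly what to fix before uploading.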
Copyright, privacy and censorship
Copyright and privacy of collection materials can be complex issues. It is your responsibility to ensure that all collections you publish online are either out of copyright or that permission to make them available online has been sought from the copyright holder. You are also responsible for ensuring that any audio files or images you upload do not breach any privacy restrictions that may be applicable to your collections.
The State Library of NSW recommends taking a risk management approach to managing copyright, especially when the copyright status of a particular collection is not known, or there are privacy issues and any concerns about sensitive subject material.
To determine the level of risk, a risk management matrix may assess factors like collection age, subject matter, historical context, cultural sensitivities, and how the material was acquired. When a collection is considered low to medium risk, it is still suitable for publishing on Amplify, supported by a takedown policy. The State Library of NSW has a standard takedown policy that applies to all material published on Amplify.
An example of a low-risk collection might be a series of interviews commissioned by your library, where you are the copyright holder and each speaker signed a rights agreement form. An example of a potentially high-risk collection might be a recently recorded series of interviews with local residents where the interviewees state their home address or other personal information.
It is also important to consider whether any collections you wish to upload to Amplify may contain any sensitive or potentially controversial subject matter. Culturally sensitive or defamatory content should be carefully considered before publishing online.
Engaging community participation
It is imperative for you to have strategies in place for engaging community participation in listening to and transcribing your audio collections. Your collections most likely focus on your local area or a specialised topic area related to your community, so your community is best placed to help transcribe these collections. There are a number of strategies that your library could use to promote engagement:
- Promote collections to your local community through regular communication channels such as social media, library and council websites and eNewsletters.
- Harness the passion and subject matter expertise within your local area by promoting specific collections to known community groups. For example, if you have a collection of interviews with volunteer firefighters, target the promotion of the collection to your local fire brigade Facebook group.
- Partner with local organisations such as historical societies and schools to recruit volunteer transcribers. School students may like to work together on transcribing a collection for a group project.
- Recruit regular onsite volunteers who have an interest in transcribing and who would like to support your library.
- Create a dedicated Amplify space in your library to encourage your visitors to listen to and transcribe your collections. All you need is a computer and a set of headphones.
- Participate in a local pop-up event promoting your collections on Amplify, taking a couple of laptops and headphones with you. You could run a 'transcribe-a-thon' event with a specific goal, like completing transcripts of 10 interviews by the end of the day.
- Contact interviewees and their families directly to let them know the collections are available. You are likely to have more participation in transcription from the community when they have a personal connection to the interviewee or subject matter.
Community of Practice
When signing up to use the Amplify platform, you will also join an online Community of Practice for all participating organisations. The Community of Practice is a space for all organisational users of Amplify to share resources, ask questions and troubleshoot issues, share success stories and advice, provide feedback, help prioritise feature development and provide peer-to-peer support. It is intended to ensure that all Amplify users get the most out of the platform for their audio collections, can learn from others by sharing local experiences and strategies, and can identify opportunities to collaborate whenever possible.
The Community of Practice online space includes features such as:
- Knowledge base
- Resources and documentation
- Monthly usage reports
- News and alerts
- Instant group chat
- Feature requests and development opportunities
- Shared calendar
Exit strategy and LMS best practice
Amplify is a tool for creating transcripts for your audio collections. It is not a permanent 'home' for the hosting of these collections. When you sign up to use Amplify, you are committing to actively using the platform to engage users to review and correct your transcripts in a timely manner. Once your transcripts have been completed, your collections should be removed from Amplify to make way for new collections to be added and transcribed.
There are multiple strategies you could use to encourage engagement with your collections. For example, some organisations might decide to upload an entire collection, including multiple files, at one time. This is a good way to promote a collection to your community and gives users the opportunity to gain a better understanding of the collection as a whole. This can be effective if you have a collection that is made up of a series of consecutive or related recordings rather than standalone files. The consequence of uploading collections with many files is that it will take longer for all transcripts to be reviewed and completed.
As an alternative example, some organisations may choose to upload only one or two recordings at any one time. This means focusing on the promotion of a single interview or recording with a singular topic area or subject matter. This may work well if you have a smaller number of files in your collection, or if you're hoping to get a particular transcript completed more quickly.
Whatever strategy you use to promote your collections and engage users with them, the aim is to have the audio material transcribed as quickly as possible so the completed files can be removed from Amplify to make way for new collections.
As a result, you will need to consider what solutions are in place to provide permanent access to your audio collections and transcripts. Will permanent access be provided through the library catalogue and/or a dedicated project website? Where will the final transcript files be stored? It is important to consider a workflow for ongoing access to your audio collections prior to using Amplify, to ensure Amplify is part of the overarching strategy related to the preservation and access of your collections.