Powering virtual meetings with Speech to Text AI: Claap's success story with Gladia
Published on Sep 2024
A case study showcasing the benefits of Gladia's AI API for Claap, an all-in-one video workspace that implemented our solution to provide its international users with advanced video transcription capabilities.
Meet Claap
Claap is a leading all-in-one video workspace that helps organizations speed up decision-making by reducing back-to-back virtual meetings and endless messaging threads. Claap unifies meeting recording, screen recording, and a video wiki into a single collaborative workspace, unlocking a new level of alignment, transparency, and efficiency.
Founded in France in 2021, the company serves international customers like Revolut, Kavak, and Qonto, and helps them solve their remote collaboration challenges.
• 14 employees
• 5,000+ customer workspaces
• Global user base: French, English (US/UK), Spanish, German, Dutch, Danish, Chinese, Hindi, Portuguese, Italian and other languages in use
Claap’s team knew that in order to realize their vision, they needed to make video content a truly intuitive medium of information. This, in turn, meant that they had to make video content easy to summarize, categorize, search, and digest. A high-quality transcription tool was therefore crucial to their roadmap.
Challenge
Claap’s Co-founder and CTO, Thomas Hernandez, needed a highly accurate, fast, and easy-to-implement multilingual solution both for core transcription and for video intelligence add-ons.
While the Claap team considered building the solution in-house using a layer of add-ons on top of an open-source product, they felt that transcription was becoming enough of a commodity that a specialized API provider would be a better fit, thanks to reduced time-to-market and lower infrastructure costs.
The issue they encountered, however, was that the incumbent providers did not deliver the near-perfect quality they were looking for: most were primarily US-centric, with lower accuracy for non-native English speakers. They were also either too slow or prohibitively expensive.
Objectives
To deploy a high-quality, scalable video transcription API and audio intelligence add-ons for all of Claap's customers around the globe.
Specifications
✔️ High-quality transcription at a scalable cost, so it could be deployed in every user space.
✔️ A solution adapted to speakers from multiple geographies, with different languages and regional accents.
✔️ A transcription technology that is fast enough to add a valuable layer of insight to user videos almost immediately.
Solution
With Gladia, the Claap team was able to implement:
A highly accurate transcription solution registering a word error rate (WER) between 1% and 3%.
Transcription at blazing speed, with one hour of video transcribed in under 60 seconds on average.
A truly multilingual approach to transcription, with core features and add-ons all available in 99+ languages.
A quick implementation and iteration cycle, with fixes and features released every week, as well as weekly client support.
Audio intelligence features such as speaker diarization, word-level timestamps, and translation, enabling Claap to make content more accessible for their international users.
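For readers unfamiliar with the metric, word error rate (WER) is the number of word substitutions, deletions, and insertions needed to turn a transcript into its reference text, divided by the number of words in the reference. The sketch below is a minimal, illustrative Python implementation. The function name `word_error_rate` and the simple lowercase-and-split normalization are assumptions made for this example, not the evaluation method behind the figures above.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    # Assumption: naive normalization (lowercase, whitespace split).
    # Real evaluations usually also normalize punctuation and numbers.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)
```

Under this definition, a WER between 1% and 3% corresponds to roughly one to three word errors per 100 reference words.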
Thanks to our API, Claap was able to unlock the following AI-powered functionalities for its users:
Synced playback: follow along with a video transcript while you listen to the recording;
Speaker detection: automatically detect multiple speakers to jump to the right moment;
Search within video: search the full-text transcript and find the exact moment you’re looking for;
Add comments while reading the transcript: easily add comments tied to a specific timestamp.
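To illustrate how word-level timestamps and speaker labels can support features like these, here is a minimal Python sketch. The `Word` structure, its field names, and the helper functions are hypothetical and chosen for this example only; they do not reflect Gladia's actual response schema or Claap's implementation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Word:
    # Hypothetical shape of one transcribed word with its timing and speaker.
    text: str
    start: float  # seconds from the beginning of the recording
    end: float
    speaker: int


def _normalize(token: str) -> str:
    """Lowercase a token and strip surrounding punctuation."""
    return token.lower().strip(".,!?;:\"'")


def find_phrase(words: List[Word], query: str) -> List[Word]:
    """Search within video: return the first word of every match of `query`."""
    target = [t for t in (_normalize(q) for q in query.split()) if t]
    if not target:
        return []
    tokens = [_normalize(w.text) for w in words]
    hits = []
    for i in range(len(tokens) - len(target) + 1):
        if tokens[i:i + len(target)] == target:
            hits.append(words[i])  # its .start is the moment to jump to
    return hits


def word_at(words: List[Word], t: float) -> Optional[Word]:
    """Synced playback: return the word being spoken at time `t`, if any."""
    for w in words:
        if w.start <= t < w.end:
            return w
    return None


# Made-up sample transcript with two speakers.
SAMPLE = [
    Word("Let's", 0.0, 0.3, 0), Word("review", 0.3, 0.7, 0),
    Word("the", 0.7, 0.8, 0), Word("roadmap.", 0.8, 1.4, 0),
    Word("The", 2.0, 2.2, 1), Word("roadmap", 2.2, 2.8, 1),
    Word("looks", 2.8, 3.1, 1), Word("good.", 3.1, 3.5, 1),
]

for hit in find_phrase(SAMPLE, "the roadmap"):
    print(f"speaker {hit.speaker} at {hit.start:.1f}s")
```

The same start times could anchor timestamped comments, and the speaker field could drive jump-to-speaker navigation.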
Impact
After deciding to add audio transcription to their roadmap, Claap started testing the API immediately. By working with the Gladia team to iterate and scale up, they saw a noticeable impact on their own customers, from users praising the quality of the transcription to prospects converting specifically after trying it out.
In a nutshell, Claap's case study illustrates the many possibilities that audio AI unlocks for businesses aiming to improve user experience, boost retention, and upgrade their core product functionalities, all with a single turnkey API.
We're thrilled to have been part of this amazing journey, and we owe a huge thanks to our client for putting their trust in us. As we move forward, we're excited to team up with more clients, tackle new challenges, and make AI more accessible to companies worldwide.
Having read this case study, do you feel like Gladia can be the right fit for your business too?