Powering virtual meetings with Speech to Text AI: Claap's success story with Gladia

Published on Jun 25, 2023
A case study showcasing the benefits of Gladia's AI API for Claap, an all-in-one video workspace that implemented our solution to provide its international users with advanced video transcription capabilities.

Meet Claap

Claap is a leading all-in-one video workspace that helps organizations speed up decision-making by reducing back-to-back virtual meetings and endless messaging threads. Claap unifies meeting recording, screen recording, and a video wiki into a single collaborative workspace, unlocking a new level of alignment, transparency, and efficiency.

Founded in France in 2021, the company serves international customers like Revolut, Kavak, and Qonto, and helps them solve their remote collaboration challenges.

• 14 employees

• 5,000+ customer workspaces

• Global user base: French, English (US/UK), Spanish, German, Dutch, Danish, Chinese, Hindi, Portuguese, Italian, and other languages in use

Claap's platform with built-in transcription, powered by Gladia

Claap’s team knew that in order to realize their vision, they needed to make video content a truly intuitive medium of information. This, in turn, meant that they had to make video content easy to summarize, categorize, search, and digest. A high-quality transcription tool was therefore crucial to their roadmap.

Challenge

Claap’s Co-founder and CTO, Thomas Hernandez, needed a highly accurate, fast, and easy-to-implement multilingual solution both for core transcription and for video intelligence add-ons.

While the Claap team considered building the solution in-house using a layer of add-ons on top of an open-source product, they felt that transcription was becoming enough of a commodity that a specialized API provider would be a better fit, thanks to reduced time-to-market and lower infrastructure costs.

The issue they encountered, however, was that the incumbent providers did not deliver the near-perfect quality they were looking for: most were primarily US-centric, with lower accuracy for non-native English speakers. They were also either too slow or prohibitively expensive.

Objectives

To deploy a high-quality, scalable video transcription API and audio intelligence add-ons for all of Claap's customers around the globe.

Specifications

✔️ High-quality transcription at a scalable cost, so that it could be deployed in every user space.

✔️ A solution adapted to speakers from multiple geographies, with different languages and regional accents.

✔️ A transcription technology that is fast enough to add a valuable layer of insight to user videos almost immediately.

Solution

With Gladia, the Claap team was able to implement:

  • A highly accurate transcription solution, with a word error rate (WER) between 1% and 3%.
  • Transcription at blazing speed, with one hour of video transcribed in under 60 seconds on average.
  • A truly multilingual approach to transcription, with core features and add-ons all available in 99+ languages.
  • A quick implementation and iteration cycle, with fixes and features released every week, as well as weekly client support.
  • Audio intelligence features such as speaker diarization, word-level timestamps, and translation, enabling Claap to make content more accessible for their international users.
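
For readers unfamiliar with the metric, word error rate (WER) is commonly defined as the number of word-level substitutions, deletions, and insertions needed to turn the transcript into the reference text, divided by the number of words in the reference. The sketch below is a minimal illustration of that standard definition, not a description of how the figures above were measured. The function name `word_error_rate` and the simple lowercase-and-split normalization are assumptions made for this example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative only: normalization here is just lowercasing and
    whitespace splitting, which real evaluations usually refine.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    reference = "the quick brown fox"
    hypothesis = "the quick brown dog"
    print(f"WER: {word_error_rate(reference, hypothesis):.2%}")
```

Under this definition, a WER of 2% means roughly two word errors for every 100 words spoken.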

Thanks to our API, Claap was able to unlock the following AI-powered functionalities for its users:

  • Synced playback: follow along with the video transcript while you listen to the recording;
  • Speaker detection: automatically detect multiple speakers and jump to the right moment;
  • Search within video: search the full-text transcript and find the exact moment you're looking for;
  • Comments on the transcript: easily add comments tied to a specific timestamp.
Speaker diarization as seen inside Claap
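
To illustrate how word-level timestamps and speaker labels can support a feature like search within video, here is a minimal sketch. It is not Claap's implementation or Gladia's response format: the `Word` record, its field names, and the `find_moments` helper are all hypothetical, chosen only to show the idea of mapping a text match back to a playback position.

```python
from dataclasses import dataclass
from typing import List, Tuple

PUNCTUATION = ".,!?;:"


@dataclass
class Word:
    """Hypothetical transcript word with timing and speaker label."""
    text: str
    start: float  # seconds from the beginning of the recording
    end: float
    speaker: int


def find_moments(words: List[Word], query: str) -> List[Tuple[float, int]]:
    """Return (start_time, speaker) for each occurrence of the query phrase."""
    terms = [t.strip(PUNCTUATION).lower() for t in query.split()]
    terms = [t for t in terms if t]
    if not terms:
        return []

    normalized = [w.text.strip(PUNCTUATION).lower() for w in words]
    hits = []
    # Slide a window the size of the query over the transcript words.
    for i in range(len(words) - len(terms) + 1):
        if normalized[i:i + len(terms)] == terms:
            hits.append((words[i].start, words[i].speaker))
    return hits


if __name__ == "__main__":
    transcript = [
        Word("Let's", 0.0, 0.3, 0),
        Word("ship", 0.3, 0.6, 0),
        Word("it", 0.6, 0.8, 0),
        Word("Ship", 5.0, 5.3, 1),
        Word("it.", 5.3, 5.5, 1),
    ]
    for start, speaker in find_moments(transcript, "ship it"):
        print(f"Speaker {speaker} at {start:.1f}s")
```

Each hit carries a start time, so a player could seek straight to that moment; the same timestamps could also anchor comments or drive synced playback.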

Impact

After deciding to add audio transcription to their roadmap, Claap started testing the API immediately. By working with the Gladia team to iterate and scale up, they saw a noticeable impact on their own customers, from users praising the quality of the transcription to prospects converting specifically after trying it out.

In a nutshell, Claap's case study illustrates the many possibilities that audio AI unlocks for businesses aiming to improve user experience, boost retention, and upgrade their core product functionalities, all with a single turnkey API.

We're thrilled to have been part of this amazing journey, and we owe a huge thanks to our client for putting their trust in us. As we move forward, we're excited to team up with more clients, tackle new challenges, and make AI more accessible to companies worldwide.

Having read this case study, do you feel like Gladia can be the right fit for your business too?

Don't hesitate to contact our sales team to discuss this in more detail. Beyond virtual meetings, we cater to a range of use cases, including content & media, call centers, workspace collaboration, and more.

