Powering virtual meetings with Speech to Text AI: Claap's success story with Gladia

Published on Sep 2024

A case study showcasing the benefits of Gladia's AI API for Claap, an all-in-one video workspace that implemented our solution to provide its international users with advanced video transcription capabilities.

Meet Claap

Claap is a leading all-in-one video workspace that helps organizations speed up decision-making by reducing back-to-back virtual meetings and endless messaging threads. Claap unifies meeting recording, screen recording, and a video wiki into a single collaborative workspace, unlocking a new level of alignment, transparency, and efficiency.

Founded in France in 2021, the company serves international customers like Revolut, Kavak, and Qonto, and helps them solve their remote collaboration challenges.

• 14 employees

• 5,000+ customer workspaces

• Global user base: French, English (US/UK), Spanish, German, Dutch, Danish, Chinese, Hindi, Portuguese, Italian, and other languages in use

Claap's platform with built-in transcription, powered by Gladia
Claap’s team knew that in order to realize their vision, they needed to make video content a truly intuitive medium of information. This, in turn, meant that they had to make video content easy to summarize, categorize, search, and digest. A high-quality transcription tool was therefore crucial to their roadmap.

Challenge

Claap’s Co-founder and CTO, Thomas Hernandez, needed a highly accurate, fast, and easy-to-implement multilingual solution both for core transcription and for video intelligence add-ons.

While the Claap team considered building the solution in-house using a layer of add-ons on top of an open-source product, they felt that transcription was becoming enough of a commodity that a specialized API provider would be a better fit, thanks to reduced time-to-market and lower infrastructure costs.

The issue they encountered, however, was that the incumbent providers did not meet the near-perfect quality they were looking for, as they were primarily US-centric, with lower quality for non-native English speakers. They were also either too slow, or prohibitively expensive.

Objectives

To deploy a high-quality, scalable video transcription API and audio intelligence add-ons for all of Claap's customers around the globe.

Specifications

✔️ High-quality transcription at a scalable cost, so that it could be deployed in every user space.

✔️ A solution adapted to speakers from multiple geographies, with different languages and regional accents.

✔️ A transcription technology that is fast enough to add a valuable layer of insight to user videos almost immediately.

Solution

With Gladia, the Claap team was able to implement:

  • A highly accurate transcription solution, registering a word error rate (WER) between 1% and 3%.
  • Transcription at blazing speed, with one hour of video transcribed in under 60 seconds on average.
  • A truly multilingual approach to transcription, with core features and add-ons all available in 99+ languages.
  • A quick implementation and iteration cycle, with fixes and features released every week, as well as weekly client support.
  • Audio intelligence features such as speaker diarization, word-level timestamps, and translation, enabling Claap to make content more accessible for their international users.
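
For readers unfamiliar with the metric, WER is conventionally computed as the number of word-level substitutions, deletions, and insertions needed to turn the transcript into a reference text, divided by the number of words in the reference. The sketch below illustrates that standard definition only. It is not Gladia's or Claap's evaluation code, the function name `word_error_rate` is a hypothetical helper, and the lowercase-and-split normalization is a simplifying assumption (real evaluations usually normalize punctuation and numbers as well).

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative helper (assumption): tokens are lowercase,
    whitespace-separated words with no further normalization.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


reference = "the quick brown fox jumps over the lazy dog"
hypothesis = "the quick brown fox jumped over a lazy dog"
# Two substituted words out of nine reference words.
print(f"WER: {word_error_rate(reference, hypothesis):.1%}")
```

On this scale, a WER between 1% and 3% means roughly one to three word errors per hundred reference words. By similar arithmetic, transcribing one hour of video (3,600 seconds) in under 60 seconds means processing takes less than about 1.7% of the media duration.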

Thanks to our API, Claap was able to unlock the following AI-powered functionalities for its users:

  • Synced playback: follow along with a video transcript while you listen to the recording;
  • Speaker detection: automatically detect multiple speakers to jump to the right moment;
  • Search within video: search the full-text transcript and find the exact moment you’re looking for;
  • Add comments while reading the transcript: easily add comments related to a specific timestamp.
Speaker diarization as seen inside Claap
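
To make the link between word-level timestamps, speaker labels, and features like search within video more concrete, here is a minimal sketch of phrase search over a timestamped transcript. Everything in it is an assumption made for illustration: the `Word` record, its field names, and the `find_phrase` helper are hypothetical and do not reflect Gladia's response schema or Claap's implementation.

```python
from __future__ import annotations

import string
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """Hypothetical transcript word with timing and speaker label."""
    text: str
    start: float  # seconds from the start of the recording
    end: float
    speaker: int


def _normalize(token: str) -> str:
    # Lowercase and strip leading/trailing punctuation so "roadmap." matches "roadmap".
    return token.lower().strip(string.punctuation)


def find_phrase(words: list[Word], phrase: str) -> list[Word]:
    """Return the first word of every occurrence of `phrase` in the transcript."""
    target = [t for t in (_normalize(p) for p in phrase.split()) if t]
    if not target:
        return []
    tokens = [_normalize(w.text) for w in words]
    hits = []
    for i in range(len(tokens) - len(target) + 1):
        if tokens[i:i + len(target)] == target:
            hits.append(words[i])
    return hits


example_words = [
    Word("Let's", 0.0, 0.3, 0), Word("review", 0.3, 0.7, 0),
    Word("the", 0.7, 0.8, 0), Word("roadmap.", 0.8, 1.4, 0),
    Word("The", 2.0, 2.1, 1), Word("roadmap", 2.1, 2.6, 1),
    Word("looks", 2.6, 2.9, 1), Word("good.", 2.9, 3.3, 1),
]

for hit in find_phrase(example_words, "the roadmap"):
    # A player could seek to hit.start and highlight the matching speaker.
    print(f"speaker {hit.speaker} at {hit.start:.1f}s")
```

Each hit carries a start time a player could seek to, and the same timestamps could anchor a comment to a specific moment or drive synced transcript highlighting during playback.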

Impact

After deciding to add audio transcription to their roadmap, Claap started testing the API immediately. By working with the Gladia team to iterate and scale up, they saw a noticeable impact on their own customers, from users praising the quality of the transcription to prospects converting specifically after trying it out.

In a nutshell, Claap's case study illustrates the many possibilities that audio AI unlocks for businesses aiming to improve user experience, boost retention, and upgrade their core product functionality, all with a single turnkey API.

We're thrilled to have been part of this amazing journey, and we owe a huge thanks to our client for putting their trust in us. As we move forward, we're excited to team up with more clients, tackle new challenges, and make AI more accessible to companies worldwide.

Having read this case study, do you feel like Gladia can be the right fit for your business too?

Don't hesitate to contact our sales team to discuss this in more detail. Beyond virtual meetings, we cater to a range of use cases, including content & media, call centers, workspace collaboration, and more.
