Blog

Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.

Tutorials

How to build a speaker identification system for recorded online meetings

Virtual meeting recordings are becoming increasingly used as a source of valuable business knowledge. However, given the large amount of audio data produced in meetings by companies, getting the full value out of recorded meetings can be tricky.

Speech-To-Text

Should you trust WER?

Word Error Rate (WER) is a metric that evaluates the performance of ASR systems by analyzing the accuracy of speech-to-text results. WER metric allows developers, scientists, and researchers to assess ASR performance. A lower WER indicates better ASR performance, and vice versa. The assessment allows for optimizing the ASR technologies over time and helps to compare speech-to-text models and providers for commercial use. 

Speech-To-Text

OpenAI Whisper vs Google Speech-to-Text vs Amazon Transcribe: The ASR rundown

Speech recognition models and APIs are crucial in building apps for various industries, including healthcare, customer service, online meetings, and entertainment.

Speech-To-Text

Best open-source speech-to-text models

Automatic speech recognition, also known as speech-to-text (STT), has been around for some decades, but the advances of the last two decades in both hardware and software, especially for artificial intelligence, made the technology more robust and accessible than ever before.

Case Studies

How Gladia's multilingual audio-to-text API supercharges Carv's AI for recruiters

In today's professional landscape, the average workday of a recruiter is characterized by a perpetual cycle of administrative tasks, alternated by intake calls with hiring managers and interviews with candidates. And while recruiters enjoy connecting with hiring managers and candidates, there’s an almost universal disdain for the administrative side of the job.

Speech-To-Text

How do ASR models work?

Automatic speech recognition (ASR) is a cornerstone of many business applications in domains ranging from call centers to smart device engineering. At their core, ASR models, also referred to as Speech-to-Text (STT), intelligently recognize human speech and convert it into a written format.