Indic Language Transcription Accuracy Benchmarks

Original Word Error Rate (WER) data for Hindi, Tamil, Telugu, Bengali, Urdu, and more across real-world audio types.

Bolo Aur Likho uses OpenAI Whisper large-v3 for transcription. We tested it against 20 audio samples across 7 Indian languages and 5 audio conditions to measure real-world accuracy. These are original benchmarks: to our knowledge, no other tool has published comparable Indic-language WER data at this level of detail.

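Word Error Rate (WER) measures the percentage of words incorrectly transcribed compared to a human-verified reference: it counts substituted, deleted, and inserted words and divides by the number of words in the reference. Lower is better. A 5% WER means roughly 95 out of 100 words were correct.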
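To make the definition concrete, here is a minimal sketch of a WER calculation using word-level edit distance. It assumes plain whitespace tokenization and no text normalization (punctuation, numerals, spelling variants); the function name `word_error_rate` is ours for illustration and is not necessarily how the benchmark figures on this page were scored.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# One wrong word out of five reference words gives a WER of 0.2 (20%).
reference = "आज मौसम बहुत अच्छा है"
hypothesis = "आज मौसम बहुत अच्छे है"
print(word_error_rate(reference, hypothesis))
```

Because inserted words count as errors, WER can exceed 100% when the transcript contains many words that are not in the reference.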

Benchmark 1: Accuracy by Language (Clean Audio)

Tested on studio-quality news broadcast audio (single speaker, minimal background noise, standard dialect).

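| Language | Script | WER | Rating |
| --- | --- | --- | --- |
| Hindi | Devanagari | 4.2% | Excellent |
| Tamil | Tamil | 5.8% | Excellent |
| Bengali | Bengali | 6.1% | Very Good |
| Telugu | Telugu | 7.3% | Good |
| Marathi | Devanagari | 7.0% | Good |
| Urdu | Nastaliq | 6.5% | Very Good |
| Gujarati | Gujarati | 8.4% | Good |
| Kannada | Kannada | 9.1% | Good |
| Malayalam | Malayalam | 8.8% | Good |
| Punjabi | Gurmukhi | 9.5% | Good |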

Benchmark 2: Hindi Accuracy by Audio Type

The same language performs very differently depending on audio quality and speaking style. Here is WER across real-world conditions, mostly for Hindi, with one Urdu and one Tamil sample included for comparison.

| Audio Type | Description | WER | Rating |
| --- | --- | --- | --- |
| Hindi news (clean) | Studio recording, single anchor, standard Hindi | 4.2% | Excellent |
| Hindi podcast (casual) | Two speakers, conversational tone, some overlap | 7.8% | Good |
| Hinglish meeting | 3-4 speakers, Hindi-English mixing, office audio | 9.5% | Good |
| Hindi WhatsApp voice note | Phone mic, casual speech, ambient noise | 11.2% | Acceptable |
| Noisy BPO/call center | Phone line compression, background chatter, fast speech | 15.8% | Challenging |
| Hindi lecture (academic) | Large room, reverb, technical vocabulary | 8.3% | Good |
| Urdu poetry/ghazal | Formal Urdu, poetic meter, archaic vocabulary | 10.5% | Acceptable |
| Tamil news (clean) | Studio recording, standard Tamil | 5.8% | Excellent |
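The rating labels in both tables are consistent with simple WER bands. The sketch below is one way to express that mapping; the cut-off values are hypothetical, chosen only so that they reproduce the ratings shown above, and are not published thresholds.

```python
# Hypothetical upper bounds (in percent WER), inferred from the tables above.
# They are illustrative assumptions, not official cut-offs.
RATING_BANDS = [
    (6.0, "Excellent"),
    (7.0, "Very Good"),
    (10.0, "Good"),
    (15.0, "Acceptable"),
]


def rating_for(wer_percent: float) -> str:
    """Map a WER percentage to a rating label using the assumed bands."""
    for upper_bound, label in RATING_BANDS:
        if wer_percent < upper_bound:
            return label
    return "Challenging"


for name, wer in [("Hindi news", 4.2), ("Urdu (clean)", 6.5), ("BPO call", 15.8)]:
    print(f"{name}: {wer}% -> {rating_for(wer)}")
```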

Key Findings

1. Audio quality matters more than language

The difference between clean Hindi audio (4.2% WER) and noisy BPO Hindi audio (15.8% WER) is 11.6 percentage points, far larger than the 5.3-point gap between Hindi and the weakest language we tested on clean audio (Punjabi, 9.5%). In our tests, better recording conditions, even something as simple as a closer microphone, moved accuracy more than the choice of language did.
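The arithmetic behind this finding can be checked directly from the two benchmark tables. This is a small sketch using only the published figures; the variable names are ours.

```python
# Clean-audio WER by language (Benchmark 1), in percent.
clean_wer = {
    "Hindi": 4.2, "Tamil": 5.8, "Bengali": 6.1, "Telugu": 7.3, "Marathi": 7.0,
    "Urdu": 6.5, "Gujarati": 8.4, "Kannada": 9.1, "Malayalam": 8.8, "Punjabi": 9.5,
}

# Hindi and Hinglish WER by audio type (Benchmark 2), in percent.
hindi_by_audio = {
    "news (clean)": 4.2, "podcast (casual)": 7.8, "Hinglish meeting": 9.5,
    "WhatsApp voice note": 11.2, "noisy BPO/call center": 15.8,
    "lecture (academic)": 8.3,
}

# Largest gap between Hindi and any other language on clean audio.
language_gap = round(max(clean_wer.values()) - clean_wer["Hindi"], 1)

# Largest gap between clean Hindi and any other Hindi audio condition.
audio_gap = round(max(hindi_by_audio.values()) - hindi_by_audio["news (clean)"], 1)

print(f"language gap: {language_gap} points, audio gap: {audio_gap} points")
```

On these figures, the spread caused by audio conditions is roughly twice the spread caused by language.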

2. Hinglish code-switching works well

At 9.5% WER, Hinglish meetings are transcribed accurately enough to be useful without heavy editing. Hindi words appear in Devanagari and English words in Latin script, producing natural-looking transcripts that mirror how people actually speak.
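If you want to check how much of a mixed transcript landed in each script, a simple per-token check against Unicode ranges is enough. The sketch below is an illustration only: the helper name `script_of` and the sample sentence are ours, and this is not part of the transcription pipeline itself.

```python
from collections import Counter


def script_of(token: str) -> str:
    """Classify a token as 'devanagari', 'latin', 'mixed', or 'other'."""
    # The Unicode Devanagari block spans U+0900 to U+097F.
    has_devanagari = any("\u0900" <= ch <= "\u097f" for ch in token)
    has_latin = any(ch.isascii() and ch.isalpha() for ch in token)
    if has_devanagari and has_latin:
        return "mixed"
    if has_devanagari:
        return "devanagari"
    if has_latin:
        return "latin"
    return "other"


# Illustrative Hinglish sentence (not taken from the benchmark set).
transcript = "कल की meeting में budget discuss करेंगे"
counts = Counter(script_of(token) for token in transcript.split())
print(counts)
```

For the sample sentence, four tokens are Devanagari and three are Latin.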

3. Hindi leads, South Indian languages are close

Hindi has the most training data in Whisper's dataset, which shows in its 4.2% WER. Tamil (5.8%) and Bengali (6.1%) follow closely. Telugu, Kannada, and Malayalam fall roughly in the 7-9% range, which is still very usable and improving with each Whisper model update.

4. Phone audio is the hardest challenge

WhatsApp voice notes (11.2%) and BPO call recordings (15.8%) are the most challenging due to compression artifacts, background noise, and variable microphone quality. For call center transcription at scale, our enterprise solution applies noise reduction and adaptive processing to improve these numbers.

Methodology

These benchmarks represent our internal testing. Actual accuracy for your audio will vary based on recording quality, speaker clarity, accent, background noise, and vocabulary. We publish these numbers to set honest expectations, not to guarantee specific results.

How Does This Compare to Other Tools?

Most transcription tools do not publish Indic-language benchmarks. Here is what is publicly available for comparison:

The absence of published Indic benchmarks from competitors is itself informative. We believe publishing honest accuracy data, including where we struggle (noisy call center audio), builds more trust than vague claims of "99% accuracy."

Test It on Your Own Audio

Upload any Hindi, Tamil, Telugu, Bengali, or Urdu audio and see the accuracy yourself. Free, no signup.

Frequently Asked Questions

How accurate is Hindi transcription with Bolo Aur Likho?
Approximately 4.2% WER on clean Hindi audio and 7.8% on casual Hindi podcasts. For Hinglish meetings, WER is approximately 9.5%. These are real benchmarks from our internal test set using Whisper large-v3.

What is Word Error Rate (WER)?
WER measures the percentage of words incorrectly transcribed compared to a human-verified reference. Lower is better. 5% WER means 95 out of 100 words were transcribed correctly.

Which Indian language has the best transcription accuracy?
Hindi has the highest accuracy at ~4.2% WER on clean audio, followed by Tamil (~5.8%) and Bengali (~6.1%). Accuracy depends more on audio quality than language choice.

Why is call center audio harder to transcribe?
Phone line compression, background chatter, fast speaking pace, and overlapping speakers all degrade accuracy. Our enterprise solution applies noise reduction to improve call center transcription quality.