
Voice Message Transcription

This article explains how voice message transcription works in charles, where to enable it, and how transcripts can be used by human agents, flows, and AI agents.

Overview

Automatically convert voice messages from your contacts into readable text, making them faster to handle for human agents and usable as input for flows and AI agents.


When contacts send voice messages on WhatsApp or RCS, your team normally has to press play to understand what was said. Voice message transcription changes that: the transcript appears directly in the conversation alongside the original audio, so whoever handles the conversation (a human agent, a flow, or an AI agent) can read the message instead of having to listen to it.

What Is Voice Message Transcription?

Voice message transcription uses AI to convert incoming voice messages into plain text. The transcript is shown in the conversation thread next to the original audio recording.

It works for WhatsApp and RCS voice messages and is off by default. You can turn it on whenever you're ready.

⚠️ Voice message transcription uses AI. Transcripts may not be 100% accurate. See "AI Accuracy: What to Expect" below.

How to Enable Voice Message Transcription

You can manage the setting directly in charles:

  1. Go to Settings

  2. Click Configurations

  3. Find Voice message transcription and toggle it on

The setting applies to all conversations across your charles universe. There's no per-inbox or per-conversation toggle.

Where Transcripts Are Useful

Once enabled, a transcript appears wherever a voice message shows up in a conversation. Transcripts are useful in three places:

For Human Agents

Human agents can read what a contact sent at a glance without having to listen to the audio first. The original voice message stays in the conversation and can still be played at any time.

In Flows

Flows receive the transcript as plain text. Keyword matching, intents, and any other text-based logic in your journey flows can now respond to voice input the same way they respond to typed messages.

📝 Flows only receive the text transcript, never the audio file itself.
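
To illustrate why plain text matters here, the sketch below applies the same keyword check to a typed message and to a voice message transcript. It is only an illustration, not how charles flows are implemented: `IncomingMessage`, `match_keyword`, and the keyword list are hypothetical names invented for this example.

```python
import re
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class IncomingMessage:
    """Hypothetical message shape, not the charles data model."""
    kind: str            # "text" for typed messages, "voice" for voice messages
    text: Optional[str]  # typed text or the transcript; None if no transcript exists


def match_keyword(message: IncomingMessage, keywords: Sequence[str]) -> Optional[str]:
    """Return the first keyword that appears as a whole word, or None."""
    if not message.text:
        return None  # nothing to match, e.g. transcription is switched off
    words = set(re.findall(r"[a-z0-9]+", message.text.lower()))
    for keyword in keywords:
        if keyword.lower() in words:
            return keyword
    return None


typed = IncomingMessage(kind="text", text="Where is my order?")
voice = IncomingMessage(kind="voice", text="Hi, I wanted to ask about my order please")
untranscribed = IncomingMessage(kind="voice", text=None)

for msg in (typed, voice, untranscribed):
    print(msg.kind, "->", match_keyword(msg, ["refund", "order"]))
```

In this sketch the typed message and the transcribed voice message both match "order", while the voice message without a transcript matches nothing, which mirrors how text-based logic has nothing to work with when transcription is off.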

For AI Agents

AI agents receive the transcript as plain text and can reply to voice messages just like typed ones. Without transcription enabled, AI agents have no way to interpret voice input at all.

📝 AI agents only receive the text transcript, never the audio file itself.


AI Accuracy: What to Expect

Voice message transcription is powered by AI, and like any AI feature it isn't perfect. A few things worth knowing:

  1. Transcripts are not 100% accurate. Expect occasional errors, especially around brand names, product SKUs, addresses, or industry-specific terms.

  2. Quality depends on the audio. Background noise, strong accents, fast speech, and short or low-quality recordings can all reduce transcription quality.

  3. Always check the original audio for anything critical. If a transcript looks off, or if the message involves a complaint, legal matter, or payment detail, human agents can play the voice message before taking action. The original audio is always available alongside the transcript.
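
If your team wants a simple rule of thumb for point 3, the idea can be sketched as a check that flags transcripts mentioning sensitive topics so an agent knows to play the audio. This is a hypothetical helper, not a charles feature: `needs_audio_check` and the term list are assumptions made for illustration, and you would choose terms that fit your own business.

```python
# Illustrative terms only, loosely based on the examples above
# (complaints, legal matters, payment details).
CRITICAL_TERMS = ("complaint", "legal", "lawyer", "payment")


def needs_audio_check(transcript: str, critical_terms=CRITICAL_TERMS) -> bool:
    """Return True if the transcript mentions any critical term.

    Uses a case-insensitive substring match, so "payments" also
    matches "payment". A False result does not mean the transcript
    is accurate, only that no listed term was found.
    """
    text = transcript.lower()
    return any(term in text for term in critical_terms)


print(needs_audio_check("I want to file a complaint about my delivery"))
print(needs_audio_check("What time do you open tomorrow?"))
```

Because the check runs on the transcript itself, a mis-transcribed word can slip past it, so it complements rather than replaces an agent's judgment.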

💡 Tip: Brief your human agents that transcripts are AI-generated, so they know to double-check when something doesn't read quite right.


Need Help?

If you have questions about voice message transcription or run into any issues, reach out to our support team via the chat bubble in the bottom right corner of your charles account. We're happy to help! 💛
