speech

Triggered when speech-to-text transcription is completed for a call recording. Transcription is processed asynchronously after the call ends.

When This Event Fires

  • A call recording has been transcribed by the speech recognition system
  • The transcription result is stored and ready for retrieval

Payload

Field         Type     Description
call_id       integer  Internal call ID
speech_state  string   Transcription state (e.g. "speech_stored")
payload       array    Transcription segments (see below)
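
As a minimal sketch of how a receiver might read these top-level fields, the function below parses a raw event body and only proceeds when the state is "speech_stored". The function name handle_speech_event and the returned summary dict are hypothetical, and how the event body reaches your code (for example, an HTTP request body) is an assumption not covered here.

```python
import json
from typing import Optional


def handle_speech_event(raw_body: str) -> Optional[dict]:
    """Parse a speech event body (hypothetical helper, not part of any SDK).

    Returns a small summary dict, or None when the state is not "speech_stored".
    """
    event = json.loads(raw_body)

    # "speech_stored" is the only state named in this reference; any other
    # value is treated here as "not ready" (an assumption).
    if event.get("speech_state") != "speech_stored":
        return None

    segments = event.get("payload") or []
    return {"call_id": event["call_id"], "segment_count": len(segments)}
```

Returning None for other states keeps the caller's logic simple: it can skip the event without treating it as an error.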

Transcription Segment Fields

Each item in the payload array represents a segment of the transcription:

Field      Type    Description
text       string  Transcribed text for this segment
start      float   Start time offset
end        float   End time offset
start_sec  float   Start time in seconds from call beginning
channel    string  Audio channel: "left" (caller) or "right" (receiver)
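
The segment fields map naturally onto a small typed record. The sketch below is one possible way to model them; the Segment class, its duration property, and parse_segments are hypothetical names, and the duration calculation assumes start and end are expressed in the same unit.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Segment:
    """One transcription segment, mirroring the documented fields."""
    text: str
    start: float
    end: float
    start_sec: float
    channel: str

    @property
    def duration(self) -> float:
        # Assumes start and end share a unit; the reference does not state it.
        return self.end - self.start


def parse_segments(items: List[dict]) -> List[Segment]:
    """Convert raw payload items into Segment records (hypothetical helper)."""
    return [
        Segment(
            text=item["text"],
            start=float(item["start"]),
            end=float(item["end"]),
            start_sec=float(item["start_sec"]),
            channel=item["channel"],
        )
        for item in items
    ]
```

A missing field raises KeyError, which surfaces malformed segments early instead of silently filling in defaults.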

Example Payload

{
  "call_id": 12345,
  "speech_state": "speech_stored",
  "payload": [
    {
      "text": "Hello, I'd like to inquire about your services.",
      "start": 0.5,
      "end": 3.2,
      "start_sec": 0.5,
      "channel": "left"
    },
    {
      "text": "Of course! How can I help you today?",
      "start": 3.5,
      "end": 5.8,
      "start_sec": 3.5,
      "channel": "right"
    },
    {
      "text": "I'm looking for a business phone solution.",
      "start": 6.0,
      "end": 8.5,
      "start_sec": 6.0,
      "channel": "left"
    }
  ]
}

Channel Mapping

The channel field indicates who is speaking: "left" is typically the caller (external party) and "right" is the receiver (your team member).
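
One way to use this mapping is to turn the segments into a speaker-labelled transcript. The sketch below is illustrative only: render_transcript and the labels "Caller" and "Agent" are hypothetical, and because the mapping is described as typical rather than guaranteed, unknown channel values fall back to the raw channel name.

```python
# Hypothetical display labels for the two documented channel values.
SPEAKER_LABELS = {"left": "Caller", "right": "Agent"}


def render_transcript(segments):
    """Return a readable transcript, ordered by start_sec (hypothetical helper)."""
    ordered = sorted(segments, key=lambda seg: seg["start_sec"])
    lines = []
    for seg in ordered:
        # Fall back to the raw channel value if it is not "left" or "right".
        speaker = SPEAKER_LABELS.get(seg["channel"], seg["channel"])
        lines.append(f"[{seg['start_sec']:.1f}s] {speaker}: {seg['text']}")
    return "\n".join(lines)
```

Sorting by start_sec interleaves the two channels in spoken order, even if the segments arrive grouped by channel.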