speech
Triggered when speech-to-text transcription is completed for a call recording. Transcription is processed asynchronously after the call ends.
When This Event Fires
- A call recording has been transcribed by the speech recognition system
- The transcription result is stored and ready for retrieval
Payload
| Field | Type | Description |
|---|---|---|
| call_id | integer | Internal call ID |
| speech_state | string | Transcription state (e.g. "speech_stored") |
| payload | array | Transcription segments (see below) |
Transcription Segment Fields
Each item in the payload array represents a segment of the transcription:
| Field | Type | Description |
|---|---|---|
| text | string | Transcribed text for this segment |
| start | float | Start time offset |
| end | float | End time offset |
| start_sec | float | Start time in seconds from call beginning |
| channel | string | Audio channel: "left" (caller) or "right" (receiver) |
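As an illustration of the segment fields above, the following minimal sketch converts raw segment dictionaries into typed objects. The `Segment` class and `parse_segments` helper are hypothetical names, not part of the API. The `duration` property assumes `start` and `end` are expressed in the same unit, which this reference does not state explicitly.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class Segment:
    """One transcription segment, mirroring the documented fields."""

    text: str
    start: float
    end: float
    start_sec: float
    channel: str

    @property
    def duration(self) -> float:
        # Assumption: start and end share the same unit.
        return self.end - self.start


def parse_segments(items: List[Dict[str, Any]]) -> List[Segment]:
    """Convert raw segment dicts into Segment objects.

    Raises KeyError if a documented field is missing.
    """
    return [
        Segment(
            text=str(item["text"]),
            start=float(item["start"]),
            end=float(item["end"]),
            start_sec=float(item["start_sec"]),
            channel=str(item["channel"]),
        )
        for item in items
    ]
```

Failing loudly on a missing field is a design choice for this sketch; a production consumer might prefer to skip or log malformed segments instead.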
Example Payload
{
  "call_id": 12345,
  "speech_state": "speech_stored",
  "payload": [
    {
      "text": "Hello, I'd like to inquire about your services.",
      "start": 0.5,
      "end": 3.2,
      "start_sec": 0.5,
      "channel": "left"
    },
    {
      "text": "Of course! How can I help you today?",
      "start": 3.5,
      "end": 5.8,
      "start_sec": 3.5,
      "channel": "right"
    },
    {
      "text": "I'm looking for a business phone solution.",
      "start": 6.0,
      "end": 8.5,
      "start_sec": 6.0,
      "channel": "left"
    }
  ]
}
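A payload like the one above could be consumed roughly as follows. This is a sketch, not a reference implementation: `transcript_text` is a hypothetical helper, and it assumes the event arrives as a JSON string. Because "speech_stored" is the only state value shown in this document, the sketch skips any other state rather than guessing at its meaning.

```python
import json
from typing import Optional


def transcript_text(raw_body: str) -> Optional[str]:
    """Return the full transcript as one string, or None if not stored yet."""
    event = json.loads(raw_body)

    # Only "speech_stored" is documented here; skip anything else.
    if event.get("speech_state") != "speech_stored":
        return None

    # Order segments by their start time in seconds before joining the text.
    segments = sorted(event.get("payload", []), key=lambda seg: seg["start_sec"])
    return " ".join(seg["text"] for seg in segments)
```

Sorting by `start_sec` guards against segments arriving out of order, which this document neither promises nor rules out.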
Channel Mapping
The channel field indicates who is speaking: "left" is typically the caller (external party) and "right" is the receiver (your team member).
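The mapping above can be applied when rendering a readable dialogue. In this sketch, the labels "Caller" and "Agent" and the `format_dialogue` helper are illustrative choices, not names defined by the API. Since the mapping is described as typical rather than guaranteed, unrecognized channel values fall back to "Unknown".

```python
from typing import Any, Dict, List

# Illustrative labels; the mapping is described as typical, not guaranteed.
SPEAKER_BY_CHANNEL = {"left": "Caller", "right": "Agent"}


def format_dialogue(segments: List[Dict[str, Any]]) -> str:
    """Render segments as one labelled line per segment."""
    lines = []
    for seg in segments:
        speaker = SPEAKER_BY_CHANNEL.get(seg["channel"], "Unknown")
        lines.append(f"[{seg['start_sec']:.1f}s] {speaker}: {seg['text']}")
    return "\n".join(lines)
```

For example, a segment with `start_sec` 0.5, channel "left", and text "Hello" would be rendered as `[0.5s] Caller: Hello`.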
Related Events
- incoming_call_end / outgoing_call_end: the call that was transcribed
- summary: AI summary generated from this transcription