Live API
Overview
The Live API provides real-time audio transcription via WebSocket connections. Connect with JWT authentication, send audio data, and receive immediate transcription results with speaker identification and timing information.
Secure & Real-time: Connect using JWT authentication for secure real-time audio streaming and transcription with automatic speaker identification.
How It Works
1. Authenticate & Connect
Connect to the WebSocket endpoint with your JWT token for secure access
2. Receive Session ID
Get a unique session identifier for your transcription session
3. Stream Audio
Send audio data in real-time (WAV format recommended)
4. Get Transcriptions
Receive immediate transcription results with speaker identification and timing
Key Features
Real-time Processing
Immediate transcription results as audio is streamed
Speaker Identification
Automatic speaker diarization with clear labeling
JWT Authentication
Secure connections with token-based authentication
High Accuracy
Medical-grade transcription accuracy for professional use
Connection Details
WebSocket Endpoint: ws://localhost:8000/ws/transcribe
Authentication: JWT token required
Protocol: WebSocket with binary audio streaming
Response Format: JSON messages with transcription results
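Since results arrive as JSON messages, a client will typically decode each one into a small record. The sketch below shows one way to do that. The field names (text, speaker, start, end) are hypothetical placeholders chosen to match the "speaker identification and timing" description; the actual message schema is not specified here and may differ.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Transcript:
    text: str
    speaker: Optional[str]   # hypothetical field: speaker label
    start: Optional[float]   # hypothetical field: start time in seconds
    end: Optional[float]     # hypothetical field: end time in seconds


def parse_result(raw: str) -> Transcript:
    """Decode one JSON text message into a Transcript.

    Every key used here is an assumption about the schema;
    keys that are absent simply become None (or "" for text).
    """
    msg = json.loads(raw)
    return Transcript(
        text=msg.get("text", ""),
        speaker=msg.get("speaker"),
        start=msg.get("start"),
        end=msg.get("end"),
    )
```

Using .get for every field keeps the parser tolerant of messages that omit speaker or timing data.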
Message Flow
Authentication
JWT Token Requirements
The Live API requires a valid JWT token for authentication. The token can be passed in either of two ways:
- Headers
- Query Parameter
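A minimal sketch of both options, using only the Python standard library. The header name (Authorization with a Bearer prefix) and the query parameter name (token) are assumptions, not confirmed by this page; check them against your server before relying on them.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

ENDPOINT = "ws://localhost:8000/ws/transcribe"


def auth_headers(token: str) -> dict:
    # Assumption: the server reads a standard Bearer token
    # from the "Authorization" header.
    return {"Authorization": f"Bearer {token}"}


def auth_url(token: str, endpoint: str = ENDPOINT, param: str = "token") -> str:
    # Assumption: the query parameter is named "token".
    parts = urlsplit(endpoint)
    # Preserve any query parameters already present on the endpoint.
    query = parse_qsl(parts.query) + [(param, token)]
    return urlunsplit(parts._replace(query=urlencode(query)))
```

The query-parameter form is convenient for clients that cannot set custom handshake headers (browsers, for example), but the token then appears in URLs and may end up in server logs.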
Token Requirements
Token Format
- Type: JWT (JSON Web Token)
- Algorithm: HS256 or RS256
- Expiration: Must be valid and not expired
- Claims: Must include user/session information
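Some of these requirements can be checked on the client before connecting, which avoids a round trip that would end in a rejection. The sketch below decodes the token's header and payload and checks the format, algorithm, and expiration. It does not verify the signature (only the server can do that), and the function name preflight is illustrative.

```python
import base64
import json
import time
from typing import List, Optional


def _decode_segment(segment: str) -> dict:
    # JWT segments are base64url without padding; restore it before decoding.
    padded = segment + "=" * (-len(segment) % 4)
    value = json.loads(base64.urlsafe_b64decode(padded))
    if not isinstance(value, dict):
        raise ValueError("segment is not a JSON object")
    return value


def preflight(token: str, now: Optional[float] = None) -> List[str]:
    """Return a list of problems; an empty list means the token looks usable.

    Signature validity is NOT checked here.
    """
    parts = token.split(".")
    if len(parts) != 3:
        return ["not a three-part JWT"]
    try:
        header = _decode_segment(parts[0])
        claims = _decode_segment(parts[1])
    except ValueError:
        return ["header or payload is not base64url-encoded JSON"]

    problems = []
    if header.get("alg") not in ("HS256", "RS256"):
        problems.append(f"unsupported alg: {header.get('alg')}")
    exp = claims.get("exp")
    now = time.time() if now is None else now
    # Only flag an exp claim that is present and in the past;
    # a missing exp is left for the server to judge.
    if isinstance(exp, (int, float)) and exp <= now:
        problems.append("token expired")
    return problems
```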
Security
- TLS: Use secure connections (wss:// rather than ws://) in production
- Token Validation: Server validates token on connection
- Session Management: Each connection gets unique session ID
- Access Control: Token determines user permissions
Audio Specifications
Format Requirements
- Encoding: WAV (recommended) or raw PCM
- Sample Rate: 16000 Hz (recommended)
- Channels: Mono (1 channel)
- Bit Depth: 16-bit
- Streaming: Real-time binary data
Quality Guidelines
- Clear Audio: Minimal background noise
- Consistent Volume: Stable audio levels
- Chunk Size: 1-5 seconds recommended
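The recommended format fixes the data rate: 16000 samples/s × 2 bytes × 1 channel = 32000 bytes per second, so a 1-5 second chunk is 32000-160000 bytes of PCM. The sketch below, using only the standard library, computes chunk sizes, slices a PCM buffer accordingly, and wraps PCM in a WAV container. Whether the server expects a WAV header on every chunk, once at the start, or not at all is not stated here, so treat that choice as an open assumption.

```python
import io
import wave

SAMPLE_RATE = 16000   # Hz
CHANNELS = 1          # mono
SAMPLE_WIDTH = 2      # bytes per sample (16-bit)


def chunk_size_bytes(seconds: float) -> int:
    """Number of PCM bytes covering `seconds` of audio in the recommended format."""
    return int(SAMPLE_RATE * seconds) * CHANNELS * SAMPLE_WIDTH


def pcm_chunks(pcm: bytes, seconds: float = 1.0):
    """Yield consecutive slices of `pcm`, each at most `seconds` long."""
    step = chunk_size_bytes(seconds)
    for offset in range(0, len(pcm), step):
        yield pcm[offset:offset + step]


def wav_bytes(pcm: bytes) -> bytes:
    """Wrap raw PCM in a WAV container with the recommended parameters."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(CHANNELS)
        w.setsampwidth(SAMPLE_WIDTH)
        w.setframerate(SAMPLE_RATE)
        w.writeframes(pcm)
    return buf.getvalue()
```

Truncating to whole samples before multiplying keeps every chunk boundary on a sample boundary, so no 16-bit sample is split across two messages.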
Quick Start Example
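The following is a minimal client sketch of the four steps in How It Works, not an official sample. It assumes the third-party websockets package, a token passed as a query parameter named token, and a server that sends the session message first. How the end of a stream is signalled is not documented here, so the sketch stops once all audio has been sent and no message has arrived for idle_timeout seconds. The names run_session and main are illustrative.

```python
import asyncio
import json

ENDPOINT = "ws://localhost:8000/ws/transcribe"
CHUNK_BYTES = 32000  # 1 second of 16 kHz, 16-bit mono audio


async def run_session(ws, audio: bytes, idle_timeout: float = 2.0) -> list:
    """Read the session message, stream audio, and collect JSON results.

    `ws` is any object with async `send` and `recv` methods, such as a
    connection returned by the `websockets` package.
    """
    messages = [json.loads(await ws.recv())]        # step 2: session message

    async def send_all():
        for i in range(0, len(audio), CHUNK_BYTES):
            await ws.send(audio[i:i + CHUNK_BYTES])  # step 3: binary frames

    sender = asyncio.create_task(send_all())
    while True:                                      # step 4: results
        try:
            raw = await asyncio.wait_for(ws.recv(), idle_timeout)
        except asyncio.TimeoutError:
            if sender.done():
                break      # all audio sent and the server has gone quiet
            continue
        messages.append(json.loads(raw))
    await sender           # surfaces any error raised while sending
    return messages


async def main(token: str, wav_path: str) -> None:
    import websockets  # third-party: pip install websockets

    with open(wav_path, "rb") as f:
        audio = f.read()
    # Step 1. Assumption: the token is accepted as a "token" query parameter.
    async with websockets.connect(f"{ENDPOINT}?token={token}") as ws:
        for message in await run_session(ws, audio):
            print(message)


# Example: asyncio.run(main("<your JWT>", "recording.wav"))
```

Sending and receiving run concurrently so that results can arrive while later chunks are still being uploaded.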
Error Handling
Authentication Errors
Invalid Token
Problem: JWT token is invalid or malformed
Response: Connection rejected with 401 Unauthorized
Solution: Verify token format and signature
Expired Token
Problem: JWT token has expired
Response: Connection rejected with 401 Unauthorized
Solution: Refresh token and reconnect
Missing Token
Problem: No authentication token provided
Response: Connection rejected with 401 Unauthorized
Solution: Include JWT token in connection request
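Because all three cases produce the same 401 response, a client can handle them with a single "refresh and reconnect once" policy. The sketch below shows that policy independently of any particular WebSocket library. AuthRejected, connect, and get_token are hypothetical names: connect is assumed to raise AuthRejected when the handshake is rejected with 401, and get_token is assumed to return a fresh JWT from your identity provider.

```python
from typing import Awaitable, Callable


class AuthRejected(Exception):
    """Hypothetical: raised by `connect` when the handshake returns 401."""


async def connect_with_refresh(
    connect: Callable[[str], Awaitable[object]],
    get_token: Callable[[], Awaitable[str]],
    token: str = "",
):
    """Connect with `token`; on a 401, fetch a fresh token and retry once."""
    if not token:
        # Missing Token: never attempt a connection without one.
        token = await get_token()
    try:
        return await connect(token)
    except AuthRejected:
        # Expired or Invalid Token: refresh, then reconnect.
        fresh = await get_token()
        # A second rejection propagates to the caller instead of looping.
        return await connect(fresh)
```

Limiting the retry to one attempt avoids hammering the server when the token source itself is misconfigured.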
Coming Soon
Future Features: Advanced audio processing, real-time language detection, medical terminology highlighting, and multi-language support will be added in future versions.