
Live API

Overview

The Live API provides real-time audio transcription over a WebSocket connection. Connect with JWT authentication, stream audio data, and receive transcription results with speaker identification and timing information as the audio is processed.

[Figure: MediSync Live API Pipeline]
Secure & Real-time: Connect using JWT authentication for secure real-time audio streaming and transcription with automatic speaker identification.

How It Works

1. Authenticate & Connect

Connect to the WebSocket endpoint with your JWT token for secure access.

2. Receive Session ID

Get a unique session identifier for your transcription session.

3. Stream Audio

Send audio data in real time (WAV format recommended).

4. Get Transcriptions

Receive transcription results with speaker identification and timing as the audio is processed.

Key Features

Real-time Processing

Immediate transcription results as audio is streamed

Speaker Identification

Automatic speaker diarization with clear labeling

JWT Authentication

Secure connections with token-based authentication

High Accuracy

Medical-grade transcription accuracy for professional use

Connection Details

  • WebSocket Endpoint: ws://localhost:8000/ws/transcribe
  • Authentication: JWT token required
  • Protocol: WebSocket with binary audio streaming
  • Response Format: JSON messages with transcription results

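Message Flow

  1. The client opens the WebSocket connection with a JWT token.
  2. The server sends a session_init message containing the session_id.
  3. The client streams binary audio data.
  4. The server sends transcription messages, and warning messages when something needs attention.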

Authentication

JWT Token Requirements

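The Live API requires a valid JWT token for authentication. The three-argument constructor shown here is the one provided by the Node.js ws package; the browser WebSocket constructor does not accept custom headers.
// Node.js: WebSocket client from the `ws` package
import WebSocket from 'ws';

const ws = new WebSocket('ws://localhost:8000/ws/transcribe', [], {
  headers: {
    'Authorization': `Bearer ${jwtToken}`
  }
});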

Token Requirements

Token Format

  • Type: JWT (JSON Web Token)
  • Algorithm: HS256 or RS256
  • Expiration: Must be valid and not expired
  • Claims: Must include user/session information
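
Because an expired token is rejected at connection time, it can help to check the expiry locally before connecting. The following is a minimal sketch, assuming the token carries the standard `exp` claim (seconds since the Unix epoch) and that the code runs in Node.js. The helper names `decodeJwtPayload` and `isTokenExpired` and the 30-second clock-skew margin are illustrative, not part of the Live API. The payload is decoded without verifying the signature; verification remains the server's job.

```javascript
// Decode the payload (second segment) of a JWT. This does NOT verify the signature.
function decodeJwtPayload(token) {
  const parts = token.split('.');
  if (parts.length !== 3) {
    throw new Error('Malformed JWT: expected three dot-separated segments');
  }
  // JWT segments are base64url-encoded JSON.
  const json = Buffer.from(parts[1], 'base64url').toString('utf8');
  return JSON.parse(json);
}

// Return true when the token's `exp` claim is in the past (or within the skew margin).
// nowMs and skewSeconds are parameters so the check is easy to test.
function isTokenExpired(token, nowMs = Date.now(), skewSeconds = 30) {
  const { exp } = decodeJwtPayload(token);
  if (typeof exp !== 'number') {
    return false; // No exp claim: leave the decision to the server.
  }
  return nowMs / 1000 >= exp - skewSeconds;
}
```

A client could call `isTokenExpired(jwtToken)` before opening the socket and refresh the token first when it returns true.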

Security

  • TLS: Use secure connections (wss://) in production
  • Token Validation: Server validates token on connection
  • Session Management: Each connection gets unique session ID
  • Access Control: Token determines user permissions

Audio Specifications

Format Requirements

  • Encoding: WAV (recommended) or raw PCM
  • Sample Rate: 16000 Hz (recommended)
  • Channels: Mono (1 channel)
  • Bit Depth: 16-bit
  • Streaming: Real-time binary data
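
If the audio source produces raw PCM samples, a WAV container matching the recommended format can be built by prepending the standard 44-byte RIFF/WAVE header. This is a sketch under the assumptions that the code runs in Node.js and that the audio is uncompressed PCM; `buildWavHeader` is a hypothetical helper, and whether the server expects a header once per stream or per chunk is not stated in this documentation.

```javascript
// Build a 44-byte WAV header for uncompressed PCM audio.
// Defaults match the recommended format: 16000 Hz, mono, 16-bit.
function buildWavHeader(dataLength, sampleRate = 16000, channels = 1, bitDepth = 16) {
  const blockAlign = channels * (bitDepth / 8); // bytes per sample frame
  const byteRate = sampleRate * blockAlign;     // bytes per second
  const header = Buffer.alloc(44);

  header.write('RIFF', 0, 'ascii');
  header.writeUInt32LE(36 + dataLength, 4);     // file size minus the first 8 bytes
  header.write('WAVE', 8, 'ascii');
  header.write('fmt ', 12, 'ascii');
  header.writeUInt32LE(16, 16);                 // fmt chunk size for PCM
  header.writeUInt16LE(1, 20);                  // audio format 1 = PCM
  header.writeUInt16LE(channels, 22);
  header.writeUInt32LE(sampleRate, 24);
  header.writeUInt32LE(byteRate, 28);
  header.writeUInt16LE(blockAlign, 32);
  header.writeUInt16LE(bitDepth, 34);
  header.write('data', 36, 'ascii');
  header.writeUInt32LE(dataLength, 40);         // size of the PCM payload in bytes
  return header;
}
```

For example, `Buffer.concat([buildWavHeader(pcm.length), pcm])` wraps a PCM buffer named `pcm` into a complete WAV payload.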

Quality Guidelines

  • Clear Audio: Minimal background noise
  • Consistent Volume: Stable audio levels
  • Chunk Size: 1-5 seconds recommended
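
The chunk-size guideline translates directly into bytes: at 16000 Hz, mono, 16-bit, one second of audio is 16000 × 1 × 2 = 32000 bytes, so 1-5 seconds corresponds to 32000-160000 bytes per chunk. The sketch below shows that arithmetic and one way to split a PCM buffer accordingly; `bytesPerChunk` and `splitIntoChunks` are illustrative names, and the code assumes a Node.js Buffer of raw PCM in the recommended format.

```javascript
// Number of bytes in `seconds` of uncompressed PCM audio.
function bytesPerChunk(seconds, sampleRate = 16000, channels = 1, bitDepth = 16) {
  return Math.round(seconds * sampleRate) * channels * (bitDepth / 8);
}

// Split a PCM buffer into chunks of `seconds` each; the final chunk may be shorter.
function splitIntoChunks(pcm, seconds = 1) {
  const size = bytesPerChunk(seconds);
  const chunks = [];
  for (let offset = 0; offset < pcm.length; offset += size) {
    // subarray creates a view over the same memory, so no audio data is copied.
    chunks.push(pcm.subarray(offset, Math.min(offset + size, pcm.length)));
  }
  return chunks;
}
```

Each resulting chunk can be passed to a send function such as `sendAudio` from the Quick Start Example.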

Quick Start Example

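// Node.js: WebSocket client from the `ws` package
// (the browser WebSocket constructor cannot set custom headers)
import WebSocket from 'ws';

// Connect with JWT authentication
const jwtToken = 'your-jwt-token-here';
const ws = new WebSocket('ws://localhost:8000/ws/transcribe', [], {
  headers: {
    'Authorization': `Bearer ${jwtToken}`
  }
});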

// Handle connection
ws.onopen = () => {
  console.log('Connected to Live API');
};

// Handle messages
// (startAudioStreaming and displayTranscription are application-defined helpers)
ws.onmessage = (event) => {
  const message = JSON.parse(event.data);
  
  switch (message.type) {
    case 'session_init':
      console.log('Session ID:', message.session_id);
      startAudioStreaming();
      break;
      
    case 'transcription':
      console.log('Transcription:', message.text);
      displayTranscription(message);
      break;
      
    case 'warning':
      console.warn('Warning:', message.message);
      break;
  }
};

// Send audio data
function sendAudio(audioBuffer) {
  if (ws.readyState === WebSocket.OPEN) {
    ws.send(audioBuffer);
  }
}

Error Handling

Authentication Errors

Problem: JWT token is invalid or malformed
Response: Connection rejected with 401 Unauthorized
Solution: Verify token format and signature

Problem: JWT token has expired
Response: Connection rejected with 401 Unauthorized
Solution: Refresh token and reconnect

Problem: No authentication token provided
Response: Connection rejected with 401 Unauthorized
Solution: Include JWT token in connection request
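
One way to act on a rejected connection is to separate the decision from the socket code. The sketch below is an assumption-laden illustration, not part of the Live API: `planReconnect`, its action names, the attempt limit, and the backoff values are all hypothetical. With the Node.js ws package, the HTTP status of a rejected handshake is typically available from the `unexpected-response` event, whose response object exposes `statusCode`.

```javascript
// Decide what to do after a failed connection attempt.
// statusCode: HTTP status of the rejected handshake (e.g. 401).
// attempt: number of attempts already made (0 for the first failure).
function planReconnect(statusCode, attempt, maxAttempts = 3) {
  if (attempt >= maxAttempts) {
    return { action: 'give_up', delayMs: 0 };
  }
  if (statusCode === 401) {
    // Invalid, expired, or missing token: obtain a fresh token before retrying.
    return { action: 'refresh_token_and_reconnect', delayMs: 0 };
  }
  // Assumption: other failures are retried with exponential backoff, capped at 30 s.
  return { action: 'reconnect', delayMs: Math.min(1000 * 2 ** attempt, 30000) };
}
```

Keeping this as a pure function makes the retry policy easy to unit test and to adjust without touching the connection code.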

Coming Soon

Future Features: Advanced audio processing, real-time language detection, medical terminology highlighting, and multi-language support are planned for future versions.
