Speech Recognition Options

Model Configuration Options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| device | string | 'webgpu' | Device to run model on |
| onProgress | function | - | Callback for loading progress updates |
| quantization | string | 'q4' | Model quantization level |
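
These options are passed as the second argument to loadModel, as the main example later on this page shows. The sketch below combines all three; it assumes the package is imported as @browserai/browserai, and 'q4' is simply the documented default quantization level (other accepted levels are not listed here).

import { BrowserAI } from '@browserai/browserai';

const browserAI = new BrowserAI();

// Pass configuration as the second argument to loadModel;
// 'q4' is the documented default quantization level
await browserAI.loadModel('whisper-tiny-en', {
  device: 'webgpu',
  quantization: 'q4',
  onProgress: (progress) => {
    console.log(`Loading model: ${progress.progress}%`);
  }
});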

Available Models

| Model | Description | Size | Languages |
| --- | --- | --- | --- |
| whisper-tiny-en | Lightweight English-only model | ~150MB | English |
| whisper-base-all | Base multilingual model | ~290MB | 99 languages |
| whisper-small-all | Enhanced multilingual model | ~490MB | 99 languages |
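
To work with audio in languages other than English, load one of the multilingual models listed above instead of whisper-tiny-en. A minimal sketch, again assuming the @browserai/browserai import:

import { BrowserAI } from '@browserai/browserai';

const browserAI = new BrowserAI();

// whisper-base-all and whisper-small-all cover 99 languages;
// whisper-small-all is the larger, enhanced variant (~490MB vs ~290MB)
await browserAI.loadModel('whisper-base-all', {
  device: 'webgpu',
  onProgress: (progress) => console.log(`Loading model: ${progress.progress}%`)
});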

Transcription Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| language | string | - | Target language code (optional) |
| task | string | 'transcribe' | Task type ('transcribe' or 'translate') |
| return_timestamps | boolean | false | Return word-level timestamps |
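
These parameters are passed as the second argument when transcribing. The sketch below assumes a multilingual model is already loaded (as in the previous sketch) and that audioBlob holds recorded audio (see "Recording Audio" below); the 'fr' language code is purely illustrative.

// Transcribe French audio and keep timestamps
const result = await browserAI.transcribeAudio(audioBlob, {
  language: 'fr',        // illustrative language code, per the table above
  task: 'transcribe',    // use 'translate' to get English output instead
  return_timestamps: true
});

console.log(result.text);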

Example

import { BrowserAI } from '@browserai/browserai';

const browserAI = new BrowserAI();
 
// Load speech recognition model
await browserAI.loadModel('whisper-tiny-en', {
  device: 'webgpu',
  onProgress: (progress) => {
    console.log('Loading model:', progress.progress + '%');
  }
});
 
// Transcribe audio from a blob or array
const result = await browserAI.transcribeAudio(audioInput, {
  language: 'en',
  return_timestamps: true,
  chunk_length_s: 30
});
 
console.log('Transcription:', result.text);
// "Hello, how are you today?"

Recording Audio

BrowserAI provides built-in methods for recording audio:

// Start recording
await browserAI.startRecording();
 
// Stop and get audio blob
const audioBlob = await browserAI.stopRecording();
 
// Transcribe the recorded audio
const transcription = await browserAI.transcribeAudio(audioBlob);
console.log(transcription.text);
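
In a page, these calls are typically driven by UI events. The sketch below wires the documented startRecording, stopRecording, and transcribeAudio methods to two buttons; the button IDs and the @browserai/browserai import path are assumptions for illustration.

import { BrowserAI } from '@browserai/browserai';

const browserAI = new BrowserAI();
await browserAI.loadModel('whisper-tiny-en');

// Hypothetical button IDs ('record' / 'stop'); the BrowserAI calls are the
// ones documented above
document.getElementById('record')?.addEventListener('click', () => {
  browserAI.startRecording();
});

document.getElementById('stop')?.addEventListener('click', async () => {
  const audioBlob = await browserAI.stopRecording();
  const transcription = await browserAI.transcribeAudio(audioBlob);
  console.log(transcription.text);
});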