
voice-assistant-widget

anshkumar · 1.5k · MIT · 3.1.2

Embeddable voice assistant widget for web applications

voice-assistant, widget, websocket, speech-recognition, text-to-speech, noise-suppression


Fonada Voice Assistant

A complete voice assistant pipeline integrating:

  • Custom ASR (Automatic Speech Recognition)
  • Custom Turn detection with ReplyOnPause handler
  • LLM for conversational responses
  • Custom Fonada TTS for high-quality voice synthesis

Prerequisites

  • Python 3.8+
  • 4 CUDA-capable GPUs
  • 50 GB+ disk space
  • Microphone and speakers

Setup

  1. Install the required dependencies (install NeMo from GitHub):

    pip install -r requirements.txt

  2. Run the LLM server (a quick client-side sanity check is sketched after this setup list):

lmdeploy serve api_server hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4 --server-port 23333 --quant-policy 4

or

export CUDA_VISIBLE_DEVICES=2
lmdeploy serve api_server sarvamai/sarvam-m \
  --server-port 8000 \
  --tp 1 \
  --backend turbomind \
  --quant-policy 4 \
  --cache-max-entry-count 0.9


  3. Run the TTS server from the `models/` folder:

export CUDA_VISIBLE_DEVICES=1
lmdeploy serve api_server tts_hindi --server-port 23334 --quant-policy 4
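
Before launching the assistant, you can sanity-check an LLM server with a few lines of Python. This is a minimal sketch, assuming the server from step 2 is listening on localhost:23333 and exposes lmdeploy's OpenAI-compatible API; adjust the base URL and served model name to match your own `api_server` invocation:

```python
# Minimal sanity check against an lmdeploy OpenAI-compatible endpoint.
# The base URL below assumes the server from step 2 on localhost:23333.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:23333/v1", api_key="not-needed")

# Ask the server which model it is serving rather than hard-coding the name.
model_name = client.models.list().data[0].id

response = client.chat.completions.create(
    model=model_name,
    messages=[{"role": "user", "content": "Say hello in one short sentence."}],
    max_tokens=32,
)
print(response.choices[0].message.content)
```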

Running the Voice Assistant

Run the assistant with:

```bash
export LD_LIBRARY_PATH=/workspace/TensorRT-10.10.0.31/lib:$LD_LIBRARY_PATH
export OPENAI_API_ASR_KEY=
export SARVAM_API_KEY=
export DEEPGRAM_API_KEY=
export OPENAI_API_LLM_KEY=
export GROQ_API_LLM_KEY=
python app.py
```

Adjust the TensorRT library path and the API keys above to match your environment. This will start a web server and open a browser interface where you can interact with the voice assistant.

Usage

  1. Click the microphone button to start speaking
  2. The assistant will automatically detect when you've finished speaking
  3. It will transcribe your speech, generate a response with Llama 3.2, and speak the response using Fonada TTS
  4. You can interrupt the assistant by speaking while it's responding

Customization

Voice Selection

To change the voice used by Fonada TTS, modify the options dictionary in the text_to_speech_sync method:

options = {"voice_id": "Ananya"}  # Change to your preferred voice

Available voices: "Rahul", "Vikram", "Arjun", "Dev", "Sanjay", "Jaya", "Meera", "Priya", "Ananya", "Divya"
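
The exact signature of text_to_speech_sync is not shown here, so treat the call shape below as a hypothetical sketch and check the method in your copy of the code before relying on it:

```python
# Hypothetical call shape for Fonada TTS voice selection; verify the argument
# order against the actual text_to_speech_sync signature in your installation.
options = {"voice_id": "Meera"}  # any voice from the list above
audio = assistant.text_to_speech_sync("Namaste! How can I help you today?", options)
```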

System Prompt

To change how the LLM responds, customize the system prompt when initializing the VoiceAssistant:

assistant = VoiceAssistant(
    llm_model_path=llm_model_path,
    tts_model_path=tts_model_path,
    system_prompt="You are a helpful voice assistant. Keep your responses short and friendly."
)

Turn Detection Sensitivity

Adjust the turn detection parameters in the create_voice_assistant_stream() function to change how the assistant detects when you've finished speaking:

algo_options=AlgoOptions(
    audio_chunk_duration=0.5,  # Duration of audio chunks
    started_talking_threshold=0.2,  # Threshold to detect start of speech
    speech_threshold=0.1  # General speech detection threshold
)
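
For context, here is a minimal sketch of how these options are typically wired into the stream. It assumes a fastrtc-style `Stream`/`ReplyOnPause` setup (which the ReplyOnPause handler and `stream.mount(app)` usage elsewhere in this README suggest); the `assistant_reply` handler is a placeholder for the real ASR → LLM → TTS pipeline:

```python
# Sketch only: passing AlgoOptions to a ReplyOnPause handler on a Stream,
# assuming a fastrtc-style API. assistant_reply is a placeholder handler.
from fastrtc import AlgoOptions, ReplyOnPause, Stream

def assistant_reply(audio):
    # Placeholder: run ASR -> LLM -> TTS here and yield synthesized audio.
    yield audio

stream = Stream(
    handler=ReplyOnPause(
        assistant_reply,
        algo_options=AlgoOptions(
            audio_chunk_duration=0.5,       # duration of each analysis chunk
            started_talking_threshold=0.2,  # threshold to detect the start of speech
            speech_threshold=0.1,           # general speech detection threshold
        ),
    ),
    modality="audio",
    mode="send-receive",
)
```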

Integration with FastAPI

To integrate the voice assistant with a FastAPI app:

from fastapi import FastAPI
from voice_assistant.app import create_voice_assistant_stream

app = FastAPI()
stream = create_voice_assistant_stream()
stream.mount(app)
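
The mounted app is served like any other FastAPI application. A minimal way to run it, assuming the code above lives in a module you launch directly (host and port are placeholders):

```python
# Serve the combined FastAPI app; equivalent to running uvicorn from the CLI.
import uvicorn

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8080)  # placeholder host/port
```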

Troubleshooting

Issue: Models fail to load
Solution: Verify the correct paths to your model files and ensure they're accessible.

Issue: Speech recognition is inaccurate
Solution: Try speaking clearly and ensure your microphone is properly configured.

Issue: High latency in responses
Solution: Consider using a more powerful GPU or reducing the model parameters.

License

This project uses the same license as the Fonada TTS system.

Voice Assistant Monitoring

This document describes how to set up monitoring for the Voice Assistant application. There are two options available:

Option 1: Streamlit Dashboard (Lightweight)

A lightweight, real-time monitoring dashboard built with Streamlit.

Installation

  1. Install required packages:

    pip install streamlit pandas plotly
  2. Run the monitoring dashboard:

    streamlit run monitor.py

The dashboard will be available at http://localhost:8501 and includes:

  • Real-time log viewing
  • Request timeline visualization
  • Log level distribution
  • Filtering by request ID and log level
  • Auto-refresh functionality
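
For orientation, here is a stripped-down sketch of what a log dashboard along these lines can look like. The log file name and line format below are assumptions, not the project's actual ones; adapt the regex to however app.py formats its log lines:

```python
# Minimal Streamlit log dashboard sketch (streamlit + pandas + plotly).
# LOG_FILE and the line format are assumptions; adjust to your logging setup.
import re
import pandas as pd
import plotly.express as px
import streamlit as st

LOG_FILE = "voice_assistant.log"  # assumed log path
LINE_RE = re.compile(r"^(?P<time>\S+ \S+) (?P<level>\w+) (?P<message>.*)$")

st.title("Voice Assistant Logs")

rows = []
with open(LOG_FILE) as f:
    for line in f:
        match = LINE_RE.match(line.strip())
        if match:
            rows.append(match.groupdict())

df = pd.DataFrame(rows)
if df.empty:
    st.warning("No parseable log lines found.")
else:
    level = st.sidebar.selectbox("Log level", ["ALL"] + sorted(df["level"].unique()))
    view = df if level == "ALL" else df[df["level"] == level]
    st.plotly_chart(px.histogram(view, x="level", title="Log level distribution"))
    st.dataframe(view.tail(200))
```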

Option 2: Graylog (Enterprise-grade)

A more comprehensive logging and monitoring solution.

Installation

  1. Install Graylog prerequisites (MongoDB and Elasticsearch):

    sudo apt-get install mongodb-org elasticsearch
  2. Download and install Graylog:

    wget https://packages.graylog2.org/repo/packages/graylog-4.0-repository_latest.deb
    sudo dpkg -i graylog-4.0-repository_latest.deb
    sudo apt-get update
    sudo apt-get install graylog-server

Features

Streamlit Dashboard

  • Real-time log viewing
  • Interactive visualizations
  • Request timeline
  • Log level distribution
  • Filter by request ID and log level
  • Auto-refresh capability
  • Lightweight and easy to set up

Graylog

  • Enterprise-grade log management
  • Advanced search capabilities
  • Custom dashboards
  • Alerts and notifications
  • Log retention policies
  • Role-based access control

Usage

  1. Start your voice assistant application:

    python app.py
  2. Choose your preferred monitoring solution:

For Streamlit dashboard:

streamlit run monitor.py

For Graylog:

  • Access the Graylog web interface at http://your-server:9000
  • Default credentials: admin/admin (change on first login)
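
Getting the application's logs into Graylog usually means configuring a GELF input in Graylog and attaching a GELF handler in the app. A minimal sketch using the graypy package (not part of this project; install it separately and point it at whatever GELF UDP input you configure):

```python
# Ship application logs to Graylog over GELF UDP; requires `pip install graypy`
# and a GELF UDP input in Graylog (12201 is the conventional port).
import logging
import graypy

logger = logging.getLogger("voice_assistant")
logger.setLevel(logging.INFO)
logger.addHandler(graypy.GELFUDPHandler("your-graylog-host", 12201))

# Extra fields become searchable GELF fields in Graylog.
logger.info("voice assistant started", extra={"request_id": "demo-123"})
```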

Monitoring Metrics

The monitoring solutions track:

  • Total number of requests
  • Active requests (last 5 minutes)
  • Error rates
  • Log levels distribution
  • Request timelines
  • Detailed log messages
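
Most of these metrics assume every log line carries a request ID and a level the dashboards can group by. One standard-library way to attach a per-request ID is sketched below; the file name, format string, and logger name are illustrative, not the project's actual configuration:

```python
# Tag each request's log lines with an ID so they can be filtered and
# aggregated later. Names and format here are illustrative only.
import logging
import uuid

logging.basicConfig(
    filename="voice_assistant.log",
    level=logging.INFO,
    format="%(asctime)s %(levelname)s [request=%(request_id)s] %(message)s",
)

def get_request_logger():
    """Return a logger that stamps a fresh request ID on every record."""
    request_id = uuid.uuid4().hex[:8]
    return logging.LoggerAdapter(logging.getLogger("voice_assistant"),
                                 extra={"request_id": request_id})

log = get_request_logger()
log.info("ASR finished")
log.error("TTS request timed out")
```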

Troubleshooting

If you encounter issues:

  1. Streamlit Dashboard:
     • Ensure the log file exists and is readable
     • Check if required packages are installed
     • Verify the correct Python version

  2. Graylog:
     • Verify MongoDB and Elasticsearch are running
     • Check Graylog service status
     • Review system logs for errors