From 1fccc97fb99c2374a92346080cc8ce3152d44f57 Mon Sep 17 00:00:00 2001
From: Vaibhav Srivastav <vaibhavs10@gmail.com>
Date: Wed, 14 Aug 2024 11:54:42 +0200
Subject: [PATCH] Minor doc fix.

---
 README.md | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 75fe28e..5dff234 100644
--- a/README.md
+++ b/README.md
@@ -19,8 +19,8 @@
 ### Structure
 This repository implements a speech-to-speech cascaded pipeline with consecutive parts:
 1. **Voice Activity Detection (VAD)**: [silero VAD v5](https://github.com/snakers4/silero-vad)
-2. **Speech to Text (STT)**: Whisper models (including distilled versions)
-3. **Language Model (LM)**: Any instruct model available on the [Hugging Face Hub](https://huggingface.co/models)! 🤗
+2. **Speech to Text (STT)**: Whisper checkpoints (including [distilled versions](https://huggingface.co/distil-whisper))
+3. **Language Model (LM)**: Any instruct model available on the [Hugging Face Hub](https://huggingface.co/models?pipeline_tag=text-generation&sort=trending)! 🤗
 4. **Text to Speech (TTS)**: [Parler-TTS](https://github.com/huggingface/parler-tts)
 
 ### Modularity
@@ -70,6 +70,8 @@ python s2s_pipeline.py --recv_host localhost --send_host localhost
 python listen_and_play.py --host localhost
 ```
 
+You can pass `--device mps` to run it locally on a Mac.
+
 ## Command-line Usage
 
 ### Model Parameters
-- 
GitLab