Projects with this topic
🔧 🔗 https://github.com/IAHispano/Applio A simple, high-quality voice conversion tool focused on ease of use and performance
🔧 🔗 https://github.com/modelscope/modelscope ModelScope: bring the notion of Model-as-a-Service to life.
pyannote audio
🔧 🔗 https://github.com/pyannote/pyannote-audio Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker em
🔧 🔗 https://github.com/alphacep/vosk-tts Text To Speech Synthesis with Vosk
🔧 🔗 https://github.com/m-bain/whisperX WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
🔧 🔗 https://github.com/bytedance/SALMONN SALMONN: Speech Audio Language Music Open Neural Network
CTC Segmentation
🔧 🔗 https://github.com/espnet/ctc-segmentation Segment an audio file and obtain utterance alignments. (Python package)
Festvox
🔧 🔗 https://github.com/festvox/festvox Festvox voice building tools
flite
🔧 🔗 https://github.com/festvox/flite A small fast portable speech synthesis system
Festival
🔧 🔗 https://github.com/festvox/festival Festival Speech Synthesis System
Edinburgh Speech Tools
🔧 🔗 https://github.com/festvox/speech_tools CMU Edinburgh Speech Tools
UCLA Phonetic Corpus
🔧 🔗 https://github.com/xinjli/ucla-phonetic-corpus Dataset of ICASSP 2021 MULTILINGUAL PHONETIC DATASET FOR LOW RESOURCE SPEECH RECOGNITION
alqalign
🔧 🔗 https://github.com/xinjli/alqalign multilingual speech aligner
Allosaurus
🔧 🔗 https://github.com/xinjli/allosaurus Allosaurus is a pretrained universal phone recognizer for more than 2000 languages
🔧 🔗 https://github.com/speechbrain/speechbrain.github.io The SpeechBrain project aims to build a novel speech toolkit fully based on PyTorch. With SpeechBrain users can easily create speech processing systems, ranging from speech recognition (both HMM/DNN and end-to-end), speaker recognition, speech enhancement, speech separation, multi-microphone speech processing, and many others.
🔧 🔗 https://github.com/huggingface/speech-to-speech Speech To Speech: an effort for an open-sourced and modular GPT4-o
🔧 🔗 https://github.com/elizaOS/LJSpeechTools Tools for making LJSpeech datasets
https://github.com/homebrewltd/AudioBench AudioBench: A Universal Benchmark for Audio Large Language Models
🔗 https://arxiv.org/abs/2406.16020
🔧 🔗 https://github.com/suno-ai/bark Bark is a transformer-based text-to-audio model created by Suno
https://github.com/Camb-ai/MARS5-TTS MARS5 speech model (TTS) from CAMB.AI www.camb.ai