Projects with this topic
https://github.com/Picovoice/falcon On-device speaker diarization powered by deep learning
https://github.com/alphacep/vosk-server WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries
https://github.com/modelscope/FunClip Open-source, accurate and easy-to-use video speech recognition and clipping tool, with LLM-based AI clipping integrated.
https://github.com/alphacep/vosk-unity-asr Automatic Speech Recognition in Unity using the Vosk library
https://github.com/speechbrain/speechbrain.github.io The SpeechBrain project aims to build a novel speech toolkit fully based on PyTorch. With SpeechBrain, users can easily create speech processing systems, ranging from speech recognition (both HMM/DNN and end-to-end) to speaker recognition, speech enhancement, speech separation, multi-microphone speech processing, and many others.
https://github.com/homebrewltd/AudioBench AudioBench: A Universal Benchmark for Audio Large Language Models
https://arxiv.org/abs/2406.16020
https://github.com/markovka17/digit-recognition A small model for recognition of digits in audio clips
https://github.com/Cinnamon/whisper-jargon [SIGDIAL'24] Improving Speech Recognition with Jargon Injection
https://github.com/YuanGongND/ltu Code, Dataset, and Pretrained Models for Audio and Speech Large Language Model "Listen, Think, and Understand".
https://github.com/coqui-ai/STT 🐸 STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.
https://github.com/coqui-ai/stt-model-manager Coqui STT Model Manager - install, manage and try out Coqui STT models from the Model Zoo
https://github.com/coqui-ai/open-speech-corpora 💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
https://github.com/neonbjb/ocotillo Performant and accurate speech recognition built on PyTorch
https://github.com/Picovoice/leopard On-device speech-to-text engine powered by deep learning