Projects with this topic
🔧 🔗 https://github.com/markovka17/dla Deep learning for audio processing
🔧 🔗 https://github.com/m-bain/whisperX WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
huggingface.co/transformers https://github.com/huggingface/transformers 🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
🔧 🔗 https://github.com/SYSTRAN/faster-whisper Faster Whisper transcription with CTranslate2
🔧 🔗 https://github.com/modelscope/FunASR A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing
https://github.com/Picovoice/cobra On-device voice activity detection (VAD) powered by deep learning
https://github.com/Picovoice/cheetah On-device streaming speech-to-text engine powered by deep learning
🔧 🔗 https://github.com/alphacep/vosk-api Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
🔧 🔗 https://github.com/openai/whisper Robust Speech Recognition via Large-Scale Weak Supervision
interspeech2019-tutorial
🔧 🔗 https://github.com/espnet/interspeech2019-tutorial INTERSPEECH 2019 Tutorial Materials
UCLA Phonetic Corpus
🔧 🔗 https://github.com/xinjli/ucla-phonetic-corpus Dataset of ICASSP 2021 MULTILINGUAL PHONETIC DATASET FOR LOW RESOURCE SPEECH RECOGNITION
Allosaurus
🔧 🔗 https://github.com/xinjli/allosaurus Allosaurus is a pretrained universal phone recognizer for more than 2000 languages
https://github.com/Picovoice/falcon On-device speaker diarization powered by deep learning
🔧 🔗 https://github.com/alphacep/vosk-server WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries
🔧 🔗 https://github.com/modelscope/FunClip Open-source, accurate and easy-to-use video speech recognition & clipping tool, LLM based AI clipping integrated.
https://github.com/homebrewltd/AudioBench AudioBench: A Universal Benchmark for Audio Large Language Models
🔗 https://arxiv.org/abs/2406.16020
🔧 🔗 https://github.com/markovka17/digit-recognition A small model for recognition of digits in audio clips