Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
VITS-based Voice Conversion focused on simplicity, quality and performance.
🚀 Framework for seamless fine-tuning of the Whisper model on multilingual datasets and deployment to production.
Drift-Lens: an Unsupervised Drift Detection Framework for Deep Learning Classifiers on Unstructured Data
Pictalk is an open-source application designed to help individuals with speech impediments communicate effectively using pictograms and pictures.
Optimized local inference for LLMs with HuggingFace-like APIs for quantization, vision/language models, multimodal agents, speech, vector DB, and RAG.
MATLAB implementation of the Speech Transmission Index for Public Address (STIPA) method for evaluating speech transmission quality.
Tools for handling speech data in machine learning projects.
An open-source text-to-speech (TTS) voice building tool
🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
Multilingual and Controllable Text-to-Speech Toolkit of the Speech and Language Technologies Group at the University of Stuttgart.
MARS5 speech model (TTS) from CAMB.AI
ModelScope: bring the notion of Model-as-a-Service to life.
An easy-to-use React.js component that leverages the Web Speech API to convert text to speech.
StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.
Data manipulation and transformation for audio signal processing, powered by PyTorch
A collection of datasets for the purpose of emotion recognition/detection in speech.