====== Speech to text ======

STT, speech recognition

https://github.com/SergeyShk/Speech-to-Text-Russian

Models: https://alphacephei.com/vosk/models; the archive contains the file ''graph/HCLG.fst''.

===== Kaldi =====

The foundation of many STT projects\\
https://kaldi-asr.org/\\
https://github.com/kaldi-asr/kaldi\\
https://hub.docker.com/r/kaldiasr/kaldi\\
Other options: DeepSpeech, Wav2letter, SpeechBrain, [[https://github.com/coqui-ai/STT|Coqui STT]], Vosk.

===== Vosk =====

https://alphacephei.com/vosk/index

  usage: vosk-transcriber.exe [-h] [--model MODEL] [--list-models]
                              [--list-languages] [--model-name MODEL_NAME]
                              [--lang LANG] [--input INPUT] [--output OUTPUT]
                              [--output-type OUTPUT_TYPE]
                              [--log-level LOG_LEVEL]

  Transcribe audio file and save result in selected format

  optional arguments:
    -h, --help            show this help message and exit
    --model MODEL, -m MODEL
                          model path
    --list-models         list available models
    --list-languages      list available languages
    --model-name MODEL_NAME, -n MODEL_NAME
                          select model by name
    --lang LANG, -l LANG  select model by language
    --input INPUT, -i INPUT
                          audiofile
    --output OUTPUT, -o OUTPUT
                          optional output filename path
    --output-type OUTPUT_TYPE, -t OUTPUT_TYPE
                          optional arg output data type
    --log-level LOG_LEVEL
                          logging level

===== ffmpeg asr filter =====

This filter uses PocketSphinx for speech recognition. To enable compilation of this filter, you need to configure FFmpeg with ''%%--enable-pocketsphinx%%''.

https://ffmpeg.org/ffmpeg-all.html#asr
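
===== Vosk: calling vosk-transcriber from Python =====

A minimal sketch of how the ''vosk-transcriber'' options from the Vosk section could be assembled into a command line from a script. It uses only flags that appear in the help text. The function names ''build_transcriber_cmd'' and ''transcribe'', the file names ''speech.wav'' and ''speech.txt'', and the values ''ru'' and ''txt'' are assumptions made for illustration; the help text does not list the accepted values for ''%%--lang%%'' or ''%%--output-type%%'', so check them with ''%%--list-languages%%'' and the tool's own documentation.

```python
import shutil
import subprocess
from typing import List, Optional


def build_transcriber_cmd(
    input_path: str,
    output_path: Optional[str] = None,
    lang: Optional[str] = None,
    model_path: Optional[str] = None,
    output_type: Optional[str] = None,
    executable: str = "vosk-transcriber",  # "vosk-transcriber.exe" on Windows
) -> List[str]:
    """Assemble a vosk-transcriber command using only flags from the help text."""
    cmd = [executable, "--input", input_path]
    if model_path is not None:
        cmd += ["--model", model_path]          # path to an unpacked model directory
    if lang is not None:
        cmd += ["--lang", lang]                 # select a model by language
    if output_path is not None:
        cmd += ["--output", output_path]        # optional output file
    if output_type is not None:
        cmd += ["--output-type", output_type]   # accepted values are an assumption
    return cmd


def transcribe(cmd: List[str]) -> int:
    """Run the command if the executable is on PATH and return its exit code."""
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(f"{cmd[0]} not found on PATH")
    return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__":
    # Hypothetical file names; "ru" and "txt" are assumed values.
    print(build_transcriber_cmd("speech.wav", "speech.txt", lang="ru", output_type="txt"))
```

Building the command as a list (rather than a single shell string) avoids quoting problems with paths that contain spaces. ''transcribe'' is kept separate so the command can be inspected or logged before anything is executed.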