
sstcaiteam/open-webui

We're excited to announce a new enhancement to Open WebUI's Speech-to-Text (STT) capabilities: Speaker Diarization (who spoke when), powered by pyannote/speaker-diarization. This feature identifies and differentiates individual speakers within your audio files, producing clearer and better-organized transcriptions, especially for meetings, interviews, or multi-person discussions.
Using this new feature is straightforward:
Run Open WebUI with Docker:
docker run -d -p 3000:8080 --gpus all -v open-webui:/app/backend/data --name open-webui-diar sstcaiteam/open-webui:v0.6.34-diar-cuda
Enable the Feature:
Upload Your Audio:
Upload audio files (such as .mp3 and .wav) directly into the Open WebUI interface.
View Your Enhanced Transcript:
Example Output:
00:00:00-00:00:10 SPEAKER_00: Hello everyone, thanks for joining today's meeting.
00:00:13-00:00:16 SPEAKER_02: Hi! Happy to be here.
00:00:18-00:00:24 SPEAKER_00: We'll be discussing the Q3 marketing strategy.
00:00:26-00:00:31 SPEAKER_03: I've prepared some initial thoughts on that.
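If you want to post-process a transcript like the one above, the following is a minimal Python sketch. It assumes each segment has the form `HH:MM:SS-HH:MM:SS SPEAKER_NN: text`, as in the example output. The names `Segment`, `parse_transcript`, and `talk_time` are hypothetical helpers written for this illustration and are not part of Open WebUI.

```python
import re
from dataclasses import dataclass

# Assumed segment format, taken from the example output above:
# "HH:MM:SS-HH:MM:SS SPEAKER_NN: text"
SEGMENT_RE = re.compile(
    r"(\d{2}:\d{2}:\d{2})-(\d{2}:\d{2}:\d{2})\s+(SPEAKER_\d+):\s*(.*)"
)


@dataclass
class Segment:
    start: int    # segment start, in seconds
    end: int      # segment end, in seconds
    speaker: str  # diarization label, e.g. "SPEAKER_00"
    text: str     # transcribed speech


def to_seconds(timestamp: str) -> int:
    """Convert an "HH:MM:SS" timestamp to whole seconds."""
    hours, minutes, seconds = (int(part) for part in timestamp.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def parse_transcript(raw: str) -> list:
    """Parse one segment per line; lines that do not match are skipped."""
    segments = []
    for line in raw.splitlines():
        match = SEGMENT_RE.match(line.strip())
        if match is None:
            continue
        start, end, speaker, text = match.groups()
        segments.append(
            Segment(to_seconds(start), to_seconds(end), speaker, text.strip())
        )
    return segments


def talk_time(segments: list) -> dict:
    """Sum the speaking time in seconds for each speaker label."""
    totals = {}
    for seg in segments:
        totals[seg.speaker] = totals.get(seg.speaker, 0) + (seg.end - seg.start)
    return totals
```

Calling `talk_time(parse_transcript(text))` on a saved transcript gives a per-speaker total, which can be handy for a quick summary of who dominated a meeting.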
We believe this feature will significantly enhance your experience with Open WebUI's transcription services.



