LinTO-platform-stt is the transcription service within the LinTO platform stack (https://github.com/linto-ai/linto-platform-stack).
LinTO-platform-stt can either be used as a standalone transcription service or deployed within a micro-services infrastructure using a message broker connector.
To run the transcription models you'll need:
LinTO-Platform-STT accepts two kinds of models: LinTO acoustic and language models (AM/LM) and Vosk models.
We provide home-cured LinTO models (v2) on dl.linto.ai. You can also use the Vosk models available here.
The transcription service requires Docker to be up and running.
In task mode, the only entry point of the STT service is tasks posted on a message broker. Supported message brokers are RabbitMQ, Redis, and Amazon SQS. In addition, to prevent large audio files from transiting through the message broker, the STT-Worker uses a shared storage folder (SHARED_FOLDER).
1- First step is to build or pull the image:
```bash
git clone https://github.com/linto-ai/linto-platform-stt.git
cd linto-platform-stt
docker build . -t linto-platform-stt:latest
```
or
```bash
docker pull lintoai/linto-platform-stt
```
2- Download the models
Have the acoustic and language models ready at AM_PATH and LM_PATH if you are using LinTO models. If you are using a Vosk model, have it ready at MODEL_PATH.
3- Fill the .env
```bash
cp .envdefault .env
```
| PARAMETER | DESCRIPTION | EXAMPLE |
|---|---|---|
| SERVICE_MODE | STT serving mode (see Serving Modes) | http \| task \| websocket |
| MODEL_TYPE | Type of STT model used | lin \| vosk |
| ENABLE_STREAMING | (http serving mode) enable the /streaming websocket route | true \| false |
| SERVICE_NAME | (task serving mode) name of the queue for task processing | my-stt |
| SERVICE_BROKER | (task serving mode) URL of the message broker | redis://my-broker:6379 |
| BROKER_PASS | (task serving mode) broker password | my-password |
| STREAMING_PORT | (websocket serving mode) listening port for incoming WS connections | 80 |
| CONCURRENCY | Maximum number of parallel requests | >1 |
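A filled .env for the HTTP serving mode might look like the following minimal sketch (the values are illustrative, not recommendations):

```bash
SERVICE_MODE=http
MODEL_TYPE=lin
ENABLE_STREAMING=true
CONCURRENCY=2
```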
Serving Modes
STT can be used in three ways:
The mode is specified using the SERVICE_MODE value in the .env file or as an environment variable.
```bash
SERVICE_MODE=http
```
The HTTP serving mode deploys an HTTP server and a swagger-ui to allow transcription requests on a dedicated route.
The SERVICE_MODE value in the .env should be set to http.
```bash
docker run --rm \
  -p HOST_SERVING_PORT:80 \
  -v AM_PATH:/opt/AM \
  -v LM_PATH:/opt/LM \
  --env-file .env \
  linto-platform-stt:latest
```
This will run a container providing an HTTP API bound to the HOST_SERVING_PORT port on the host.
Parameters:
| Variables | Description | Example |
|---|---|---|
| HOST_SERVING_PORT | Host serving port | 80 |
| AM_PATH | Path to the acoustic model on the host machine mounted to /opt/AM | /my/path/to/models/AM_fr-FR_v2.2.0 |
| LM_PATH | Path to the language model on the host machine mounted to /opt/LM | /my/path/to/models/fr-FR_big-v2.2.0 |
| MODEL_PATH | Path to the model (using MODEL_TYPE=vosk) mounted to /opt/model | /my/path/to/models/vosk-model |
The task serving mode connects a Celery worker to a message broker.
The SERVICE_MODE value in the .env should be set to task.
LinTO-platform-stt can be deployed within the linto-platform-stack through the use of linto-platform-services-manager. Used this way, the container spawns a Celery worker waiting for transcription tasks on a message broker. LinTO-platform-stt in task mode is not intended to be launched manually. However, if you intend to connect it to your own message broker, here are the parameters:
You need a message broker up and running at MY_SERVICE_BROKER.
```bash
docker run --rm \
  -v AM_PATH:/opt/AM \
  -v LM_PATH:/opt/LM \
  -v SHARED_AUDIO_FOLDER:/opt/audio \
  --env-file .env \
  linto-platform-stt:latest
```
Parameters:
| Variables | Description | Example |
|---|---|---|
| AM_PATH | Path to the acoustic model on the host machine mounted to /opt/AM | /my/path/to/models/AM_fr-FR_v2.2.0 |
| LM_PATH | Path to the language model on the host machine mounted to /opt/LM | /my/path/to/models/fr-FR_big-v2.2.0 |
| MODEL_PATH | Path to the model (using MODEL_TYPE=vosk) mounted to /opt/model | /my/path/to/models/vosk-model |
| SHARED_AUDIO_FOLDER | Shared audio folder on the host machine mounted to /opt/audio | /my/path/to/shared/audio |
The websocket serving mode deploys a streaming transcription service only.
The SERVICE_MODE value in the .env should be set to websocket.
Usage is the same as the HTTP /streaming route.
/healthcheck
Returns the state of the API
Method: GET
Returns "1" if healthcheck passes.
/transcribe
Transcription API
Returns the transcribed text as "text/plain", or, when using "application/json", a JSON object structured as follows:
```json
{
  "text": "This is the transcription",
  "words": [
    {"word": "This", "start": 0.123, "end": 0.453, "conf": 0.9},
    ...
  ],
  "confidence-score": 0.879
}
```
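As a sketch, a client receiving the JSON body above could read it like this (the sample payload is hard-coded for illustration):

```python
import json

# Sample "application/json" response body from /transcribe (illustrative values).
response_body = """
{
  "text": "This is the transcription",
  "words": [
    {"word": "This", "start": 0.123, "end": 0.453, "conf": 0.9}
  ],
  "confidence-score": 0.879
}
"""

result = json.loads(response_body)
print(result["text"])  # full transcription
for w in result["words"]:  # per-word timing and confidence
    print(f'{w["word"]}: {w["start"]:.3f}-{w["end"]:.3f}s (conf {w["conf"]})')
print(result["confidence-score"])  # overall confidence score
```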
/streaming
The /streaming route is accessible if the ENABLE_STREAMING environment variable is set to true.
The route accepts websocket connections. Exchanges are structured as follows:
The connection will be closed and the worker freed if no chunks are received for 10 seconds.
/docs
The /docs route offers an OpenAPI/Swagger interface.
STT-Worker accepts requests with the following arguments:
file_path: str, with_metadata: bool
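As a sketch, the two documented arguments could be assembled and serialized like this (the file name is hypothetical, and the actual wire format is determined by the Celery/broker configuration):

```python
import json

# The two documented STT-Worker task arguments: the audio file's location on the
# shared storage folder (mounted at /opt/audio) and whether to return word-level
# metadata. The file name below is purely illustrative.
task_args = {
    "file_path": "/opt/audio/my-recording.wav",
    "with_metadata": True,
}

# Serialize the arguments, e.g. before handing them to a task client.
payload = json.dumps(task_args)
print(payload)
```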
Return format
On a successful transcription, the returned object is a JSON object structured as follows:
```json
{
  "text": "this is the transcription as text",
  "words": [
    {"word": "this", "start": 0.0, "end": 0.124, "conf": 1.0},
    ...
  ],
  "confidence-score": ""
}
```
You can test your HTTP API using curl:
```bash
curl -X POST "http://YOUR_SERVICE:YOUR_PORT/transcribe" \
  -H "accept: application/json" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@YOUR_FILE;type=audio/x-wav"
```
This project is developed under the AGPLv3 License (see LICENSE).