Service Setup

The service requires the following:

  • zookeeper and kafka

  • postgresql

  • minio for storage and file access

  • the zeroth master service

  • the tts-api service

  • the TTS worker service

  • the TTS encoder service

All of the above are available as Docker images, either publicly (zookeeper, kafka, minio) or from Atlas Labs.

Models are also needed. They are provided by Atlas Labs and must be extracted on the host machine.

The following assumes a working setup of master, minio, postgresql, and kafka + zookeeper (already covered in the GitBook manual) and shows how to add the TTS-specific components. The services are launched with docker-compose.

Encoder Compose Sample

This docker-compose snippet launches one encoder instance on GPU 0. Check for the latest image version before deploying.

version: '2.3'
services:
  encoder0:
    image: 161969600347.dkr.ecr.ap-northeast-2.amazonaws.com/zeroth/tts/encoder:0.2.2
    runtime: nvidia
    restart: always
    environment:
      CUDA_LAUNCH_BLOCKING: 0
      OMP_NUM_THREAD: 4
      NVIDIA_VISIBLE_DEVICES: 0
      ENCODER_NUM_WORKERS: 100
      ENCODER_SOCKET_PORT: 7878
      ENCODER_USE_MODEL: 16khz
      ENCODER_RESAMPLER: none
    volumes:
      - <local/path/to/extracted/models>:/workspace/server/model
      - <local/path/to/logs>/tts-encoder:/workspace/server/logs/encoder

Take note of the following settings:

  • version is set to 2.3 because the runtime key requires compose file format 2.3 or later (used here with nvidia-docker2)

  • runtime is set to nvidia to allow GPU inference

  • NVIDIA_VISIBLE_DEVICES specifies which GPU on the local machine to use

  • ENCODER_NUM_WORKERS must be equal to or greater than the worker's NUMWORKERS

  • ENCODER_SOCKET_PORT must match the worker's WORKER_SOCKET_PORT

  • ENCODER_USE_MODEL names the model set to use. This is a subdirectory within the mounted model directory

  • ENCODER_RESAMPLER determines which resampling the encoder performs. With a gstreamer-enabled worker, set it to none

  • volumes: mount the model base directory to the container's /workspace/server/model directory. This directory must contain a subdirectory whose name matches the ENCODER_USE_MODEL value

  • volumes: mount a local directory to the container's /workspace/server/logs/encoder directory to access the encoder logs locally

see the "Container Configuration" section below for a full list of configuration environment variables.

Worker Compose Sample

version: '2.3'
services:
  encoder0:
    <see above>

  worker0:
    image: 161969600347.dkr.ecr.ap-northeast-2.amazonaws.com/zeroth/tts/worker:0.2.0
    restart: always
    environment:
      WORKER_MASTER_ENDPOINT: "ws://<master0_addr>:<port>/ws/worker/tts?model=<model_name>,ws://<master1_addr>:<port>/ws/worker/tts?model=<model_name>"
      WORKER_SOCKET_ADDR: encoder0
      WORKER_SOCKET_PORT: 7878
      WORKER_SILENCE_TIMEOUT: 60
      NUMWORKERS: 100
    volumes:
      - <local/path/to/logs>/tts-worker:/home/root/zeroth-tts-worker/log/tts
    depends_on:
      - encoder0

Take note of the following settings:

  • WORKER_MASTER_ENDPOINT should contain a comma-separated list of master endpoints

  • WORKER_SOCKET_ADDR should be the service name of the corresponding encoder (resolved through docker networking)

  • WORKER_SOCKET_PORT should be the port of the corresponding encoder (see the encoder's ENCODER_SOCKET_PORT)

  • NUMWORKERS must be less than or equal to the encoder's ENCODER_NUM_WORKERS

  • volumes: mount a local directory to the container's /home/root/zeroth-tts-worker/log/tts directory to access the worker log files locally

see the "Container Configuration" section below for a full list of configuration environment variables.

Multi-GPU Setups

On multi-GPU machines, it is recommended to run one encoder container and one worker container per GPU.

For each encoder container:

  • set the service name to a unique name (e.g. encoder0, encoder1, encoder2, ...)

  • set NVIDIA_VISIBLE_DEVICES to a unique GPU index (e.g. 0, 1, 2, ...)

  • set ENCODER_SOCKET_PORT to a unique port (e.g. 7878, 7879, 7880, ...)

For each worker container:

  • set the service name to a unique name (e.g. worker0, worker1, worker2, ...)

  • set WORKER_SOCKET_ADDR to the service name of the corresponding encoder (e.g. encoder0, encoder1, encoder2, ...)

  • set WORKER_SOCKET_PORT to the port of the corresponding encoder (e.g. 7878, 7879, 7880, ...)

  • set depends_on to the corresponding encoder name (e.g. encoder0, encoder1, encoder2, ...)
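
The per-GPU settings above can be sketched as a two-GPU compose file. This is a minimal illustration rather than a tested deployment: image tags and environment values are copied from the single-GPU samples, only one master endpoint is shown, and the per-instance log directories (tts-encoder0, tts-worker0, ...) are an assumption made here to keep each container's logs separate.

```yaml
version: '2.3'
services:
  encoder0:
    image: 161969600347.dkr.ecr.ap-northeast-2.amazonaws.com/zeroth/tts/encoder:0.2.2
    runtime: nvidia
    restart: always
    environment:
      CUDA_LAUNCH_BLOCKING: 0
      OMP_NUM_THREAD: 4
      NVIDIA_VISIBLE_DEVICES: 0        # first GPU
      ENCODER_NUM_WORKERS: 100
      ENCODER_SOCKET_PORT: 7878        # unique port for this encoder
      ENCODER_USE_MODEL: 16khz
      ENCODER_RESAMPLER: none
    volumes:
      - <local/path/to/extracted/models>:/workspace/server/model
      # per-instance log directory name is an assumption
      - <local/path/to/logs>/tts-encoder0:/workspace/server/logs/encoder

  encoder1:
    image: 161969600347.dkr.ecr.ap-northeast-2.amazonaws.com/zeroth/tts/encoder:0.2.2
    runtime: nvidia
    restart: always
    environment:
      CUDA_LAUNCH_BLOCKING: 0
      OMP_NUM_THREAD: 4
      NVIDIA_VISIBLE_DEVICES: 1        # second GPU
      ENCODER_NUM_WORKERS: 100
      ENCODER_SOCKET_PORT: 7879        # unique port for this encoder
      ENCODER_USE_MODEL: 16khz
      ENCODER_RESAMPLER: none
    volumes:
      - <local/path/to/extracted/models>:/workspace/server/model
      - <local/path/to/logs>/tts-encoder1:/workspace/server/logs/encoder

  worker0:
    image: 161969600347.dkr.ecr.ap-northeast-2.amazonaws.com/zeroth/tts/worker:0.2.0
    restart: always
    environment:
      WORKER_MASTER_ENDPOINT: "ws://<master0_addr>:<port>/ws/worker/tts?model=<model_name>"
      WORKER_SOCKET_ADDR: encoder0     # service name of the paired encoder
      WORKER_SOCKET_PORT: 7878         # must match encoder0's ENCODER_SOCKET_PORT
      WORKER_SILENCE_TIMEOUT: 60
      NUMWORKERS: 100                  # must not exceed ENCODER_NUM_WORKERS
    volumes:
      - <local/path/to/logs>/tts-worker0:/home/root/zeroth-tts-worker/log/tts
    depends_on:
      - encoder0

  worker1:
    image: 161969600347.dkr.ecr.ap-northeast-2.amazonaws.com/zeroth/tts/worker:0.2.0
    restart: always
    environment:
      WORKER_MASTER_ENDPOINT: "ws://<master0_addr>:<port>/ws/worker/tts?model=<model_name>"
      WORKER_SOCKET_ADDR: encoder1     # service name of the paired encoder
      WORKER_SOCKET_PORT: 7879         # must match encoder1's ENCODER_SOCKET_PORT
      WORKER_SILENCE_TIMEOUT: 60
      NUMWORKERS: 100
    volumes:
      - <local/path/to/logs>/tts-worker1:/home/root/zeroth-tts-worker/log/tts
    depends_on:
      - encoder1
```

Each worker is paired with exactly one encoder through WORKER_SOCKET_ADDR and WORKER_SOCKET_PORT, so adding a third GPU means adding one more encoder/worker pair with the next index and port.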

TTS API Compose Sample

The output format is specified by:

  • AUDIO_ENCODING: the sox command used to convert the PCM stream. In the sample below, the sample rate is 16000 Hz and the output encoding is mp3

  • AUDIO_EXTENSION: the file extension of the saved file

version: '3'
services:
  tts-api:
    image: 161969600347.dkr.ecr.ap-northeast-2.amazonaws.com/zeroth/tts-api:1.0.1
    restart: always
    ports:
      - "8083:8080"
    environment:
      TZ: Asia/Seoul
      SERVER_PORT: 8080
      SPRING_DATASOURCE_URL: jdbc:postgresql://<addr_0>:<port_0>,<addr_1>:<port_1>/zeroth
      SPRING_DATASOURCE_USERNAME: <user>
      SPRING_DATASOURCE_PASSWORD: <password>
      SPRING_KAFKA_CONSUMER_BOOTSTRAP-SERVERS: <kafka_addr>:9092
      KEYCLOAK_AUTH-SERVER-URL: http://<keycloak_addr_0>:<keycloak_port_0>/auth
      STORAGE_URL: http://<minio_addr_0>:<minio_port_0>
      STORAGE_ACCESS-KEY: <minio_user>
      STORAGE_SECRET-KEY: <minio_password>
      STORAGE_BASE-URL: http://<minio_base_addr_0>:<minio_base_port_0>
      STORAGE_BUCKET: tts-audio
      STORAGE_TMP-DIR: /tmp/
      LOGGING_FILE_NAME: /logs/tts-api.log
      STORAGE_WEB-AUDIO-URL: http://<minio_base_addr_0>:<minio_base_port_0>/zeroth-ee-audio/
      AUDIO_DURATION: /usr/bin/soxi -D %s
      AUDIO_ENCODING: /usr/bin/sox -t s16 -b 16 -c 1 -r 16000 %s -t mp3 -r 16000 %s
      AUDIO_EXTENSION: .mp3
      ZEROTH_ENDPOINT: ws://master:3179/
      ZEROTH_REQUIRE-AUTHENTICATION: "true"
networks:
  default:
    external:
      name: master_default
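
As noted above, AUDIO_ENCODING and AUDIO_EXTENSION together determine the output format. The fragment below is a hypothetical variant that writes WAV instead of mp3 by changing only the sox output type and the extension. It has not been verified against the tts-api image, and whether the service and its clients accept formats other than mp3 is an assumption.

```yaml
services:
  tts-api:
    environment:
      # assumption: same raw 16-bit mono 16000 Hz input, WAV output at 16000 Hz
      AUDIO_ENCODING: /usr/bin/sox -t s16 -b 16 -c 1 -r 16000 %s -t wav -r 16000 %s
      # keep the extension consistent with the sox output type
      AUDIO_EXTENSION: .wav
```

All other tts-api settings stay as in the sample above.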
