Backends: MoveNet / Whisper / STGCN — Connection steps

> [!NOTE]
> For the latest implementation status, please refer to Functional Implementation Status (Remaining Functionality).

This document summarizes the steps to set up an environment for the real model backends (MoveNet/MediaPipe, Whisper/faster_whisper, Torch-STGCN) and to use the *_real backends in evospikenet.video_analysis.backends.

1) Additional dependencies

The basic dependencies are listed in requirements.txt. To enable the real backends, additionally install the following:

  • MoveNet / MediaPipe (Pose)
    python3 -m pip install mediapipe
  • Whisper (faster_whisper recommended)
    python3 -m pip install faster-whisper
  • Torch + STGCN model (CPU or CUDA environment recommended)
    python3 -m pip install torch
    # Prepare the STGCN model in TorchScript format and point the
    # VIDEO_ANALYSIS_STGCN_MODEL environment variable at it:
    export VIDEO_ANALYSIS_STGCN_MODEL=/path/to/stgcn_model.pt
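After installing, you can check which optional packages are actually importable. The sketch below is a minimal, hypothetical probe (the function name `probe_optional_backends` and the role-to-package mapping are illustrative, not part of the evospikenet API); it uses `importlib.util.find_spec`, which checks for a package without executing its module code.

```python
# Hedged sketch: probe which optional dependencies are installed.
# find_spec() locates a package without importing/executing it.
from importlib.util import find_spec

def probe_optional_backends():
    """Return a dict mapping each backend role to an availability flag."""
    packages = {
        "pose": "mediapipe",      # MoveNet / MediaPipe pose backend
        "asr": "faster_whisper",  # Whisper ASR backend
        "action": "torch",        # Torch runtime for the STGCN action model
    }
    return {role: find_spec(mod) is not None for role, mod in packages.items()}

print(probe_optional_backends())
```

A missing package simply yields `False` for its role, so this probe is safe to run in any environment.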

2) Environment variables/settings

  • VIDEO_ANALYSIS_WHISPER_MODEL : Whisper model name (e.g. tiny)
  • VIDEO_ANALYSIS_WHISPER_DEVICE : cpu or cuda
  • VIDEO_ANALYSIS_STGCN_MODEL : path to the TorchScript STGCN model

Alternatively, add these settings to Docs/video_analysis_config.yaml or a settings.*.yaml file.
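How these variables are resolved can be sketched as follows. This is a minimal illustration, not evospikenet's actual settings loader; the function name `resolve_backend_settings` and the default values (`tiny`, `cpu`) are assumptions based on the examples above.

```python
# Hedged sketch: resolve the backend environment variables with fallbacks.
# The real loader (and YAML override behavior) may differ.
import os
from pathlib import Path

def resolve_backend_settings():
    """Read the three backend settings from the environment."""
    model_path = os.environ.get("VIDEO_ANALYSIS_STGCN_MODEL")  # may be unset
    return {
        "whisper_model": os.environ.get("VIDEO_ANALYSIS_WHISPER_MODEL", "tiny"),
        "whisper_device": os.environ.get("VIDEO_ANALYSIS_WHISPER_DEVICE", "cpu"),
        # The STGCN backend has no sensible default: without a model path
        # it should simply be reported as unavailable.
        "stgcn_model": Path(model_path) if model_path else None,
    }
```

Note the asymmetry: Whisper settings have usable defaults, while the STGCN TorchScript model must be provided explicitly.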

3) Operation confirmation (smoke)

You can list the available backends by running the following from the repository root:

python3 tools/smoke_backends.py

The output reports an available flag for each backend role (pose / action / asr).

4) Discovery in CI

Because a real GPU cannot always be provisioned in CI, it is recommended that tools/smoke_backends.py only report whether each real backend is available, rather than treating availability as a hard requirement.
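This non-blocking policy can be sketched as a CI step that prints per-role availability and always exits 0. The function name `report_backends` and the output format are assumptions for illustration; the real tools/smoke_backends.py may format its report differently.

```python
# Hedged sketch: report backend availability without failing the CI job.
import sys

def report_backends(flags):
    """Print an availability line per backend role and return exit code 0,
    so CI runners without a GPU or the optional packages never fail here."""
    for role, ok in sorted(flags.items()):
        print(f"{role}: {'available' if ok else 'unavailable'}")
    return 0  # availability is informational, never a hard requirement

if __name__ == "__main__":
    # Example with all real backends missing, as on a bare CI runner.
    sys.exit(report_backends({"pose": False, "action": False, "asr": False}))
```

Returning 0 unconditionally keeps the pipeline green while the printed log still shows which real backends the environment supports.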