# Backends: MoveNet / Whisper / STGCN — Connection Steps
> [!NOTE]
> For the latest implementation status, please refer to Functional Implementation Status (Remaining Functionality).
This document summarizes how to set up an environment for the real model backends (MoveNet/MediaPipe, Whisper/`faster_whisper`, Torch-STGCN) and use the `*_real` backends of `evospikenet.video_analysis.backends`.
## 1) Required packages (Python 3.9+ recommended)
The base dependencies are in `requirements.txt`; to enable the real backends, additionally install the following:

- MoveNet / MediaPipe (pose):

  ```shell
  python3 -m pip install mediapipe
  ```

- Whisper (`faster-whisper` recommended):

  ```shell
  python3 -m pip install faster-whisper
  ```

- Torch + STGCN model (CPU or CUDA environment recommended):

  ```shell
  python3 -m pip install torch
  # Prepare the STGCN model in TorchScript format and point the
  # VIDEO_ANALYSIS_STGCN_MODEL environment variable at its path.
  export VIDEO_ANALYSIS_STGCN_MODEL=/path/to/stgcn_model.pt
  ```
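As a sketch of how a backend could pick up this variable, the snippet below loads a TorchScript model from `VIDEO_ANALYSIS_STGCN_MODEL`. The loading logic is an assumption for illustration, not the repository's actual code; it only attempts the load when `torch` is installed and the variable is set.

```python
import importlib.util
import os

# Hedged sketch (not the repository's actual code): consume the
# VIDEO_ANALYSIS_STGCN_MODEL environment variable set above.
# torch.jit.load is only attempted when torch is installed and
# the variable points somewhere.
model = None
model_path = os.environ.get("VIDEO_ANALYSIS_STGCN_MODEL")
if model_path and importlib.util.find_spec("torch") is not None:
    import torch

    model = torch.jit.load(model_path, map_location="cpu")
    model.eval()  # switch to inference mode
```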
## 2) Environment variables / settings
- `VIDEO_ANALYSIS_WHISPER_MODEL`: Whisper model name (e.g. `tiny`)
- `VIDEO_ANALYSIS_WHISPER_DEVICE`: `cpu` or `cuda`
- `VIDEO_ANALYSIS_STGCN_MODEL`: path to the TorchScript model
Alternatively, add these settings to `Docs/video_analysis_config.yaml` or `settings.*.yaml`.
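A minimal sketch of resolving these variables in Python; the defaults (`tiny`, `cpu`) are illustrative assumptions, not values taken from the repository:

```python
import os

# Hedged sketch: read the three settings documented above, falling
# back to assumed defaults for the Whisper model and device.
whisper_model = os.environ.get("VIDEO_ANALYSIS_WHISPER_MODEL", "tiny")
whisper_device = os.environ.get("VIDEO_ANALYSIS_WHISPER_DEVICE", "cpu")
stgcn_model_path = os.environ.get("VIDEO_ANALYSIS_STGCN_MODEL")  # may be None
```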
## 3) Operation check (smoke test)
You can list the available backends by running the following from the repository root:

```shell
python3 tools/smoke_backends.py
```
The output reports an availability flag for each backend category: `pose` / `action` / `asr`.
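Conceptually, such an availability report boils down to checking whether each optional dependency can be imported. The sketch below is a hypothetical reconstruction of that idea; the actual `tools/smoke_backends.py` may work differently:

```python
import importlib.util
import os

def dependency_available(module_name: str) -> bool:
    """True if an optional dependency can be imported."""
    return importlib.util.find_spec(module_name) is not None

# Hypothetical availability report mirroring the pose / action / asr
# categories (the real smoke script may differ).
report = {
    "pose": dependency_available("mediapipe"),
    "asr": dependency_available("faster_whisper"),
    "action": dependency_available("torch")
    and os.environ.get("VIDEO_ANALYSIS_STGCN_MODEL") is not None,
}
print(report)
```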
## 4) Discovery in CI
Since a real GPU cannot always be provisioned in CI, it is recommended that `tools/smoke_backends.py` only report whether each real backend is available, without making availability a hard requirement.
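One standard way to implement this policy in a test suite is to skip, rather than fail, tests whose optional dependency is missing. The test name and body below are hypothetical illustrations:

```python
import importlib.util
import unittest

# Hedged sketch of the CI policy above: skip real-backend tests when
# the optional dependency is absent, instead of failing the build.
class RealBackendSmoke(unittest.TestCase):
    @unittest.skipUnless(
        importlib.util.find_spec("mediapipe") is not None,
        "mediapipe not installed; real pose backend unavailable",
    )
    def test_pose_real_backend_importable(self):
        import mediapipe  # only executed when the skip condition passes

        self.assertTrue(hasattr(mediapipe, "__version__"))
```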