# EvoSpikeNet NVIDIA NGC Jupyter Notebook Setup Guide
> [!NOTE]
> For the latest implementation status, please refer to Functional Implementation Status (Remaining Functionality).
This guide shows you how to run EvoSpikeNet using NVIDIA NGC's prebuilt Jupyter Notebook containers.
## Overview

NVIDIA NGC provides optimized containers with CUDA, cuDNN, and deep learning frameworks preinstalled. This setup is based on the NGC PyTorch Jupyter Notebook container and installs EvoSpikeNet on top of it.
## Prerequisites
- NVIDIA GPU with CUDA support
- Docker installed
- NGC account (optional, but recommended for latest images)
- At least 16GB of system RAM recommended
## Quick start
### 1. Clone the repository

```bash
git clone https://github.com/moonlight-tech/EvoSpikeNet.git
cd EvoSpikeNet
```
### 2. Environment settings

Create a `.env` file for NGC-specific settings:

```bash
# NGC settings
NGC_PYTORCH_TAG=26.01-py3
NGC_BASE_IMAGE=nvcr.io/nvidia/pytorch:${NGC_PYTORCH_TAG}

# Jupyter settings
JUPYTER_PORT=8888
JUPYTER_TOKEN=evospikenet-ngc

# EvoSpikeNet settings
EVOSPIKENET_API_KEY=your-api-key-here
BUILD_TARGET=development
ENABLE_GPU=true
```
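As a sanity check, the settings above can be validated before launch. This is a minimal sketch: docker-compose performs its own `.env` parsing and `${VAR}` interpolation, which this hypothetical helper deliberately does not replicate.

```python
# Minimal sketch of validating the .env file shown above.
# REQUIRED_KEYS is an illustrative choice, not an official list.
REQUIRED_KEYS = {"NGC_BASE_IMAGE", "JUPYTER_PORT", "JUPYTER_TOKEN"}

def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()  # ${VAR} stays literal here
    return env

sample = """\
# NGC settings
NGC_PYTORCH_TAG=26.01-py3
NGC_BASE_IMAGE=nvcr.io/nvidia/pytorch:${NGC_PYTORCH_TAG}
JUPYTER_PORT=8888
JUPYTER_TOKEN=evospikenet-ngc
"""

missing = REQUIRED_KEYS - parse_env(sample).keys()
print(sorted(missing))  # -> [] when all required keys are present
```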
### 3. Launch the NGC container

```bash
# Build and run the NGC-based notebook
docker-compose -f docker-compose.ngc.yml up -d ngc-notebook

# Or run the full stack (if you also need the API/DB services)
docker-compose -f docker-compose.ngc.yml --profile full up -d
```
### 4. Access Jupyter Notebook

Open http://localhost:8888 in your browser and log in with the token `evospikenet-ngc` (or the value set in `JUPYTER_TOKEN`).
## Configuration files

### Dockerfile.ngc

This Dockerfile extends the NGC PyTorch container and adds the EvoSpikeNet dependencies.

### docker-compose.ngc.yml

A Docker Compose configuration specialized for NGC containers.
## Available NGC images

NVIDIA NGC offers multiple PyTorch Jupyter containers:

- `nvcr.io/nvidia/pytorch:26.01-py3` - latest stable version
- `nvcr.io/nvidia/pytorch:25.12-py3` - previous version
- `nvcr.io/nvidia/pytorch:24.01-py3` - LTS version
Choose the appropriate tag based on CUDA driver compatibility.
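The compatibility check can be scripted once the installed driver version is known (e.g. from `nvidia-smi --query-gpu=driver_version --format=csv,noheader`). The minimum driver versions below are hypothetical placeholders, not NVIDIA's actual requirements; consult the NGC release notes for the real values.

```python
# HYPOTHETICAL minimum driver versions per NGC tag, for illustration only.
MIN_DRIVER = {
    "26.01-py3": "550.0",
    "25.12-py3": "545.0",
    "24.01-py3": "525.0",
}

def version_tuple(v: str) -> tuple:
    """Turn a dotted version string into a comparable tuple of ints."""
    return tuple(int(part) for part in v.split("."))

def driver_supports(tag: str, installed: str) -> bool:
    """True if the installed driver meets the (assumed) minimum for the tag."""
    return version_tuple(installed) >= version_tuple(MIN_DRIVER[tag])

print(driver_supports("24.01-py3", "550.54"))  # True: 550.54 >= 525.0
```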
## GPU memory requirements
- Minimum: 8GB GPU RAM
- Recommended: 16GB+ GPU RAM
- For large-scale training: 24GB+ GPU RAM
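A rough way to pick a tier is to estimate the training footprint from the parameter count. This back-of-the-envelope sketch counts weights, gradients, and optimizer states only (activations are workload-dependent and excluded), so treat it as a lower bound, not a guarantee.

```python
def training_memory_gb(n_params: float, bytes_per_param: int = 4,
                       optimizer_states: int = 2) -> float:
    """Rough lower bound: weights + gradients + optimizer states.

    Adam keeps ~2 extra fp32 states per parameter; activations and
    framework overhead are NOT included.
    """
    multiplier = 1 + 1 + optimizer_states  # weights + grads + optimizer
    return n_params * bytes_per_param * multiplier / 1024**3

# A hypothetical 500M-parameter model trained in fp32 with Adam:
print(round(training_memory_gb(500e6), 1))  # -> 7.5 (GB, before activations)
```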
## Troubleshooting

### If the container doesn't start

Check GPU availability:

```bash
nvidia-smi
docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
```
### If Jupyter doesn't load
Check the container logs:

```bash
docker-compose -f docker-compose.ngc.yml logs ngc-notebook
```
### Import errors

Ensure all dependencies are installed:

```bash
docker-compose -f docker-compose.ngc.yml exec ngc-notebook pip list | grep torch
```
## Advanced usage
### Custom NGC Image
To use a different NGC image, change `NGC_BASE_IMAGE` in the `.env` file.
### Development mode
For development with live code reload:
```bash
docker-compose -f docker-compose.ngc.yml up -d ngc-dev
```
### Full stack deployment

Running the complete EvoSpikeNet stack on NGC:

```bash
# Start all services
docker-compose -f docker-compose.ngc.yml --profile full up -d

# Services started:
# - ngc-notebook: Jupyter on NGC PyTorch (port 8888)
# - api: FastAPI backend (port 8000)
# - frontend: Web UI (port 8050)
# - postgres: database
# - milvus: vector database
# - elasticsearch: search engine
# - zenoh-router: distributed messaging
```
## Deployment in cloud environments

### Pushing images to the NGC Container Registry

Push your customized EvoSpikeNet NGC image to the NVIDIA NGC Container Registry to make it available in your cloud environment.
#### 1. Build the image

```bash
# Run in the root directory of the EvoSpikeNet repository
docker build -f Dockerfile.ngc -t evospikenet-ngc:latest .

# Or pin a specific NGC PyTorch version
docker build -f Dockerfile.ngc \
  --build-arg NGC_PYTORCH_TAG=26.01-py3 \
  -t evospikenet-ngc:26.01 .
```
#### 2. Log in to the NGC Registry

```bash
# Log in using your NGC API key
docker login nvcr.io
# Username: $oauthtoken
# Password: <your-ngc-api-key>
```

NGC API keys can be generated from the NGC Dashboard.
#### 3. Tag and push the images

```bash
# Tag with your NGC organization and image name
docker tag evospikenet-ngc:latest nvcr.io/<your-org>/evospikenet:latest
docker tag evospikenet-ngc:latest nvcr.io/<your-org>/evospikenet:26.01

# Push to the NGC Registry
docker push nvcr.io/<your-org>/evospikenet:latest
docker push nvcr.io/<your-org>/evospikenet:26.01
```
### Running on AWS

#### Running on an EC2 instance (GPU-enabled)

```bash
# Assumes the NVIDIA Container Toolkit is installed

# Pull and run the NGC image
docker run --gpus all -d \
  -p 8888:8888 \
  -v /home/ubuntu/evospikenet-data:/home/appuser/app/saved_models \
  -e JUPYTER_TOKEN=your-secure-token \
  -e EVOSPIKENET_API_KEY=your-api-key \
  --name evospikenet-ngc \
  nvcr.io/<your-org>/evospikenet:latest \
  jupyter lab --ip=0.0.0.0 --port=8888 --no-browser \
    --ServerApp.token=your-secure-token \
    --ServerApp.allow_origin='*'

# Check the logs
docker logs -f evospikenet-ngc
```
#### Running on Amazon ECS

Example ECS task definition (JSON):

```json
{
  "family": "evospikenet-ngc",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["EC2"],
  "containerDefinitions": [
    {
      "name": "evospikenet",
      "image": "nvcr.io/<your-org>/evospikenet:latest",
      "memory": 16384,
      "cpu": 4096,
      "essential": true,
      "portMappings": [
        {
          "containerPort": 8888,
          "protocol": "tcp"
        }
      ],
      "environment": [
        {"name": "JUPYTER_TOKEN", "value": "your-secure-token"},
        {"name": "EVOSPIKENET_API_KEY", "value": "your-api-key"},
        {"name": "NVIDIA_VISIBLE_DEVICES", "value": "all"}
      ],
      "resourceRequirements": [
        {
          "type": "GPU",
          "value": "1"
        }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/evospikenet-ngc",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "ecs"
        }
      }
    }
  ]
}
```
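Before registering the task definition with `aws ecs register-task-definition`, a quick structural check can catch missing GPU or port entries. `check_task_def` is a hypothetical helper sketched here, not part of the AWS SDK, and it is not an exhaustive schema validation.

```python
import json

def check_task_def(raw: str) -> list:
    """Return a list of problems found in an ECS task definition string."""
    task = json.loads(raw)
    problems = []
    for c in task.get("containerDefinitions", []):
        # GPU workloads need an explicit GPU resource requirement
        if not any(r.get("type") == "GPU"
                   for r in c.get("resourceRequirements", [])):
            problems.append(f"{c['name']}: no GPU resource requirement")
        # Jupyter must be reachable on port 8888
        if not any(p.get("containerPort") == 8888
                   for p in c.get("portMappings", [])):
            problems.append(f"{c['name']}: port 8888 not mapped")
    return problems

sample = json.dumps({
    "family": "evospikenet-ngc",
    "containerDefinitions": [{
        "name": "evospikenet",
        "portMappings": [{"containerPort": 8888, "protocol": "tcp"}],
        "resourceRequirements": [{"type": "GPU", "value": "1"}],
    }],
})
print(check_task_def(sample))  # -> [] (no problems found)
```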
### Running on Azure

#### Running on Azure Container Instances (GPU-enabled)

```bash
# Create a container with the Azure CLI
az container create \
  --resource-group evospikenet-rg \
  --name evospikenet-ngc \
  --image nvcr.io/<your-org>/evospikenet:latest \
  --cpu 4 \
  --memory 16 \
  --gpu-count 1 \
  --gpu-sku V100 \
  --ports 8888 \
  --dns-name-label evospikenet-ngc \
  --environment-variables \
    JUPYTER_TOKEN=your-secure-token \
    EVOSPIKENET_API_KEY=your-api-key \
    NVIDIA_VISIBLE_DEVICES=all \
  --restart-policy OnFailure

# Check container status
az container show \
  --resource-group evospikenet-rg \
  --name evospikenet-ngc \
  --query "{Status:instanceView.state,IP:ipAddress.ip,Ports:ipAddress.ports}" \
  --output table

# Check the logs
az container logs \
  --resource-group evospikenet-rg \
  --name evospikenet-ngc
```
#### Running on Azure Kubernetes Service (AKS)

Deployment manifest (`evospikenet-ngc-deployment.yaml`):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: evospikenet-ngc
spec:
  replicas: 1
  selector:
    matchLabels:
      app: evospikenet-ngc
  template:
    metadata:
      labels:
        app: evospikenet-ngc
    spec:
      containers:
        - name: evospikenet
          image: nvcr.io/<your-org>/evospikenet:latest
          ports:
            - containerPort: 8888
          env:
            - name: JUPYTER_TOKEN
              valueFrom:
                secretKeyRef:
                  name: evospikenet-secrets
                  key: jupyter-token
            - name: EVOSPIKENET_API_KEY
              valueFrom:
                secretKeyRef:
                  name: evospikenet-secrets
                  key: api-key
          resources:
            limits:
              nvidia.com/gpu: 1
              memory: "16Gi"
              cpu: "4"
            requests:
              nvidia.com/gpu: 1
              memory: "8Gi"
              cpu: "2"
          volumeMounts:
            - name: models-storage
              mountPath: /home/appuser/app/saved_models
      volumes:
        - name: models-storage
          persistentVolumeClaim:
            claimName: evospikenet-models-pvc
---
apiVersion: v1
kind: Service
metadata:
  name: evospikenet-ngc-service
spec:
  type: LoadBalancer
  ports:
    - port: 8888
      targetPort: 8888
  selector:
    app: evospikenet-ngc
```
Deployment steps:

```bash
# Create the secret
kubectl create secret generic evospikenet-secrets \
  --from-literal=jupyter-token=your-secure-token \
  --from-literal=api-key=your-api-key

# Deploy
kubectl apply -f evospikenet-ngc-deployment.yaml

# Check status
kubectl get pods -l app=evospikenet-ngc
kubectl get svc evospikenet-ngc-service
```
### Running on Google Cloud Platform (GCP)

#### Running on Compute Engine (GPU-enabled VM)

```bash
# SSH into the GCE instance first; assumes the NVIDIA Container Toolkit is installed

# Run the NGC image
docker run --gpus all -d \
  -p 8888:8888 \
  -v /mnt/disks/evospikenet-data:/home/appuser/app/saved_models \
  -e JUPYTER_TOKEN=your-secure-token \
  -e EVOSPIKENET_API_KEY=your-api-key \
  --name evospikenet-ngc \
  nvcr.io/<your-org>/evospikenet:latest \
  jupyter lab --ip=0.0.0.0 --port=8888 --no-browser \
    --ServerApp.token=your-secure-token

# Set a firewall rule (first time only)
gcloud compute firewall-rules create allow-jupyter \
  --allow tcp:8888 \
  --source-ranges 0.0.0.0/0 \
  --description "Allow Jupyter access"
```
#### Running on Google Kubernetes Engine (GKE)

```bash
# Create a GKE cluster (GPU enabled)
gcloud container clusters create evospikenet-cluster \
  --accelerator type=nvidia-tesla-v100,count=1 \
  --machine-type n1-standard-4 \
  --num-nodes 2 \
  --zone us-central1-a

# Install the NVIDIA GPU device plugin
kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/container-engine-accelerators/master/nvidia-driver-installer/cos/daemonset-preloaded.yaml

# Create a secret
kubectl create secret generic evospikenet-secrets \
  --from-literal=jupyter-token=your-secure-token \
  --from-literal=api-key=your-api-key

# Deploy EvoSpikeNet
kubectl run evospikenet-ngc \
  --image=nvcr.io/<your-org>/evospikenet:latest \
  --port=8888 \
  --limits='nvidia.com/gpu=1' \
  --env="JUPYTER_TOKEN=your-secure-token" \
  --env="EVOSPIKENET_API_KEY=your-api-key"

# Expose the service
kubectl expose pod evospikenet-ngc \
  --type=LoadBalancer \
  --port=8888 \
  --target-port=8888

# Check the external IP
kubectl get service evospikenet-ngc
```
### Advanced deployment in Kubernetes

Production-ready Kubernetes manifest (`evospikenet-production.yaml`):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: evospikenet
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: evospikenet-config
  namespace: evospikenet
data:
  LOG_LEVEL: "INFO"
  PYTHONPATH: "/home/appuser/app"
  TZ: "Asia/Tokyo"
---
apiVersion: v1
kind: Secret
metadata:
  name: evospikenet-secrets
  namespace: evospikenet
type: Opaque
stringData:
  jupyter-token: "your-secure-token-here"
  api-key: "your-secure-api-key-here"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: evospikenet-models-pvc
  namespace: evospikenet
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
  storageClassName: standard
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: evospikenet-ngc
  namespace: evospikenet
spec:
  replicas: 1
  selector:
    matchLabels:
      app: evospikenet-ngc
  template:
    metadata:
      labels:
        app: evospikenet-ngc
    spec:
      containers:
        - name: evospikenet
          image: nvcr.io/<your-org>/evospikenet:latest
          imagePullPolicy: Always
          ports:
            - containerPort: 8888
              name: jupyter
          env:
            - name: JUPYTER_TOKEN
              valueFrom:
                secretKeyRef:
                  name: evospikenet-secrets
                  key: jupyter-token
            - name: EVOSPIKENET_API_KEY
              valueFrom:
                secretKeyRef:
                  name: evospikenet-secrets
                  key: api-key
          envFrom:
            - configMapRef:
                name: evospikenet-config
          resources:
            limits:
              nvidia.com/gpu: 1
              memory: "32Gi"
              cpu: "8"
            requests:
              nvidia.com/gpu: 1
              memory: "16Gi"
              cpu: "4"
          volumeMounts:
            - name: models-storage
              mountPath: /home/appuser/app/saved_models
            - name: logs-storage
              mountPath: /home/appuser/app/logs
          livenessProbe:
            httpGet:
              path: /login
              port: 8888
            initialDelaySeconds: 60
            periodSeconds: 30
            timeoutSeconds: 10
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /login
              port: 8888
            initialDelaySeconds: 30
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3
      volumes:
        - name: models-storage
          persistentVolumeClaim:
            claimName: evospikenet-models-pvc
        - name: logs-storage
          emptyDir: {}
      nodeSelector:
        cloud.google.com/gke-accelerator: nvidia-tesla-v100  # select a GPU-enabled node
---
apiVersion: v1
kind: Service
metadata:
  name: evospikenet-ngc-service
  namespace: evospikenet
spec:
  type: LoadBalancer
  ports:
    - port: 8888
      targetPort: 8888
      protocol: TCP
      name: jupyter
  selector:
    app: evospikenet-ngc
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: evospikenet-ngc-hpa
  namespace: evospikenet
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: evospikenet-ngc
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
```
Deployment steps:

```bash
# Apply the manifest
kubectl apply -f evospikenet-production.yaml

# Verify the deployment
kubectl get all -n evospikenet

# Check the logs
kubectl logs -n evospikenet -l app=evospikenet-ngc -f

# Get the external IP
kubectl get svc -n evospikenet evospikenet-ngc-service
```
## Security settings for production environments

### Secure management of environment variables

```bash
# Using Kubernetes Secrets
kubectl create secret generic evospikenet-secrets \
  --from-literal=jupyter-token=$(openssl rand -base64 32) \
  --from-literal=api-key=$(openssl rand -base64 32) \
  --namespace evospikenet

# Using AWS Secrets Manager (ECS/EKS)
aws secretsmanager create-secret \
  --name evospikenet/jupyter-token \
  --secret-string $(openssl rand -base64 32)

aws secretsmanager create-secret \
  --name evospikenet/api-key \
  --secret-string $(openssl rand -base64 32)
```
### Enabling HTTPS

Example using the Nginx Ingress Controller:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: evospikenet-ingress
  namespace: evospikenet
  annotations:
    kubernetes.io/ingress.class: nginx
    cert-manager.io/cluster-issuer: letsencrypt-prod
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  tls:
    - hosts:
        - evospikenet.yourdomain.com
      secretName: evospikenet-tls
  rules:
    - host: evospikenet.yourdomain.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: evospikenet-ngc-service
                port:
                  number: 8888
```
### Network policy

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: evospikenet-network-policy
  namespace: evospikenet
spec:
  podSelector:
    matchLabels:
      app: evospikenet-ngc
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              name: evospikenet
      ports:
        - protocol: TCP
          port: 8888
  egress:
    - to:
        - namespaceSelector: {}
      ports:
        - protocol: TCP
          port: 443
        - protocol: TCP
          port: 80
```
## Monitoring and logging

### Configuring Prometheus metrics

```yaml
apiVersion: v1
kind: Service
metadata:
  name: evospikenet-metrics
  namespace: evospikenet
  labels:
    app: evospikenet-ngc
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "9090"
    prometheus.io/path: "/metrics"
spec:
  selector:
    app: evospikenet-ngc
  ports:
    - name: metrics
      port: 9090
      targetPort: 9090
```
### Log aggregation (to Elasticsearch via Fluentd)

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-config
  namespace: evospikenet
data:
  fluent.conf: |
    <source>
      @type tail
      path /home/appuser/app/logs/*.log
      pos_file /var/log/fluentd-evospikenet.pos
      tag evospikenet.*
      <parse>
        @type json
      </parse>
    </source>
    <match evospikenet.**>
      @type elasticsearch
      host elasticsearch.logging.svc.cluster.local
      port 9200
      logstash_format true
      logstash_prefix evospikenet
    </match>
```
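The `@type json` parser in this Fluentd config expects one JSON object per log line. A minimal sketch of emitting compatible lines from Python's stdlib `logging` (in practice a library such as python-json-logger is the more common choice):

```python
import io
import json
import logging

class JsonFormatter(logging.Formatter):
    """Format each record as a single JSON object on one line."""
    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "name": record.name,
            "message": record.getMessage(),
        })

# Write to an in-memory buffer here; a real app would use a FileHandler
# pointed at /home/appuser/app/logs/ so Fluentd can tail it.
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(JsonFormatter())
log = logging.getLogger("evospikenet")
log.addHandler(handler)
log.setLevel(logging.INFO)

log.info("training step complete")

line = buffer.getvalue().strip()
print(json.loads(line)["message"])  # -> training step complete
```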
## Cost optimization

### Automatic stop/start by scheduling

Example using AWS Lambda + EventBridge:

```python
import boto3

def lambda_handler(event, context):
    ecs = boto3.client('ecs')
    action = event.get('action', 'stop')
    cluster = 'evospikenet-cluster'
    service = 'evospikenet-ngc-service'

    if action == 'start':
        # Scale the service up to one task
        ecs.update_service(cluster=cluster, service=service, desiredCount=1)
    elif action == 'stop':
        # Scale the service down to zero tasks
        ecs.update_service(cluster=cluster, service=service, desiredCount=0)

    return {'statusCode': 200, 'body': f'Service action: {action}'}
```
### Using Spot Instances / Preemptible VMs

Using Spot VMs with GKE:

```bash
gcloud container node-pools create evospikenet-spot-pool \
  --cluster=evospikenet-cluster \
  --spot \
  --accelerator type=nvidia-tesla-v100,count=1 \
  --machine-type=n1-standard-4 \
  --num-nodes=2 \
  --zone=us-central1-a
```
## Performance optimization

### GPU usage

NGC containers are optimized for NVIDIA GPUs. Monitor GPU availability:

```python
import torch

print(f"CUDA available: {torch.cuda.is_available()}")
print(f"GPU count: {torch.cuda.device_count()}")
print(f"Current GPU: {torch.cuda.get_device_name()}")
```
### Memory management

Adjust PyTorch memory settings for large models:

```python
import torch

torch.cuda.set_per_process_memory_fraction(0.8)  # use up to 80% of GPU memory
```
### Multi-GPU settings

When using multiple GPUs in a cloud environment:

```python
import torch
import torch.nn as nn

# "model" is your existing nn.Module instance
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)
    print(f"Using {torch.cuda.device_count()} GPUs")
model.to('cuda')
```
## Security notes
- Change the default Jupyter token in production environments
- Use strong API keys for EvoSpikeNet services
- Consider network isolation for production deployments
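For the first two points, tokens and keys can be generated with Python's stdlib `secrets` module rather than hand-picked:

```python
import secrets

# Generate a strong value for JUPYTER_TOKEN / EVOSPIKENET_API_KEY and pass
# it via your secret store (Kubernetes Secrets, AWS Secrets Manager), never
# hard-coded in manifests.
token = secrets.token_urlsafe(32)  # 32 random bytes -> 43 URL-safe characters
print(len(token))  # -> 43
```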
## Support

For NGC-specific issues, see:

- NVIDIA NGC Documentation
- PyTorch NGC Container

For EvoSpikeNet issues, check the documentation in the main repository.