# Model artifact list and front-end training parameter mapping
> [!NOTE]
> For the latest implementation status, please refer to Functional Implementation Status (Remaining Functionality).
Implementation notes (artifacts): see `docs/implementation/ARTIFACT_MANIFESTS.md` for the `artifact_manifest.json` output by the training script and the recommended CLI flags.
Creation date: 2025-12-21
This document defines and lists the "implementable model artifact names" for the 24-node configuration defined previously. It also clarifies which artifact settings correspond to the main parameters specified in the front-end training form (LLM/encoder training).
## 1. Model artifact candidates by node
- Observation node (Sensing x4)
  - `sensing-camera-preproc-v1` (image preprocessing pipeline)
  - `sensing-audio-preproc-v1` (audio preprocessing pipeline)
  - `sensing-iot-normalizer-v1` (sensor normalization)
- Encoders x4
  - Vision
    - `vit-base16-embed-v1` (ViT-base/16 → 768d embedding)
    - `resnet50-proj-v1` (ResNet50 + projector → 512d)
  - Audio
    - `wav2vec2-base-embed-v1` (wav2vec2-base → 512d)
    - `hubert-large-embed-v1` (HuBERT-large → 1024d)
  - Text
    - `sbert-all-mpnet-v1` (SBERT / mpnet-base-v2 → 768d)
  - Spiking / Event
    - `snn-dvs-embed-v1` (embedding for SNN/DVS)
- Inference node (Inference x6)
  - LM (short text / dialogue)
    - `gpt-small-v1` (GPT-type small, ~300M)
    - `gpt-medium-v1` (GPT-type medium, ~1.5B)
    - `gpt-large-v1` (GPT-type large, ~6B) *depends on demand
  - Classifier / Detector
    - `yolox-s-intel-v1` (YOLOX small / detector)
    - `fasterrcnn-res50-v1` (Faster R-CNN ResNet50)
  - Spiking-LM
    - `spiking-lm-core-v1` (spiking generation model)
  - Ensemble / Multimodal
    - `multimodal-ensemble-v1` (multimodal integration layer)
  - RAG support
    - `rag-lite-v1` (retriever + generation wrapper)
- Decision node (Decision x2)
  - Planner
    - `planner-rl-ppo-v1` (PPO-based planner)
  - Controller
    - `motor-controller-dnn-v1` (controller model)
- Memory node (Memory x3)
  - Vector DB (separate from production artifacts: config / index templates)
    - `milvus-schema-v1` (vector DB schema definition)
  - Episode storage
    - `minio-log-schema-v1`
- Training node (Trainer x1)
  - `trainer-ddp-manager-v1` (distributed training job management)
- Aggregation / arbitration node (Aggregator x2)
  - `federator-agg-v1` (secure aggregation protocol)
  - `result-aggregator-v1` (output aggregation, reliability evaluation)
- Management / utilities (Management x2)
  - `auth-service-v1` (API key / RBAC service)
  - `monitoring-stack-v1` (Prometheus / Grafana / ELK settings)
## 2. Artifact naming convention (recommended meta)
- Format example: `<component>-<base-model>-<purpose>-v<major>`
- Example: `vision-vit-base16-embed-v1` → component=vision, base-model=vit-base16, purpose=embed, version=v1
- Metadata to record (artifact manifest): `artifact_name`, `model_version`, `base_model`, `task`, `embedding_dim`, `quantized` (bool), `precision` (fp32/fp16/int8), `training_config_hash`, `train_data_tags`, `license`, `created_at`, `node_type`, `privacy_level`
Notes (implementation specifications):
- In the generation/training scripts, create `artifact_manifest.json` in the run save directory and include it in the upload ZIP (a hedged sketch of this step follows this list).
- Flag names used in the CLI/front end (existing implementation): `--artifact-name`, `--precision`, `--quantize` (store_true), `--privacy-level`, `--node-type`. These are reflected in the manifest.
- If `artifact_name` is not specified, it is generated automatically following the recommended prefix format (`{node_type}.{model_category}.{model_variant}.{run_name}.{timestamp}`).
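As a reference, here is a minimal sketch of the manifest step described above, assuming a Python training script. The helper names (`default_artifact_name`, `write_artifact_manifest`), the exact timestamp format, and the example field values are illustrative assumptions, not the existing implementation.

```python
# Minimal sketch only (not the existing implementation): build the fallback
# artifact name and write artifact_manifest.json into the run save directory.
import json
import time
from pathlib import Path


def default_artifact_name(node_type: str, model_category: str,
                          model_variant: str, run_name: str) -> str:
    """Fallback name used when --artifact-name is not given:
    {node_type}.{model_category}.{model_variant}.{run_name}.{timestamp}"""
    timestamp = time.strftime("%Y%m%d-%H%M%S")  # assumed timestamp format
    return f"{node_type}.{model_category}.{model_variant}.{run_name}.{timestamp}"


def write_artifact_manifest(run_dir: str, manifest: dict) -> Path:
    """Write the manifest next to the run outputs so it lands in the upload ZIP."""
    path = Path(run_dir) / "artifact_manifest.json"
    path.write_text(json.dumps(manifest, indent=2, ensure_ascii=False))
    return path


# Example: the CLI flags (--artifact-name, --precision, --quantize,
# --privacy-level, --node-type) end up as the section 2 manifest fields.
manifest = {
    "artifact_name": default_artifact_name("encoder", "vision", "vit-base16", "run001"),
    "base_model": "vit-base16",
    "task": "embed",
    "embedding_dim": 768,
    "precision": "fp16",
    "quantized": False,
    "node_type": "encoder",
    "privacy_level": "none",
    "training_config_hash": "<computed as in section 4>",
}
write_artifact_manifest(".", manifest)
```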
## 3. Frontend training form parameters → artifact generation mapping
This section lists the key parameters the user enters when triggering training from the frontend, and the fields/settings of the generated artifact that each one maps to.
- Input parameters (example):
  - `component` (selection): `artifact_name` prefix of the corresponding artifact (e.g. `vision`, `audio`, `text`, `spiking`)
  - `base_model` (selection/text): pretrained base (e.g. `vit-base16`, `wav2vec2-base`, `gpt-small-v1`) → `base_model` meta
  - `task` (selection): `embed` / `classification` / `lm-finetune` / `detection` → `task` meta
  - `embedding_dim` (number): embedding dimension → `embedding_dim`
  - `hidden_size`, `num_layers`, `num_heads` (number): architectural changes → stored in `model_config`
  - `max_seq_length` / `sample_rate` / `input_size`: model input/output specifications → `input_spec`
  - `batch_size`, `learning_rate`, `optimizer`, `epochs`, `weight_decay`: training settings → `training_config` (also used to generate `training_config_hash`)
  - `precision` (selection): `fp32` / `fp16` / `int8` → `precision`, `quantized` flag
  - `quantize` (bool): if true, run quantization post-processing within the job → set `quantized=true` and append a suffix to the artifact name (e.g. `-int8`)
  - `checkpoint_interval` (number): checkpoint save frequency → `checkpoint_policy`
  - `augmentations` / `preprocessing_profile`: data preprocessing → `data_prep_profile`
  - `train_data_tags` (tag list): which datasets were used → `train_data_tags` meta
  - `privacy_level` (selection): `none` / `dp` / `secure-agg` → apply differential privacy or secure aggregation to the training job
- Mapping examples (frontend input → generated artifact manifest), with a sketch after this list:
  - `component=vision, base_model=vit-base16, task=embed, embedding_dim=768, precision=fp16, quantize=false, batch_size=256, epochs=10` →
    - artifact_name: `vision-vit-base16-embed-v1`
    - manifest: `{"base_model": "vit-base16", "task": "embed", "embedding_dim": 768, "precision": "fp16", "training_config_hash": "<...>"}`
  - `component=inference, base_model=gpt-small-v1, task=lm-finetune, max_seq_length=1024, learning_rate=2e-5, epochs=3, quantize=int8` →
    - artifact_name: `gpt-small-v1-lm-finetune-int8-v1`
    - manifest includes `quantized: true`, `precision: int8`, `input_spec: {max_seq_length: 1024}`
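For illustration only, a minimal Python sketch of this mapping that reproduces the first example above. The function name and the exact naming rule are assumptions; note that the second example in this document drops the `component` prefix (the base model name already carries a version suffix), so the real frontend logic likely has additional cases not shown here.

```python
# Hypothetical sketch of the form-parameter -> artifact mapping (first example above).
def build_artifact(params: dict) -> dict:
    """Map frontend form parameters to an artifact name and manifest fields."""
    quantized = bool(params.get("quantize"))
    suffix = f"-{params['precision']}" if quantized else ""
    # Recommended format: <component>-<base-model>-<purpose>-v<major>
    artifact_name = f"{params['component']}-{params['base_model']}-{params['task']}{suffix}-v1"
    return {
        "artifact_name": artifact_name,
        "base_model": params["base_model"],
        "task": params["task"],
        "embedding_dim": params.get("embedding_dim"),
        "precision": params["precision"],
        "quantized": quantized,
        "training_config": {k: params[k] for k in ("batch_size", "epochs") if k in params},
    }


example = build_artifact({
    "component": "vision", "base_model": "vit-base16", "task": "embed",
    "embedding_dim": 768, "precision": "fp16", "quantize": False,
    "batch_size": 256, "epochs": 10,
})
assert example["artifact_name"] == "vision-vit-base16-embed-v1"
```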
## 4. Notes on front-end implementation (short)
- When starting a training job, always compute `training_config_hash` (JSON normalization → SHA256) and link it to the artifact; this enables reproducibility and comparison (a sketch follows this list).
- Quantization can be offered in the UI either as a post-training-quantize step run inside the job, or as quantization-aware training (QAT) selected at training time.
- Privacy settings (differential privacy and secure aggregation) are propagated to the Trainer/Aggregator by putting `privacy_level` in the training job definition.
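A minimal sketch of the hash step, assuming a canonical JSON serialization with sorted keys and compact separators; the exact normalization used by the existing scripts may differ.

```python
# Sketch: training_config_hash = SHA256 over a normalized JSON serialization
# of the training settings (sorted keys, compact separators assumed).
import hashlib
import json


def training_config_hash(training_config: dict) -> str:
    normalized = json.dumps(training_config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


print(training_config_hash({"batch_size": 256, "learning_rate": 3e-4,
                            "optimizer": "adamw", "epochs": 10}))
```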
## 5. Next action suggestions
- Decide on a priority list from the artifacts above (first three) and automate their training pipelines with CI. Recommended first three: `vit-base16-embed-v1`, `wav2vec2-base-embed-v1`, `gpt-small-v1-lm-finetune-v1`.
- Add the parameters above to the frontend training form (`frontend/pages/settings.py` etc.) and design an API to submit the training job via `api_client` (a hedged payload sketch follows this list).
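As a starting point only, a hedged sketch of what the submission payload could look like; `api_client.submit_training_job` and the payload shape are hypothetical, not an existing API, and the fields simply follow the section 3 parameters.

```python
# Hypothetical sketch only: the api_client call name and the payload shape are
# assumptions, not an existing API; fields follow the section 3 parameters.
payload = {
    "component": "vision",
    "base_model": "vit-base16",
    "task": "embed",
    "embedding_dim": 768,
    "training_config": {"batch_size": 256, "learning_rate": 3e-4,
                        "optimizer": "adamw", "epochs": 10},
    "precision": "fp16",
    "quantize": False,
    "privacy_level": "none",
}
# job = api_client.submit_training_job(payload)  # hypothetical wrapper around the job API
```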
File save location: `docs/DISTRIBUTED_BRAIN_MODEL_ARTIFACTS.md`