Brain Language Vocab README

Brain Language Vocabulary — Setup & Operating Instructions

Purpose: Predefine vocabulary entries for large brain_language token IDs (e.g. 10000, 30000) per environment and load them at startup so that decoding stays stable.
Placement: Place the sample file in config/brain_language_vocab.{environment}.json.
Examples: config/brain_language_vocab.development.json, config/brain_language_vocab.staging.json, config/brain_language_vocab.production.json
Reference from settings (settings.yaml):
config/settings.yaml and config/settings.{env}.yaml are merged, and the following key is read from the result:

brain_language:
  vocab_file: brain_language_vocab.development.json
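
As an illustration of the merge step, here is a minimal sketch that combines two already-parsed settings dictionaries and reads brain_language.vocab_file from the result. The function name merge_settings and the rule that the environment file overrides the base file are assumptions made for this example; the project's actual merge logic may differ.

```python
from copy import deepcopy


def merge_settings(base: dict, override: dict) -> dict:
    """Return a new dict in which values from `override` win over `base`.

    Nested dicts are merged recursively; any other value is replaced.
    (Hypothetical helper, not part of the project.)
    """
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_settings(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Stand-ins for the parsed contents of settings.yaml and settings.development.yaml.
base_settings = {"brain_language": {"vocab_file": "brain_language_vocab.json"}}
env_settings = {"brain_language": {"vocab_file": "brain_language_vocab.development.json"}}

settings = merge_settings(base_settings, env_settings)
vocab_file = settings.get("brain_language", {}).get("vocab_file")
print(vocab_file)
```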

vocab_file can be a relative path (under config/) or an absolute path.
If not specified, config/brain_language_vocab.json will be loaded by default.
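
The path rules above can be sketched roughly as follows. The helper resolve_vocab_path and the constants are hypothetical names chosen for this example, and the config directory is assumed to be relative to the working directory.

```python
from pathlib import Path
from typing import Optional

CONFIG_DIR = Path("config")  # assumed location of the config directory
DEFAULT_VOCAB_FILE = "brain_language_vocab.json"


def resolve_vocab_path(vocab_file: Optional[str], config_dir: Path = CONFIG_DIR) -> Path:
    """Map the brain_language.vocab_file setting to a filesystem path."""
    if not vocab_file:
        # Key missing or empty: fall back to the default file under config/.
        return config_dir / DEFAULT_VOCAB_FILE
    path = Path(vocab_file)
    # Absolute paths are used as-is; relative paths are resolved under config/.
    return path if path.is_absolute() else config_dir / path
```

For example, resolve_vocab_path("brain_language_vocab.staging.json") yields a path under config/, while resolve_vocab_path(None) yields the default file.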
Startup behavior:
The module evospikenet.eeg_integration.brain_language_decoder reads the above settings at import time, loads the specified JSON file, and registers the vocabulary with the shared decoder instance _decoder_instance.
JSON keys can be strings or numbers, but internally they are converted to integer token IDs.
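
A minimal sketch of the key conversion, assuming the file is a flat JSON object that maps token IDs to words. The value format and the sample entries are assumptions for illustration only. Note that Python's json module always returns object keys as strings, so "10000" becomes 10000 here.

```python
import json


def normalize_vocab(raw: dict) -> dict:
    """Convert keys to integer token IDs (hypothetical helper).

    Accepts string keys such as "10000" as well as keys that are already ints.
    """
    return {int(token_id): word for token_id, word in raw.items()}


# Sample content; the words are placeholders, not real vocabulary entries.
sample = json.loads('{"10000": "hello", "30000": "world"}')
vocab = normalize_vocab(sample)
print(sorted(vocab))
```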
Operating procedure (example):
1. Create the environment JSON file: config/brain_language_vocab.production.json
2. Set brain_language.vocab_file in config/settings.production.yaml.
3. Start the app (or run the tests); the vocabulary is loaded automatically.
Test:
Unit test: Run pytest tests/unit/eeg_integration (recommended in a container).
Note:
Loading at startup is best-effort: a failure does not raise an exception, so check the log to confirm that the intended file was actually loaded.
To add entries to the decoder vocabulary at runtime, use BrainLanguageDecoder.update_vocab().
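
To show what best-effort loading means in practice, here is a sketch of a loader that logs the outcome and returns None on failure instead of raising. The function name, logger name, and log messages are assumptions for this example; they do not reproduce the module's actual code or log output.

```python
import json
import logging
from pathlib import Path
from typing import Optional

logger = logging.getLogger("brain_language_vocab_example")  # hypothetical logger name


def load_vocab_best_effort(path: Path) -> Optional[dict]:
    """Return the vocabulary keyed by integer token ID, or None if loading fails."""
    try:
        with path.open(encoding="utf-8") as fh:
            raw = json.load(fh)
        vocab = {int(token_id): word for token_id, word in raw.items()}
    except (OSError, ValueError, TypeError, AttributeError) as exc:
        # Missing file, invalid JSON, non-integer key, or unexpected structure:
        # log a warning and carry on without a vocabulary.
        logger.warning("Could not load vocabulary from %s: %s", path, exc)
        return None
    logger.info("Loaded %d vocabulary entries from %s", len(vocab), path)
    return vocab
```

Because a failure only shows up as a warning, a caller that depends on the vocabulary should check for None or inspect the log rather than assume the file was read.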

This README is located at docs/BRain_Language_Vocab_README.md.