EvoSpikeNet RAG System - Japanese Search Optimization Specification (Detailed)
Version: 3.1
Date: 2026-05-23
Status: Implemented with Operational Hardening
Document ID: RAG-JP-SPEC-V3.1-001
v3.1 Changes: Synchronized the API contract with the current implementation. Reflected RRF accumulation across query expansions, final-score reranking with memory_boost, and Unit / Integration / System / E2E contract tests.
Table of Contents
- System Overview
- Requirements Definition
- Technical Specification Details
- 3.5 EvoSpikeNet Core Memory System Integration
- API Specification
- 4.3 Memory-Integrated API Flow
- Search Algorithm
- 5.3 Memory-Enhanced Ranking
- Evaluation & Testing
- Deployment
- Troubleshooting
- Appendix
1. System Overview
1.1 Purpose
Improve Japanese search accuracy of the EvoSpikeNet-Core RAG system by addressing:
- Text Variations: Mutual search for "盆休み" and "お盆休み"
- Named Entities: Accurate search for project ID "EV-2024-001"
- Synonyms: Unified search for "会議", "ミーティング", "カンファレンス"
1.2 Scope
| Target | Included | Excluded |
|---|---|---|
| Language | Japanese (primary), English | Chinese, Korean |
| Search | Hybrid (BM25 + Vector) | Full-text only |
| Features | RAG Pipeline | Generative Model |
| Scale | 1M+ documents | Real-time Streaming |
1.3 Architecture Overview (v3.0: Memory-Augmented)
flowchart TD
Q["User Query<br/>(Japanese)"] --> DETECT["Language Detection"]
DETECT --> NER["NER/Entity Extraction"]
NER --> SM_Q["SemanticMemory<br/>.retrieve_semantic_knowledge()"]
SM_Q --> EP_ENR["EpisodicMemory<br/>.integrate_episodic_semantic()"]
EP_ENR --> NORM["Normalize & Expand"]
NORM --> ES["Elasticsearch<br/>BM25 + Entity Boost"]
NORM --> MV["Milvus<br/>Vector Search"]
ES --> RRF["RRF Fusion<br/>+ Boost"]
MV --> RRF
RRF --> EP_RET["EpisodicMemory<br/>.retrieve_memories()"]
EP_RET --> STORE["Reranking with<br/>Memory Scores"]
STORE --> TOP["Top-K Results"]
TOP --> RESP["Generate Response"]
RESP --> RATE["User Feedback<br/>(rating)"]
RATE --> LTM["LongTermMemoryModule<br/>.store_spike_episode()"]
style SM_Q fill:#e1f5ff
style EP_ENR fill:#e1f5ff
style EP_RET fill:#e1f5ff
style LTM fill:#f3e5f5
style STORE fill:#fff3e0
1.4 2026-05-23 Implementation Status and Operational Notes
The core scope of this specification is already implemented, primarily in EvoSpikeNet-Core/evospikenet/api_modules/rag_v2_api.py and EvoSpikeNet-Core/evospikenet/rag_memory_integrator.py. The points below summarize the current runtime behavior.
- Implemented APIs:
  - POST /api/v2/rag/search
  - POST /api/v2/rag/feedback
  - GET /api/v2/rag/preprocessing/health
- Preprocessing responsibilities are explicitly separated into:
  - SudachiTokenizer
  - EntityRecognizer
  - QueryExpander
- Production-like policy:
  - In production/staging, unresolved dependencies are treated as fail-closed and the system does not fall back to placeholder embeddings or synthetic responses.
- Embedding policy:
  - A real SentenceTransformer encoder is required and the embedding dimension is unified at 384.
- QueryExpander:
  - Supports rule|llm|hybrid backends.
  - LLM output is accepted only as JSON in the form {"expansions": [...]}.
- Quality guard:
  - Evaluates diversity and redundancy and automatically falls back to the rule backend when expansion quality is low.
- Observability:
  - debug_info.preprocessing exposes quality metrics, guard statistics, history, and query_hash summaries.
  - Startup warmup and the health endpoint are used to verify preprocessing readiness.
- Main operational settings:
  - RAG_V2_NER_*
  - RAG_V2_PREPROCESSING_*
  - RAG_V2_SUDACHI_*
  - RAG_V2_QUERY_EXPANDER_*
- Future expansion items:
  - Context-aware RRF
  - NER fine-tuning
  - Automated feedback optimization
  - Multilingual rollout
  - See RAG_REMAINING_FUNCTIONALITY.en.md for roadmap details.
- 2026-05-23 verification update:
  - Search results from query expansions are merged by adding RRF scores per doc_id.
  - memory_boost is added to the final score, and result ranks are assigned after final-score sorting.
  - POST /api/v2/rag/feedback requires session_id and returns memory_id plus importance after storage.
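The JSON-only contract for LLM expansions and the quality-guard fallback to the rule backend can be sketched as follows. This is a minimal illustration, not the code in rag_v2_api.py: the function names, the `min_quality` threshold, and the uniqueness-based quality score are assumptions standing in for the real diversity and redundancy metrics.

```python
import json
from typing import Callable, List, Optional


def parse_llm_expansions(raw: str) -> Optional[List[str]]:
    """Accept only a JSON object carrying an "expansions" list of strings."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict) or "expansions" not in payload:
        return None
    items = payload["expansions"]
    if not isinstance(items, list) or not all(isinstance(s, str) for s in items):
        return None
    return [s.strip() for s in items if s.strip()]


def expansion_quality(query: str, expansions: List[str]) -> float:
    """Hypothetical score: share of expansions that are unique and differ from the query."""
    if not expansions:
        return 0.0
    seen = {query.strip().lower()}
    unique = 0
    for text in expansions:
        key = text.strip().lower()
        if key not in seen:
            seen.add(key)
            unique += 1
    return unique / len(expansions)


def expand_with_guard(
    query: str,
    llm_raw: str,
    rule_backend: Callable[[str], List[str]],
    min_quality: float = 0.5,  # assumed threshold
) -> List[str]:
    """Use the LLM expansions only if they parse and pass the quality guard."""
    parsed = parse_llm_expansions(llm_raw)
    if parsed is None or expansion_quality(query, parsed) < min_quality:
        return rule_backend(query)  # fall back to the rule backend
    return parsed
```

Rejecting anything that is not the expected JSON shape keeps malformed LLM output from reaching the retrieval stage, which matches the fail-closed policy described above.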
2. Requirements Definition
2.1 Functional Requirements
FR-1: Hybrid Search
Requirement: Integrate BM25 keyword search and vector search
Query latency: < 500ms @ top-10
Accuracy (NDCG@10): > 0.75 in production
Throughput: 100 queries/sec
Implementation: RRF (k=60) score integration
FR-2: Japanese-Specific Processing
Requirement: Handle Japanese text variations and synonyms
Variation Coverage: > 80%
Entity Recognition Recall: > 80%
Variation normalization overhead: < 5ms
Implementation:
- Sudachi tokenization
- NER entity extraction
- Query Expansion for synonyms
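As an illustration of the synonym-expansion step in FR-2, the sketch below swaps terms using hand-written synonym groups taken from the examples in section 1.1. It is a simplified stand-in: it matches plain substrings instead of Sudachi tokens, and `SYNONYM_GROUPS` and `expand_query` are hypothetical names rather than part of QueryExpander.

```python
from typing import List

# Hypothetical synonym groups; the terms come from the examples in section 1.1.
SYNONYM_GROUPS: List[List[str]] = [
    ["会議", "ミーティング", "カンファレンス"],
    ["盆休み", "お盆休み"],
]


def expand_query(query: str, groups: List[List[str]] = SYNONYM_GROUPS) -> List[str]:
    """Return the query followed by variants with one synonym swapped in."""
    variants = [query]
    for group in groups:
        # Try longer terms first so "お盆休み" is matched before "盆休み".
        for term in sorted(group, key=len, reverse=True):
            if term in query:
                for alt in group:
                    candidate = query.replace(term, alt)
                    if candidate not in variants:
                        variants.append(candidate)
                break  # one matching term per group is enough
    return variants
```

Each variant can then be searched separately and the per-expansion rankings merged by RRF.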
FR-3: Evaluation Framework
Requirement: Regular evaluation of search accuracy
Test cases: >= 50
Evaluation metrics: MRR, NDCG, Recall, Precision
Automated evaluation script: Implemented
2.2 Non-Functional Requirements
NFR-1: Performance
| Metric | Requirement | Measurement |
|---|---|---|
| P95 Latency | < 300ms | query latency monitor |
| P99 Latency | < 800ms | query latency monitor |
| Throughput | >= 100 q/s | load testing |
| Index Rebuild | < 4h | full rebuild time |
NFR-2: Availability
Availability: >= 99.5%
MTTR (Mean Time To Recovery): < 5 min
Backup frequency: Daily
NFR-3: Scalability
Document Scaling: Up to 10M docs
Language Support: Extensible to 3+ languages
Customization: External rule/dictionary management
NFR-4: Security
Access Control: Role-based (RBAC)
Data Encryption: in-transit (TLS) and at-rest
Audit Logging: All query access
3. Technical Specification Details
3.1 - 3.4 [Previous sections: Language Detection, Tokenization, NER, Query Expansion remain unchanged]
3.5 EvoSpikeNet Core Memory System Integration
v3.0 integrates three memory modules from EvoSpikeNet-Core SDK to enhance RAG context enrichment and feedback loops.
3.5.1 Core Components Overview
| Component | Module | Purpose |
|---|---|---|
| SemanticMemory | episodic_memory.py | Entity concept knowledge graph + semantic enrichment |
| EpisodicMemory | episodic_memory.py | Search session storage & retrieval |
| LongTermMemoryModule | long_term_memory.py | SNN gating + importance scoring + forgetting |
| RAGSemanticLoader | New | Load domain vocabulary into SemanticMemory |
| RAGEpisodicRecorder | New | Record search sessions & user feedback |
| RAGLongTermIntegrator | New | SNN-gated storage & compression |
| RAGMemoryIntegrator | New | Façade orchestrating all three |
3.5.2 Integration Sequence
sequenceDiagram
actor User
participant RAG_API
participant RAGMemoryIntegrator
participant SemanticMemory
participant EpisodicMemory
participant LongTermMemoryModule
User->>RAG_API: Search query + context
RAG_API->>RAGMemoryIntegrator: enrich_query_context()
RAGMemoryIntegrator->>SemanticMemory: retrieve_semantic_knowledge(entities)
SemanticMemory-->>RAGMemoryIntegrator: [related concepts + embeddings]
RAGMemoryIntegrator->>EpisodicMemory: integrate_episodic_semantic()
EpisodicMemory-->>RAGMemoryIntegrator: enriched_context
RAGMemoryIntegrator->>EpisodicMemory: retrieve_memories(query, top_k=3)
EpisodicMemory-->>RAGMemoryIntegrator: [past sessions + scores]
RAGMemoryIntegrator-->>RAG_API: enhanced_query + memory_boost
RAG_API-->>User: Top-K results + memory_context
User->>RAG_API: Provide rating (1-5)
RAG_API->>RAGMemoryIntegrator: record_feedback(session_id, rating)
RAGMemoryIntegrator->>EpisodicMemory: store_experience(reward=rating/5)
EpisodicMemory->>LongTermMemoryModule: store_spike_episode(importance)
LongTermMemoryModule-->>EpisodicMemory: memory_id
3.5.3 SemanticMemory Specification
Purpose: Maintain concept knowledge graph for entity enrichment.
Field Mapping:
| SemanticMemoryEntry Field | RAG Usage |
|---|---|
| concept | Entity (PROJECT_ID, PRODUCT, PERSON, ORG, DATE) |
| definition | Domain description from knowledge base |
| related_concepts | Synonyms / related entities |
| concept_embedding | 384-dim MiniLM vector |
| confidence | 0.7 - 1.0 (knowledge certainty) |
Integration: RAGSemanticLoader class
from typing import List, Tuple

from evospikenet.episodic_memory import SemanticMemoryEntry


class RAGSemanticLoader:
    """
    Load domain vocabulary and entity knowledge into SemanticMemory.
    """

    def load_domain_vocabulary(
        self,
        vocabulary_file: str,
        embedding_model,
    ) -> int:
        """
        Load project entities (EV-2024-001, etc.) and domain terms.
        Returns: count of loaded concepts.
        """
        pass

    def enrich_entity_boost(
        self,
        entity_id: str,
        retrieved_concepts: List[Tuple[SemanticMemoryEntry, float]],
    ) -> float:
        """
        Compute dynamic entity boost from semantic relatedness.
        Returns: boost factor (1.0 - 10.0).
        """
        pass
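One way the dynamic entity boost could be computed is sketched below, using the enrichment arithmetic from the worked example in section 5.1 (5.0 × (1 + 0.82 × 0.5) = 7.05) and the 1.0 to 10.0 range stated in the docstring. The standalone signature, the choice of the highest relatedness score, and the `relatedness_weight` default are assumptions; the real method receives SemanticMemoryEntry objects.

```python
from typing import List, Tuple


def enrich_entity_boost(
    base_boost: float,
    related: List[Tuple[str, float]],
    relatedness_weight: float = 0.5,  # taken from the section 5.1 example
) -> float:
    """Scale the base boost by the strongest semantic relatedness score."""
    # Assumption: only the top related concept contributes to the boost.
    top_similarity = max((score for _, score in related), default=0.0)
    boost = base_boost * (1.0 + top_similarity * relatedness_weight)
    # Clamp to the documented boost range of 1.0 to 10.0.
    return max(1.0, min(boost, 10.0))


boost = enrich_entity_boost(5.0, [("Q-PFC Loop", 0.82), ("AEG Phase4", 0.71)])
print(round(boost, 2))
```

With no related concepts the base boost is returned unchanged, so enrichment can only raise an entity's weight.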
3.5.4 EpisodicMemory Specification
Purpose: Store search sessions and retrieve similar past sessions for reranking.
Field Mapping:
| EpisodicMemoryEntry Field | RAG Session Data |
|---|---|
| context | {query, user_id, timestamp, extracted_entities} |
| action | Search pipeline (ES + Milvus) |
| outcome | Retrieved doc_ids (top-10) |
| reward | User rating / 5 |
| context_embedding | Query 384-dim MiniLM vector |
Integration: RAGEpisodicRecorder class
from typing import Any, Dict, List

import numpy as np


class RAGEpisodicRecorder:
    """
    Record search sessions and retrieve similar past sessions.
    """

    def record_session(
        self,
        query: str,
        entity_extracted: Dict,
        doc_ids_retrieved: List[str],
        ranking_scores: List[float],
        query_embedding: np.ndarray,  # 384-dim
    ) -> str:
        """
        Record search session to episodic memory.
        Returns: session_id for later feedback linkage.
        """
        pass

    def retrieve_similar_sessions(
        self,
        query: str,
        query_embedding: np.ndarray,
        top_k: int = 3,
    ) -> List[Dict[str, Any]]:
        """
        Retrieve past sessions similar to current query.
        Returns: [{doc_ids, reward, score}, ...] for reranking.
        """
        pass
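Session retrieval by embedding similarity could work roughly as in the sketch below. It assumes cosine similarity over an in-memory list of session dicts with a hypothetical `embedding` key, and uses plain Python sequences instead of np.ndarray to stay dependency-free; the actual EpisodicMemory retrieval may differ.

```python
import math
from typing import Any, Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_similar_sessions(
    sessions: List[Dict[str, Any]],
    query_embedding: Sequence[float],
    top_k: int = 3,
) -> List[Dict[str, Any]]:
    """Return the top_k past sessions as {doc_ids, reward, score} dicts."""
    scored = [
        {
            "doc_ids": session["doc_ids"],
            "reward": session["reward"],
            "score": cosine_similarity(query_embedding, session["embedding"]),
        }
        for session in sessions
    ]
    scored.sort(key=lambda item: item["score"], reverse=True)
    return scored[:top_k]
```

The returned `reward` and `score` fields are the two factors that the memory_boost calculation in section 5.3 multiplies together.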
3.5.5 LongTermMemoryModule Specification
Purpose: Importance gating via SNN + automatic consolidation/forgetting.
Key Features:
- Spiking Neural Network (SNN) gating of episodic memories
- RecurrentRetentionCircuit (GRU-based) for temporal decay
- ForgettingController: forget threshold = 0.3, consolidation threshold = 0.7
SNN Gating Logic:
flowchart TD
EM["EpisodicMemory Entry<br/>(reward, timestamp)"] --> SPIKE{"reward > 0.6?<br/>(Spike Fire)"}
SPIKE -->|YES| WEIGHT["Weight += learning_rate<br/>× (reward - baseline)"]
SPIKE -->|NO| DECAY["Weight *= decay_rate<br/>(temporal decay)"]
WEIGHT --> STORE["Store to LongTermMemory<br/>(high importance)"]
DECAY --> FORGET{"importance < 0.3?"}
FORGET -->|YES| DROP["Forget"]
FORGET -->|NO| STORE
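The gating flow above can be expressed as a small function. This is a sketch of the branching only, not the SNN itself: `learning_rate`, `baseline`, and `decay_rate` are placeholder values that the specification does not state, and `importance` is passed in as an input because the flowchart does not say how it is derived.

```python
from dataclasses import dataclass


@dataclass
class GateResult:
    weight: float   # updated synaptic weight
    stored: bool    # True if the entry goes to long-term memory


def gate_episode(
    reward: float,
    weight: float,
    importance: float,
    learning_rate: float = 0.1,   # assumed value
    baseline: float = 0.5,        # assumed value
    decay_rate: float = 0.95,     # assumed value
    spike_threshold: float = 0.6,
    forget_threshold: float = 0.3,
) -> GateResult:
    """Apply the spike / decay / forget branches of the gating flowchart."""
    if reward > spike_threshold:
        # Spike fires: reinforce the weight and store with high importance.
        return GateResult(weight + learning_rate * (reward - baseline), True)
    # No spike: apply temporal decay, then forget low-importance entries.
    decayed = weight * decay_rate
    return GateResult(decayed, importance >= forget_threshold)
```

For example, a session rated 4 (reward 0.8) fires the spike branch, while a session rated 2 (reward 0.4) is kept only if its importance is still at least 0.3.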
Integration: RAGLongTermIntegrator class
import numpy as np


class RAGLongTermIntegrator:
    """
    SNN-gated storage and compression for long-term memory.
    """

    def store_with_snn_gating(
        self,
        episodic_entry,
        spike_sequence: np.ndarray,  # SNN firing pattern
        reward: float,               # User rating / 5
    ) -> str:
        """
        Store episodic entry with SNN importance gating.
        Returns: memory_id for retrieval.
        """
        pass


# Forgetting & Compression Policy
FORGETTING_POLICY = {
    "forget_threshold": 0.3,         # Drop if importance < 0.3
    "consolidation_threshold": 0.7,  # Compress if importance > 0.7
    "max_memories": 10000,           # Enforce storage limit
    "temporal_decay_per_day": 0.05,  # 5% importance decay per day
}
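To make the policy values concrete, the sketch below applies them to a set of stored memories. It assumes the 5% daily decay is multiplicative (importance × 0.95 per day) and that the storage limit is enforced by keeping the most important entries; neither detail is stated in the specification, and `decayed_importance` and `apply_policy` are hypothetical helpers rather than ForgettingController methods.

```python
from typing import Dict, List, Tuple

FORGETTING_POLICY = {
    "forget_threshold": 0.3,
    "consolidation_threshold": 0.7,
    "max_memories": 10000,
    "temporal_decay_per_day": 0.05,
}


def decayed_importance(
    importance: float,
    age_days: float,
    policy: Dict[str, float] = FORGETTING_POLICY,
) -> float:
    """Assumption: importance decays multiplicatively by 5% per day."""
    return importance * (1.0 - policy["temporal_decay_per_day"]) ** age_days


def apply_policy(
    memories: Dict[str, Tuple[float, float]],
    policy: Dict[str, float] = FORGETTING_POLICY,
) -> Dict[str, List[str]]:
    """memories maps memory_id -> (importance, age_days)."""
    current = {
        memory_id: decayed_importance(importance, age_days, policy)
        for memory_id, (importance, age_days) in memories.items()
    }
    kept = {m: imp for m, imp in current.items() if imp >= policy["forget_threshold"]}
    # Enforce the storage limit by keeping the most important entries.
    ranked = sorted(kept, key=kept.get, reverse=True)[: int(policy["max_memories"])]
    return {
        "retained": ranked,
        "consolidate": [m for m in ranked if kept[m] > policy["consolidation_threshold"]],
        "forgotten": sorted(set(current) - set(ranked)),
    }
```

Under these assumptions a memory with importance 0.31 drops below the forget threshold after a single day without reinforcement.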
3.5.6 Embedding Dimension Unification
All memory embeddings must be 384-dimensional, matching the MiniLM encoder (paraphrase-multilingual-MiniLM-L12-v2).
| Module | Embedding Size | Adjustment |
|---|---|---|
| Query embedding (MiniLM) | 384-dim | ✓ Native |
| SemanticMemory (default) | 512-dim | → Override to 384-dim |
| EpisodicMemory (default) | 512-dim | → Override to 384-dim |
| LongTermMemoryModule (default) | 512-dim | → Override state_dim=384, embedding_dim=384 |
| Milvus vector DB | 384-dim | ✓ Matching |
Initialization Code:
from evospikenet.episodic_memory import EpisodicMemory
from evospikenet.long_term_memory import LongTermMemoryModule
# Override default 512-dim → 384-dim
episodic_mem = EpisodicMemory(embedding_dim=384, max_memories=10000)
ltm = LongTermMemoryModule(state_dim=384, embedding_dim=384)
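Because a leftover 512-dim default would silently break similarity search, a startup check along the following lines may be useful. This is a hypothetical helper, not part of the SDK; it raises instead of padding or truncating, in keeping with the fail-closed policy.

```python
from typing import Sequence

EMBEDDING_DIM = 384  # unified dimension from the table above


def check_embedding_dim(
    name: str,
    vector: Sequence[float],
    expected: int = EMBEDDING_DIM,
) -> None:
    """Raise ValueError if a module produces an embedding of the wrong size."""
    if len(vector) != expected:
        raise ValueError(
            f"{name}: expected {expected}-dim embedding, got {len(vector)}"
        )


# Example: validate a query embedding before it is stored or searched.
check_embedding_dim("query", [0.0] * 384)
```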
3.5.7 RAGMemoryIntegrator Façade
Purpose: Single entry point orchestrating all three memory modules.
Class Diagram:
classDiagram
class RAGMemoryIntegrator {
-semantic_mem: SemanticMemory
-episodic_mem: EpisodicMemory
-ltm: LongTermMemoryModule
-embedding_model
+from_config(config_file) RAGMemoryIntegrator
+enrich_query_context(query, entities) Dict
+record_search_session(session_data) str
+record_feedback(session_id, rating) Dict
+retrieve_memory_boost(query, top_k) float
}
RAGMemoryIntegrator --> SemanticMemory
RAGMemoryIntegrator --> EpisodicMemory
RAGMemoryIntegrator --> LongTermMemoryModule
Key Methods:
from typing import Any, Dict, List, Optional


class RAGMemoryIntegrator:
    @staticmethod
    def from_config(config_file: str) -> "RAGMemoryIntegrator":
        """Initialize all three memory modules from config."""
        pass

    def enrich_query_context(
        self,
        query: str,
        extracted_entities: Dict[str, List[str]],
    ) -> Dict[str, Any]:
        """
        Enrich query with semantic knowledge and past session context.
        Returns:
            {
                "semantic_concepts": ["EV-2024-001", ...],
                "semantic_knowledge": [...],
                "past_session_boost": 0.15,
                "memory_context": {...}
            }
        """
        pass

    def record_search_session(
        self,
        query: str,
        doc_ids: List[str],
        ranking_scores: List[float],
        user_id: Optional[str] = None,
    ) -> str:
        """
        Record search session to episodic memory.
        Returns: session_id for feedback linkage.
        """
        pass

    def record_feedback(
        self,
        session_id: str,
        rating: int,  # 1-5
    ) -> Dict[str, Any]:
        """
        Record user feedback (rating) and trigger LTM storage.
        Returns: {memory_id, importance, consolidation_status}
        """
        pass
4. API Specification
4.1 Search API
Endpoint: POST /api/v2/rag/search
Request:
{
"query": "Project EV-2024-001 progress report",
"top_k": 5,
"language": "auto",
"enable_query_expansion": true,
"enable_entity_boosting": true,
"return_debug_info": false
}
Response:
{
"status": "success",
"query": "Project EV-2024-001 progress report",
"language": "en",
"results": [
{
"id": "DOC_001",
"rank": 1,
"score": 0.8234,
"source": "rrf",
"text": "Project EV-2024-001 is...",
"metadata": {
"project_id": "EV-2024-001",
"updated_at": "2026-05-20"
}
}
],
"execution_time_ms": 123,
"debug_info": null
}
Error Response:
{
"status": "error",
"code": "INVALID_QUERY",
"message": "Query is too short (min 2 characters)",
"details": {}
}
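Server-side validation that yields the INVALID_QUERY error above might look like the following sketch. The 2-character minimum comes from the error message; the function name and the choice to return None for a valid request are assumptions.

```python
from typing import Any, Dict, Optional

MIN_QUERY_CHARS = 2  # from the error message in the example above


def validate_search_request(payload: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return an error response dict, or None if the request is acceptable."""
    query = payload.get("query")
    if not isinstance(query, str) or len(query.strip()) < MIN_QUERY_CHARS:
        return {
            "status": "error",
            "code": "INVALID_QUERY",
            "message": f"Query is too short (min {MIN_QUERY_CHARS} characters)",
            "details": {},
        }
    return None


# Example: a one-character query is rejected before any search runs.
print(validate_search_request({"query": "a"}))
```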
4.2 Feedback API
Endpoint: POST /api/v2/rag/feedback
Request:
{
"session_id": "rag_20260520_143022_001",
"query": "Project EV-2024-001 progress report",
"doc_id": "DOC_001",
"rating": 5,
"comment": "Exactly what I expected",
"user_id": "user_123"
}
Response:
{
"status": "success",
"feedback_id": "fb_12345",
"memory_id": "ep_20260520_143155",
"importance": 0.72,
"recorded_at": "2026-05-20T14:31:55Z"
}
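The feedback contract (session_id required, rating 1 to 5, reward = rating / 5) can be checked with a few lines. The helper names below are hypothetical and the error handling is only one possible choice.

```python
from typing import Any, Dict


def rating_to_reward(rating: Any) -> float:
    """Map a 1-5 rating to the reward in [0.2, 1.0] used by the memory path."""
    if not isinstance(rating, int) or not 1 <= rating <= 5:
        raise ValueError("rating must be an integer from 1 to 5")
    return rating / 5


def validate_feedback(payload: Dict[str, Any]) -> float:
    """Require session_id and return the reward derived from the rating."""
    if not payload.get("session_id"):
        raise ValueError("session_id is required")
    return rating_to_reward(payload.get("rating"))


reward = validate_feedback({"session_id": "rag_20260520_143022_001", "rating": 5})
```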
4.3 Memory-Integrated API Flow
The current implementation integrates memory in-process through RAGMemoryIntegrator; the /api/memory/* endpoints remain available for external memory-management clients. The RAG v2 fast path does not call those REST endpoints.
Sequence Diagram (v3.1: End-to-end with in-process memory integration):
sequenceDiagram
participant CLI as Client
participant RAG_API as RAG API
participant INTEG as RAGMemoryIntegrator
participant EM as EpisodicMemory
participant LTM as LongTermMemoryModule
CLI->>RAG_API: POST /api/v2/rag/search {query, top_k}
RAG_API->>INTEG: enrich_query_context(query_context)
INTEG->>EM: integrate_episodic_semantic() and retrieve_memories(top_k=3)
EM-->>INTEG: semantic_concepts + past_sessions
INTEG-->>RAG_API: enriched_boost + past_sessions
RAG_API->>RAG_API: Retrieve each expansion and add RRF by doc_id
RAG_API->>INTEG: compute_memory_boost(doc_id, past_sessions)
INTEG-->>RAG_API: memory_boost
RAG_API->>RAG_API: Sort by final_score and assign rank
RAG_API-->>CLI: 200 OK {results, memory_context, session_id}
CLI->>RAG_API: POST /api/v2/rag/feedback {session_id, rating}
RAG_API->>INTEG: record_feedback(session_id, rating)
INTEG->>LTM: store_with_snn_gating(reward=rating/5)
LTM-->>INTEG: RetentionSummary(memory_id, importance)
INTEG-->>RAG_API: memory_id + importance
RAG_API-->>CLI: 200 OK {feedback_id, memory_id, importance}
API Endpoint Integration Matrix:
| RAG phase | Implementation call | Purpose |
|---|---|---|
| Query enrichment | RAGMemoryIntegrator.enrich_query_context() | Fetch semantic concepts and past sessions |
| Expansion aggregation | RRF accumulation in rag_v2_api | Add score when the same doc_id appears in multiple expansions |
| Memory reranking | RAGMemoryIntegrator.compute_memory_boost() | Add memory_boost into final_score |
| Feedback storage | RAGMemoryIntegrator.record_feedback() | Store reward = rating/5 through the long-term memory path |
| External memory management | /api/memory/* | REST API for non-RAG clients and operations |
Extended Response Format:
{
"status": "success",
"query": "Project EV-2024-001 progress report",
"language": "en",
"session_id": "rag_20260520_143022_001",
"results": [
{
"id": "DOC_A",
"rank": 1,
"score": 0.259,
"memory_boost": 0.146,
"source": "rrf+memory",
"text": "Project EV-2024-001 is...",
"metadata": {"project_id": "EV-2024-001"}
}
],
"memory_context": {
"semantic_concepts": ["EV-2024-001", "Q-PFC Loop"],
"past_sessions_used": 2,
"entity_boost_enriched": {"EV-2024-001": 7.05}
},
"execution_time_ms": 187
}
Feedback Response:
{
"status": "success",
"feedback_id": "fb_12345",
"memory_id": "ep_20260520_143155",
"importance": 0.72,
"recorded_at": "2026-05-20T14:31:55Z"
}
5. Search Algorithm
5.1 Language-Specific Search Strategy
Japanese Query Case (v3.0: Memory-Enhanced)
Query: "Project EV-2024-001 progress report"
↓
[Normalize] → "Project EV-2024-001 progress report"
↓
[Tokenize with Sudachi] → ["Project", "EV-2024-001", "progress", "report"]
↓
[Entity Extract] → {PROJECT_ID: ["EV-2024-001"]}
↓
[SemanticMemory.retrieve_semantic_knowledge("EV-2024-001", top_k=5)]
  → Related concepts: [("Q-PFC Loop", 0.82), ("AEG Phase4", 0.71)]
  → Boost enrichment: PROJECT_ID = 5.0 × (1 + 0.82 × 0.5) = 7.05
↓
[EpisodicMemory.integrate_episodic_semantic(query_context)]
  → semantic_concepts: ["EV-2024-001", "Q-PFC Loop"]
  → semantic_knowledge: ["EV-2024-001 is Q-PFC control integration in Phase4"]
↓
[Query Expansion (Optional)]
  - Dictionary + LLM: ["EV-2024-001 progress", "Phase 4 status report"]
↓
[Dual Search]
  - Elasticsearch: BM25 (Sudachi + enriched Boost 7.05)
  - Milvus: L2 384-dim vector search
↓
[RRF Fusion + Entity Boost (7.05)] → Ranking
↓
[EpisodicMemory.retrieve_memories(query_context, top_k=3)]
  → Past sessions: [{doc_ids:["DOC_A","DOC_B"], reward:0.8, score:0.91}, ...]
  → Add memory_boost to the scores of documents rated highly in past sessions
↓
[Top-5 Documents] → Response generation
↓
[User rating: rating=4] → reward = 0.8
↓
[LongTermMemoryModule.store_spike_episode(spike_seq, context, reward=0.8)]
  → RecurrentRetentionCircuit (GRU) → importance = 0.72
  → EpisodicMemory.store_experience(importance=0.72)
↓
[Store to long-term memory: memory_id="epi_20260520_001"] (used in future similar queries)
English Query Case
Query: "project report"
↓
[Language Detect] → "en"
↓
[Normalize] → No Japanese processing
↓
[Tokenize] → ["project", "report"]
↓
[Dual Search with English Analyzer]
- Elasticsearch: Standard analyzer
- Milvus: Same as above
↓
[RRF Fusion] → Merged ranking
5.2 Ranking Algorithm
Score Computation:
rrf_score(doc) = Σ 1 / (60 + rank_i)
base_score = rrf_score * entity_multiplier
Where:
rank_i is the rank from each BM25 / vector / query-expansion path
entity_multiplier is the maximum SemanticMemory-enriched entity boost
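A minimal sketch of the score computation, assuming 1-based ranks (consistent with the RRF value of about 0.016 for a top-ranked document in section 5.3) and one ranked list of doc_ids per BM25, vector, or query-expansion path. The function names are illustrative, not those used in rag_v2_api.py.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

RRF_K = 60


def rrf_accumulate(ranked_lists: Iterable[List[str]], k: int = RRF_K) -> Dict[str, float]:
    """Sum 1 / (k + rank) per doc_id across every retrieval path."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return dict(scores)


def base_scores(rrf: Dict[str, float], entity_multiplier: float = 1.0) -> Dict[str, float]:
    """base_score = rrf_score * entity_multiplier."""
    return {doc_id: score * entity_multiplier for doc_id, score in rrf.items()}


# Three paths: BM25, vector search, and one query expansion.
rrf = rrf_accumulate([["DOC_A", "DOC_B"], ["DOC_B", "DOC_A"], ["DOC_A"]])
scores = base_scores(rrf, entity_multiplier=7.05)
```

A document that appears in several paths accumulates a contribution from each, which is why DOC_A outranks DOC_B here even though each is ranked first by one path.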
5.3 Memory-Enhanced Ranking
v3.0 adds a memory layer to RRF scoring for dynamic boosting of previously high-rated documents.
Scoring Formula (v3.0)
final_score = rrf_score * entity_boost_enriched + memory_boost
| Variable | Description | Range |
|---|---|---|
| rrf_score | RRF fusion score (k=60) | 0 to 1 |
| entity_boost_enriched | Boost factor after SemanticMemory enrichment | 1.0 to 10.0 |
| memory_boost | Score correction from past sessions; alpha=0.2 is already applied internally | 0 to 0.3 |
memory_boost Calculation
from typing import Any, Dict, List


def compute_memory_boost(
    doc_id: str,
    past_sessions: List[Dict[str, Any]],
    alpha: float = 0.2,
) -> float:
    """
    Add score from high-rated documents in past similar sessions.

    Args:
        doc_id: Target document ID
        past_sessions: Results from retrieve_similar_sessions()
        alpha: Memory score weight
    Returns:
        memory_boost: Additive score (0.0 to 0.3)
    """
    boost = 0.0
    for session in past_sessions:
        if doc_id in session.get("doc_ids", []):
            # Weight by past rating (reward) × similarity (score)
            boost += session["reward"] * session["score"] * alpha
    return min(boost, 0.3)  # Cap at 0.3 to prevent over-amplification
Memory-Enhanced Ranking Flow
flowchart TD
RRF_SCORES["RRF Scores<br/>{DOC_A: 0.016, DOC_B: 0.015, DOC_C: 0.012}"]
SM_BOOST["SemanticMemory<br/>enriched_boost<br/>{EV-2024-001: 7.05}"]
EP_SESSIONS["EpisodicMemory<br/>past_sessions<br/>[{doc_ids:[DOC_A,DOC_B], reward:0.8, score:0.91}]"]
RRF_SCORES --> APPLY_BOOST["Apply entity_boost<br/>DOC_A: 0.016 × 7.05 = 0.113"]
SM_BOOST --> APPLY_BOOST
APPLY_BOOST --> APPLY_MEM["Add memory_boost<br/>DOC_A: 0.113 + (0.8×0.91×0.2) = 0.259<br/>DOC_B: 0.106 + (0.8×0.91×0.2) = 0.252"]
EP_SESSIONS --> APPLY_MEM
APPLY_MEM --> FINAL["Final Ranking<br/>1st: DOC_A (0.259)<br/>2nd: DOC_B (0.252)<br/>3rd: DOC_C (0.085)"]
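The flow above can be reproduced end to end with the sketch below, which applies final_score = rrf_score × entity_boost_enriched + memory_boost and assigns ranks only after sorting. `rank_with_memory` is a hypothetical helper; `compute_memory_boost` repeats the logic from this section with the cap exposed as a parameter.

```python
from typing import Any, Dict, List


def compute_memory_boost(
    doc_id: str,
    past_sessions: List[Dict[str, Any]],
    alpha: float = 0.2,
    cap: float = 0.3,
) -> float:
    """Sum reward × similarity × alpha over past sessions containing doc_id."""
    boost = sum(
        s["reward"] * s["score"] * alpha
        for s in past_sessions
        if doc_id in s.get("doc_ids", [])
    )
    return min(boost, cap)


def rank_with_memory(
    rrf_scores: Dict[str, float],
    entity_boost: float,
    past_sessions: List[Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Compute final scores, sort descending, then assign 1-based ranks."""
    results = []
    for doc_id, rrf in rrf_scores.items():
        boost = compute_memory_boost(doc_id, past_sessions)
        results.append(
            {"id": doc_id, "score": rrf * entity_boost + boost, "memory_boost": boost}
        )
    results.sort(key=lambda r: r["score"], reverse=True)
    for rank, result in enumerate(results, start=1):
        result["rank"] = rank
    return results


ranked = rank_with_memory(
    {"DOC_A": 0.016, "DOC_B": 0.015, "DOC_C": 0.012},
    7.05,
    [{"doc_ids": ["DOC_A", "DOC_B"], "reward": 0.8, "score": 0.91}],
)
for r in ranked:
    print(r["rank"], r["id"], round(r["score"], 3))
```

Computed without intermediate rounding, the scores come out at about 0.258, 0.251, and 0.085. The flow shows 0.259 and 0.252 because it rounds the boosted RRF score before adding memory_boost; the ranking order is the same.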
6. Evaluation & Testing
6.1 Test Cases
Test Set Composition:
| Category | Cases | Examples |
|---|---|---|
| Text Variations | 15 | "盆休み", "お盆休み" |
| Named Entities | 30+ | "EV-2024-001", "John Doe" |
| Total | 50+ | - |
6.2 Evaluation Metrics
| Metric | Target | Definition |
|---|---|---|
| MRR | > 0.7 | Mean Reciprocal Rank for variations |
| Recall | > 0.8 | Entity recall rate |
| Precision | > 0.8 | Entity precision rate |
| F1 Score | > 0.75 | Harmonic mean of recall/precision |
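For reference, MRR and NDCG@10 (the FR-1 accuracy metric) can be computed as in the sketch below. It assumes binary relevance, where each test case lists the relevant doc_ids; the project's evaluation script may use graded relevance or different conventions.

```python
import math
from typing import List, Sequence, Set, Tuple


def reciprocal_rank(ranked_ids: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant document, or 0.0 if none is returned."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: List[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average reciprocal rank over (ranked_ids, relevant_ids) test cases."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(ids, rel) for ids, rel in runs) / len(runs)


def ndcg_at_k(ranked_ids: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """NDCG@k with binary gains (1 for a relevant document, 0 otherwise)."""
    dcg = sum(
        1.0 / math.log2(position + 1)
        for position, doc_id in enumerate(ranked_ids[:k], start=1)
        if doc_id in relevant
    )
    ideal = sum(
        1.0 / math.log2(position + 1)
        for position in range(1, min(len(relevant), k) + 1)
    )
    return dcg / ideal if ideal else 0.0
```

A test case where the only relevant document is returned second contributes 0.5 to MRR, so an MRR target above 0.7 requires most variation queries to place the correct document first.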
7. Deployment
7.1 Environment Requirements
Server:
CPU: 8 cores
Memory: 32GB
Storage: 500GB
GPU: Optional (CUDA 11.8+)
Software:
Python: 3.10+
Elasticsearch: 8.0+
Milvus: 2.3+
7.2 Installation
pip install -r requirements.txt
docker build -t rag-v2:latest .
docker push registry/rag-v2:latest
8. Troubleshooting
Q: MRR < 0.5
Solution: Upgrade Sudachi, adjust weights, enable Query Expansion
Q: Entity Recall < 0.8
Solution: Verify NER model, check Elasticsearch mapping, increase entity boost
Q: Query latency > 500ms
Solution: Disable Query Expansion, reduce Milvus probe count
9. Appendix
9.1 Dependencies List
RAG Pipeline Dependencies:
sudachipy==0.7.5
sudachipy-dict-small==20240716
tner==1.8.0
elasticsearch==8.6.2
pymilvus==2.4.1
sentence-transformers==2.2.2
transformers==4.34.0
pandas==2.0.0
numpy==1.23.0
pyyaml==6.0
torch>=2.0.0
EvoSpikeNet Core SDK Dependencies (v3.0 Added):
evospikenet/episodic_memory.py # EpisodicMemory, SemanticMemoryEntry
evospikenet/long_term_memory.py # LongTermMemoryModule
evospikenet/forgetting_controller.py # ForgettingController
evospikenet/snn_memory_extension.py # LargeScaleSpikeReservoir
evospikenet/api_modules/memory_api.py # REST API endpoints
Configuration Files (New):
config/memory_config.yaml # RAGMemoryIntegrator configuration
config/domain_terms.yaml # Domain vocabulary for SemanticMemory loading
9.2 Related Documentation
- RAG_SYSTEM_DETAILED.md - Detailed RAG system design
- RAG_REMAINING_FUNCTIONALITY.md - Future phases (Phase 6-11)
- RAG_JAPANESE_EVALUATE_GUIDE.md - Evaluation guide
- EPISODIC_MEMORY_IMPLEMENTATION.md - EvoSpikeNet episodic memory implementation details
Core SDK Source References:
- evospikenet/episodic_memory.py - EpisodicMemory, SemanticMemoryEntry, EpisodicMemoryEntry classes
- evospikenet/long_term_memory.py - LongTermMemoryModule, RetentionSummary classes
- evospikenet/api_modules/memory_api.py - REST endpoint implementations
Document Version: 3.1
Last Updated: 2026-05-23
Approved By: RAG Development Team
Status: Implemented with Operational Hardening
v3.0 Changes: EvoSpikeNet Core memory system (SemanticMemory / EpisodicMemory / LongTermMemoryModule) integration added. Transition to Memory-Augmented RAG architecture.
2026-05-22 Update: Reflected the RAG v2 preprocessing health endpoint, fail-closed runtime policy, QueryExpander quality guards, and observability updates.