
EvoSpikeNet RAG System - Japanese Search Optimization Specification (Detailed)

Version: 3.1
Date: 2026-05-23
Status: Implemented with Operational Hardening
Document ID: RAG-JP-SPEC-V3.1-001
v3.1 Changes: Synchronized the API contract with the current implementation. Reflected RRF accumulation across query expansions, final-score reranking with memory_boost, and Unit / Integration / System / E2E contract tests.


Table of Contents

  1. System Overview
  2. Requirements Definition
  3. Technical Specification Details
     3.5 EvoSpikeNet Core Memory System Integration
  4. API Specification
     4.3 Memory-Integrated API Flow
  5. Search Algorithm
     5.3 Memory-Enhanced Ranking
  6. Evaluation & Testing
  7. Deployment
  8. Troubleshooting
  9. Appendix

1. System Overview

1.1 Purpose

Improve Japanese search accuracy of the EvoSpikeNet-Core RAG system by addressing:

  • Text Variations: Mutual search for "盆休み" and "お盆休み"
  • Named Entities: Accurate search for project ID "EV-2024-001"
  • Synonyms: Unified search for "会議", "ミーティング", "カンファレンス"
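
The variation and synonym cases above can be illustrated with a small token-level expansion sketch. The synonym groups and the function names `build_synonym_index` and `expand_tokens` are hypothetical and are not taken from the QueryExpander implementation; the input is assumed to be already tokenized.

```python
from typing import Dict, List

# Hypothetical synonym groups for illustration only; a real deployment
# would load these from an external dictionary.
SYNONYM_GROUPS: List[List[str]] = [
    ["会議", "ミーティング", "カンファレンス"],
    ["盆休み", "お盆休み"],
]


def build_synonym_index(groups: List[List[str]]) -> Dict[str, List[str]]:
    """Map every term to the full synonym group it belongs to."""
    index: Dict[str, List[str]] = {}
    for group in groups:
        for term in group:
            index[term] = group
    return index


def expand_tokens(tokens: List[str], index: Dict[str, List[str]]) -> List[List[str]]:
    """Return the original token list plus one variant per synonym substitution."""
    variants: List[List[str]] = [list(tokens)]
    for position, token in enumerate(tokens):
        for alternative in index.get(token, []):
            if alternative == token:
                continue
            variant = list(tokens)
            variant[position] = alternative
            if variant not in variants:
                variants.append(variant)
    return variants


index = build_synonym_index(SYNONYM_GROUPS)
for variant in expand_tokens(["来週", "の", "会議"], index):
    print("".join(variant))
```

Working on tokens rather than raw substrings avoids rewriting "お盆休み" into a doubled prefix when "盆休み" matches inside it.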

1.2 Scope

| Target | Included | Excluded |
| --- | --- | --- |
| Language | Japanese (primary), English | Chinese, Korean |
| Search | Hybrid (BM25 + Vector) | Full-text only |
| Features | RAG Pipeline | Generative Model |
| Scale | 1M+ documents | Real-time Streaming |

1.3 Architecture Overview (v3.0: Memory-Augmented)

flowchart TD
    Q["User Query<br/>(Japanese)"] --> DETECT["Language Detection"]
    DETECT --> NER["NER/Entity Extraction"]
    NER --> SM_Q["SemanticMemory<br/>.retrieve_semantic_knowledge()"]
    SM_Q --> EP_ENR["EpisodicMemory<br/>.integrate_episodic_semantic()"]

    EP_ENR --> NORM["Normalize & Expand"]
    NORM --> ES["Elasticsearch<br/>BM25 + Entity Boost"]
    NORM --> MV["Milvus<br/>Vector Search"]

    ES --> RRF["RRF Fusion<br/>+ Boost"]
    MV --> RRF

    RRF --> EP_RET["EpisodicMemory<br/>.retrieve_memories()"]
    EP_RET --> STORE["Reranking with<br/>Memory Scores"]

    STORE --> TOP["Top-K Results"]
    TOP --> RESP["Generate Response"]

    RESP --> RATE["User Feedback<br/>(rating)"]
    RATE --> LTM["LongTermMemoryModule<br/>.store_spike_episode()"]

    style SM_Q fill:#e1f5ff
    style EP_ENR fill:#e1f5ff
    style EP_RET fill:#e1f5ff
    style LTM fill:#f3e5f5
    style STORE fill:#fff3e0

1.4 2026-05-23 Implementation Status and Operational Notes

The core scope of this specification is already implemented, primarily in EvoSpikeNet-Core/evospikenet/api_modules/rag_v2_api.py and EvoSpikeNet-Core/evospikenet/rag_memory_integrator.py. The points below summarize the current runtime behavior.

  • Implemented APIs:
    • POST /api/v2/rag/search
    • POST /api/v2/rag/feedback
    • GET /api/v2/rag/preprocessing/health
  • Preprocessing responsibilities are explicitly separated into:
    • SudachiTokenizer
    • EntityRecognizer
    • QueryExpander
  • Production-like policy:
    • In production/staging, unresolved dependencies are treated as fail-closed and the system does not fall back to placeholder embeddings or synthetic responses.
  • Embedding policy:
    • A real SentenceTransformer encoder is required and the embedding dimension is unified at 384.
  • QueryExpander:
    • Supports rule|llm|hybrid backends.
    • LLM output is accepted only as JSON in the form {"expansions": [...]}.
  • Quality guard:
    • Evaluates diversity and redundancy and automatically falls back to the rule backend when expansion quality is low.
  • Observability:
    • debug_info.preprocessing exposes quality metrics, guard statistics, history, and query_hash summaries.
    • Startup warmup and the health endpoint are used to verify preprocessing readiness.
  • Main operational settings:
    • RAG_V2_NER_*
    • RAG_V2_PREPROCESSING_*
    • RAG_V2_SUDACHI_*
    • RAG_V2_QUERY_EXPANDER_*
  • Future expansion items:
  • 2026-05-23 verification update:
    • Search results from query expansions are merged by adding RRF scores per doc_id.
    • memory_boost is added to the final score, and result ranks are assigned after final-score sorting.
    • POST /api/v2/rag/feedback requires session_id and returns memory_id plus importance after storage.
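
The rule that LLM output is accepted only as JSON of the form {"expansions": [...]} can be sketched as a strict parser with a fallback to rule-based expansions. This is a minimal sketch: the function names, the `max_items` limit, and the choice to reject extra keys are assumptions, not the behavior of the actual QueryExpander.

```python
import json
from typing import List, Optional


def parse_llm_expansions(raw: str, max_items: int = 5) -> Optional[List[str]]:
    """Return the expansions if raw is exactly {"expansions": [str, ...]}, else None."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return None
    # Assumption: any key other than "expansions" makes the payload invalid.
    if not isinstance(payload, dict) or set(payload) != {"expansions"}:
        return None
    items = payload["expansions"]
    if not isinstance(items, list) or not all(isinstance(i, str) for i in items):
        return None
    cleaned: List[str] = []
    for item in items:
        text = item.strip()
        if text and text not in cleaned:  # drop blanks and exact duplicates
            cleaned.append(text)
    return cleaned[:max_items]


def expand_with_fallback(raw_llm_output: str, rule_expansions: List[str]) -> List[str]:
    """Use the LLM expansions when they parse and are non-empty, else the rule output."""
    parsed = parse_llm_expansions(raw_llm_output)
    return parsed if parsed else list(rule_expansions)
```

The fallback here only covers malformed or empty output; the diversity and redundancy checks of the quality guard are not modeled.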

2. Requirements Definition

2.1 Functional Requirements

FR-1: Hybrid Search

Requirement: Integrate BM25 keyword search and vector search

Query latency: < 500ms @ top-10
Accuracy (NDCG@10): > 0.75 in production
Throughput: 100 queries/sec

Implementation: RRF (k=60) score integration
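
As an illustration of RRF score integration with k=60, the sketch below sums 1 / (k + rank) per doc_id across any number of ranked lists, which also covers accumulation across query expansions. The function name `rrf_fuse`, the 1-based ranks, and the tie-break by doc_id are assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def rrf_fuse(ranked_lists: Iterable[List[str]], k: int = 60) -> List[Tuple[str, float]]:
    """Fuse ranked doc_id lists with Reciprocal Rank Fusion."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in ranked_lists:
        # Ranks are assumed to start at 1 for the best hit in each list.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties are broken by doc_id for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


bm25_hits = ["DOC_A", "DOC_B", "DOC_C"]
vector_hits = ["DOC_B", "DOC_A", "DOC_D"]
for doc_id, score in rrf_fuse([bm25_hits, vector_hits]):
    print(doc_id, round(score, 4))
```

A document that appears in several lists accumulates one term per list, so agreement between BM25 and vector search is rewarded without comparing their raw scores.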

FR-2: Japanese-Specific Processing

Requirement: Handle Japanese text variations and synonyms

Variation Coverage: > 80%
Entity Recognition Recall: > 80%
Variation normalization overhead: < 5ms

Implementation:

  • Sudachi tokenization
  • NER entity extraction
  • Query Expansion for synonyms

FR-3: Evaluation Framework

Requirement: Regular evaluation of search accuracy

Test cases: >= 50
Evaluation metrics: MRR, NDCG, Recall, Precision
Automated evaluation script: Implemented
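
For the NDCG metric listed above, a minimal sketch of NDCG@k with graded relevance follows. The function names and the linear-gain form of DCG are assumptions; the project's automated evaluation script may use a different variant.

```python
import math
from typing import Dict, List


def dcg_at_k(gains: List[float], k: int) -> float:
    """Discounted cumulative gain over the first k positions (linear gain)."""
    return sum(gain / math.log2(position + 2) for position, gain in enumerate(gains[:k]))


def ndcg_at_k(ranked_doc_ids: List[str], relevance: Dict[str, float], k: int = 10) -> float:
    """NDCG@k for one query; relevance maps doc_id to a graded judgment."""
    gains = [relevance.get(doc_id, 0.0) for doc_id in ranked_doc_ids]
    ideal_gains = sorted(relevance.values(), reverse=True)
    ideal = dcg_at_k(ideal_gains, k)
    if ideal == 0:
        return 0.0  # no relevant documents judged for this query
    return dcg_at_k(gains, k) / ideal
```

Averaging `ndcg_at_k` over all test queries gives the figure to compare against the NDCG@10 target.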

2.2 Non-Functional Requirements

NFR-1: Performance

| Metric | Requirement | Measurement |
| --- | --- | --- |
| P95 Latency | < 300ms | query latency monitor |
| P99 Latency | < 800ms | query latency monitor |
| Throughput | >= 100 q/s | load testing |
| Index Rebuild | < 4h | full rebuild time |
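
To show how the P95 and P99 targets could be checked from collected latency samples, here is a small sketch using the nearest-rank percentile. The function names are hypothetical, and the production latency monitor may compute percentiles differently.

```python
import math
from typing import List


def percentile(samples_ms: List[float], pct: float) -> float:
    """Nearest-rank percentile of the samples (pct in the range (0, 100])."""
    if not samples_ms:
        raise ValueError("no latency samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples_ms)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def meets_latency_targets(
    samples_ms: List[float],
    p95_limit_ms: float = 300.0,
    p99_limit_ms: float = 800.0,
) -> bool:
    """Check the NFR-1 limits: P95 < 300 ms and P99 < 800 ms."""
    return (
        percentile(samples_ms, 95) < p95_limit_ms
        and percentile(samples_ms, 99) < p99_limit_ms
    )
```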

NFR-2: Availability

Availability: >= 99.5%
MTTR (Mean Time To Recovery): < 5 min
Backup frequency: Daily

NFR-3: Scalability

Document Scaling: Up to 10M docs
Language Support: Extensible to 3+ languages
Customization: External rule/dictionary management

NFR-4: Security

Access Control: Role-based (RBAC)
Data Encryption: in-transit (TLS) and at-rest
Audit Logging: All query access

3. Technical Specification Details

3.1 - 3.4 [Previous sections: Language Detection, Tokenization, NER, Query Expansion remain unchanged]


3.5 EvoSpikeNet Core Memory System Integration

v3.0 integrates three memory modules from the EvoSpikeNet-Core SDK to enrich RAG query context and to close the feedback loop.

3.5.1 Core Components Overview

| Component | Module | Purpose |
| --- | --- | --- |
| SemanticMemory | episodic_memory.py | Entity concept knowledge graph + semantic enrichment |
| EpisodicMemory | episodic_memory.py | Search session storage & retrieval |
| LongTermMemoryModule | long_term_memory.py | SNN gating + importance scoring + forgetting |
| RAGSemanticLoader | New | Load domain vocabulary into SemanticMemory |
| RAGEpisodicRecorder | New | Record search sessions & user feedback |
| RAGLongTermIntegrator | New | SNN-gated storage & compression |
| RAGMemoryIntegrator | New | Façade orchestrating all three |

3.5.2 Integration Sequence

sequenceDiagram
    actor User
    participant RAG_API
    participant RAGMemoryIntegrator
    participant SemanticMemory
    participant EpisodicMemory
    participant LongTermMemoryModule

    User->>RAG_API: Search query + context
    RAG_API->>RAGMemoryIntegrator: enrich_query_context()
    RAGMemoryIntegrator->>SemanticMemory: retrieve_semantic_knowledge(entities)
    SemanticMemory-->>RAGMemoryIntegrator: [related concepts + embeddings]
    RAGMemoryIntegrator->>EpisodicMemory: integrate_episodic_semantic()
    EpisodicMemory-->>RAGMemoryIntegrator: enriched_context
    RAGMemoryIntegrator->>EpisodicMemory: retrieve_memories(query, top_k=3)
    EpisodicMemory-->>RAGMemoryIntegrator: [past sessions + scores]
    RAGMemoryIntegrator-->>RAG_API: enhanced_query + memory_boost
    RAG_API-->>User: Top-K results + memory_context

    User->>RAG_API: Provide rating (1-5)
    RAG_API->>RAGMemoryIntegrator: record_feedback(session_id, rating)
    RAGMemoryIntegrator->>EpisodicMemory: store_experience(reward=rating/5)
    EpisodicMemory->>LongTermMemoryModule: store_spike_episode(importance)
    LongTermMemoryModule-->>EpisodicMemory: memory_id

3.5.3 SemanticMemory Specification

Purpose: Maintain a concept knowledge graph for entity enrichment.

Field Mapping:

| SemanticMemoryEntry Field | RAG Usage |
| --- | --- |
| concept | Entity (PROJECT_ID, PRODUCT, PERSON, ORG, DATE) |
| definition | Domain description from knowledge base |
| related_concepts | Synonyms / related entities |
| concept_embedding | 384-dim MiniLM vector |
| confidence | 0.7 - 1.0 (knowledge certainty) |

Integration: RAGSemanticLoader class

from typing import List, Tuple

from evospikenet.episodic_memory import SemanticMemoryEntry


class RAGSemanticLoader:
    """
    Load domain vocabulary and entity knowledge into SemanticMemory.
    """

    def load_domain_vocabulary(
        self,
        vocabulary_file: str,
        embedding_model,
    ) -> int:
        """
        Load project entities (EV-2024-001, etc.) and domain terms.
        Returns: count of loaded concepts.
        """
        pass

    def enrich_entity_boost(
        self,
        entity_id: str,
        retrieved_concepts: List[Tuple[SemanticMemoryEntry, float]],
    ) -> float:
        """
        Compute dynamic entity boost from semantic relatedness.
        Returns: boost factor (1.0 - 10.0).
        """
        pass
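
The specification fixes only the output range of the entity boost (1.0 to 10.0), not its formula. The sketch below is one hypothetical way to map semantic relatedness into that range, using a confidence-weighted mean of similarity scores; the function name, the (confidence, similarity) input pairs, and the linear mapping are all assumptions.

```python
from typing import List, Tuple

MIN_BOOST = 1.0
MAX_BOOST = 10.0


def entity_boost(retrieved: List[Tuple[float, float]]) -> float:
    """Map (confidence, similarity) pairs, each in 0..1, to a boost in [1.0, 10.0]."""
    if not retrieved:
        return MIN_BOOST  # no related concepts: neutral boost
    total_confidence = sum(confidence for confidence, _ in retrieved)
    if total_confidence <= 0:
        return MIN_BOOST
    # Confidence-weighted mean similarity (an assumed relatedness measure).
    relatedness = sum(c * s for c, s in retrieved) / total_confidence
    boost = MIN_BOOST + (MAX_BOOST - MIN_BOOST) * relatedness
    return max(MIN_BOOST, min(MAX_BOOST, boost))
```

Clamping at both ends keeps the multiplier inside the documented range even if upstream similarity scores fall outside 0..1.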

3.5.4 EpisodicMemory Specification

Purpose: Store search sessions and retrieve similar past sessions for reranking.

Field Mapping:

| EpisodicMemoryEntry Field | RAG Session Data |
| --- | --- |
| context | {query, user_id, timestamp, extracted_entities} |
| action | Search pipeline (ES + Milvus) |
| outcome | Retrieved doc_ids (top-10) |
| reward | User rating / 5 |
| context_embedding | Query 384-dim MiniLM vector |

Integration: RAGEpisodicRecorder class

from typing import Any, Dict, List

import numpy as np


class RAGEpisodicRecorder:
    """
    Record search sessions and retrieve similar past sessions.
    """

    def record_session(
        self,
        query: str,
        entity_extracted: Dict,
        doc_ids_retrieved: List[str],
        ranking_scores: List[float],
        query_embedding: np.ndarray,  # 384-dim
    ) -> str:
        """
        Record search session to episodic memory.
        Returns: session_id for later feedback linkage.
        """
        pass

    def retrieve_similar_sessions(
        self,
        query: str,
        query_embedding: np.ndarray,
        top_k: int = 3,
    ) -> List[Dict[str, Any]]:
        """
        Retrieve past sessions similar to current query.
        Returns: [{doc_ids, reward, score}, ...] for reranking.
        """
        pass
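
Retrieval of similar past sessions can be pictured as a cosine-similarity ranking over stored query embeddings. The sketch below uses plain Python lists instead of NumPy arrays and an in-memory list of sessions; the session dictionary layout and function names are assumptions, and the real EpisodicMemory retrieval may differ.

```python
import math
from typing import Any, Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    if len(a) != len(b):
        raise ValueError("embedding dimensions differ")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_similar_sessions(
    query_embedding: Sequence[float],
    sessions: List[Dict[str, Any]],
    top_k: int = 3,
) -> List[Dict[str, Any]]:
    """Return the top_k sessions as {doc_ids, reward, score}, best match first."""
    scored = [
        {
            "doc_ids": session["doc_ids"],
            "reward": session["reward"],
            "score": cosine_similarity(query_embedding, session["embedding"]),
        }
        for session in sessions
    ]
    scored.sort(key=lambda item: item["score"], reverse=True)
    return scored[:top_k]
```

The returned dictionaries carry the same three keys that the reranking step reads: doc_ids, reward, and score.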

3.5.5 LongTermMemoryModule Specification

Purpose: Importance gating via SNN + automatic consolidation/forgetting.

Key Features:

  • Spiking Neural Network (SNN) gating of episodic memories
  • RecurrentRetentionCircuit (GRU-based) for temporal decay
  • ForgettingController: forget threshold = 0.3, consolidation threshold = 0.7

SNN Gating Logic:

flowchart TD
    EM["EpisodicMemory Entry<br/>(reward, timestamp)"] --> SPIKE{"reward > 0.6?<br/>(Spike Fire)"}
    SPIKE -->|YES| WEIGHT["Weight += learning_rate<br/>× (reward - baseline)"]
    SPIKE -->|NO| DECAY["Weight *= decay_rate<br/>(temporal decay)"]
    WEIGHT --> STORE["Store to LongTermMemory<br/>(high importance)"]
    DECAY --> FORGET{"importance < 0.3?"}
    FORGET -->|YES| DROP["Forget"]
    FORGET -->|NO| STORE
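
The gating flow above can be written as a single decision step. This sketch is a plain threshold gate, not a spiking network: it treats the weight as the importance value, and the learning_rate, baseline, and decay_rate defaults are placeholder assumptions because the specification does not state them.

```python
from dataclasses import dataclass


@dataclass
class GateResult:
    weight: float  # updated importance weight
    fired: bool    # True when the reward crossed the spike threshold
    stored: bool   # False when the entry should be forgotten


def gate_memory(
    weight: float,
    reward: float,
    *,
    spike_threshold: float = 0.6,
    forget_threshold: float = 0.3,
    learning_rate: float = 0.1,  # assumed value
    baseline: float = 0.5,       # assumed value
    decay_rate: float = 0.9,     # assumed value
) -> GateResult:
    """Apply one step of the reward-gated update shown in the flowchart."""
    if reward > spike_threshold:
        # Spike fires: reinforce in proportion to reward above baseline.
        return GateResult(weight + learning_rate * (reward - baseline), True, True)
    # No spike: apply temporal decay, then check the forget threshold.
    decayed = weight * decay_rate
    return GateResult(decayed, False, decayed >= forget_threshold)
```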

Integration: RAGLongTermIntegrator class

import numpy as np


class RAGLongTermIntegrator:
    """
    SNN-gated storage and compression for long-term memory.
    """

    def store_with_snn_gating(
        self,
        episodic_entry,
        spike_sequence: np.ndarray,  # SNN firing pattern
        reward: float,  # User rating / 5
    ) -> str:
        """
        Store episodic entry with SNN importance gating.
        Returns: memory_id for retrieval.
        """
        pass


# Forgetting & Compression Policy
FORGETTING_POLICY = {
    "forget_threshold": 0.3,         # Drop if importance < 0.3
    "consolidation_threshold": 0.7,  # Compress if importance > 0.7
    "max_memories": 10000,           # Enforce storage limit
    "temporal_decay_per_day": 0.05,  # 5% importance decay per day
}
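
To make the policy values concrete, the sketch below applies the daily decay and classifies a memory against the two thresholds. It assumes the 5% decay compounds multiplicatively per day, which the policy does not state, and the function names are hypothetical.

```python
from typing import Dict

POLICY: Dict[str, float] = {
    "forget_threshold": 0.3,
    "consolidation_threshold": 0.7,
    "temporal_decay_per_day": 0.05,
}


def decayed_importance(importance: float, days_elapsed: int, policy: Dict[str, float] = POLICY) -> float:
    """Importance after days_elapsed days, assuming compounding daily decay."""
    return importance * (1.0 - policy["temporal_decay_per_day"]) ** days_elapsed


def classify(importance: float, policy: Dict[str, float] = POLICY) -> str:
    """Return 'forget', 'consolidate', or 'retain' for an importance value."""
    if importance < policy["forget_threshold"]:
        return "forget"
    if importance > policy["consolidation_threshold"]:
        return "consolidate"
    return "retain"


for days in (0, 1, 20):
    value = decayed_importance(0.72, days)
    print(days, round(value, 3), classify(value))
```

Under these assumptions a memory stored at importance 0.72 leaves the consolidation band after one day without reinforcement and falls below the forget threshold within 20 days.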

3.5.6 Embedding Dimension Unification

All memory embeddings must be 384-dimensional, matching the output of the MiniLM encoder paraphrase-multilingual-MiniLM-L12-v2.

| Module | Embedding Size | Adjustment |
| --- | --- | --- |
| Query embedding (MiniLM) | 384-dim | ✓ Native |
| SemanticMemory (default) | 512-dim | → Override to 384-dim |
| EpisodicMemory (default) | 512-dim | → Override to 384-dim |
| LongTermMemoryModule (default) | 512-dim | → Override state_dim=384, embedding_dim=384 |
| Milvus vector DB | 384-dim | ✓ Matching |

Initialization Code:

from evospikenet.episodic_memory import EpisodicMemory
from evospikenet.long_term_memory import LongTermMemoryModule

# Override default 512-dim → 384-dim
episodic_mem = EpisodicMemory(embedding_dim=384, max_memories=10000)
ltm = LongTermMemoryModule(state_dim=384, embedding_dim=384)
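
A dimension mismatch between modules is easiest to catch with an explicit check before vectors are stored. The helper below is a hypothetical guard, not part of the SDK; it accepts any sequence with a length, so it works for lists and for one-dimensional arrays.

```python
from typing import Sequence

EXPECTED_DIM = 384


def check_embedding_dim(name: str, vector: Sequence[float], expected_dim: int = EXPECTED_DIM) -> None:
    """Raise ValueError when an embedding does not have the unified dimension."""
    actual = len(vector)
    if actual != expected_dim:
        raise ValueError(f"{name}: expected {expected_dim}-dim embedding, got {actual}")


check_embedding_dim("query", [0.0] * 384)  # passes silently
```

Calling the guard once per module at startup would surface a module that is still running with its 512-dim default.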

3.5.7 RAGMemoryIntegrator Façade

Purpose: Single entry point orchestrating all three memory modules.

Class Diagram:

classDiagram
    class RAGMemoryIntegrator {
        -semantic_mem: SemanticMemory
        -episodic_mem: EpisodicMemory
        -ltm: LongTermMemoryModule
        -embedding_model

        +from_config(config_file) RAGMemoryIntegrator
        +enrich_query_context(query, entities) Dict
        +record_search_session(session_data) str
        +record_feedback(session_id, rating) Dict
        +retrieve_memory_boost(query, top_k) float
    }

    RAGMemoryIntegrator --> SemanticMemory
    RAGMemoryIntegrator --> EpisodicMemory
    RAGMemoryIntegrator --> LongTermMemoryModule

Key Methods:

from typing import Any, Dict, List, Optional


class RAGMemoryIntegrator:

    @staticmethod
    def from_config(config_file: str) -> "RAGMemoryIntegrator":
        """Initialize all three memory modules from config."""
        pass

    def enrich_query_context(
        self,
        query: str,
        extracted_entities: Dict[str, List[str]],
    ) -> Dict[str, Any]:
        """
        Enrich query with semantic knowledge and past session context.

        Returns:
            {
                "semantic_concepts": ["EV-2024-001", ...],
                "semantic_knowledge": [...],
                "past_session_boost": 0.15,
                "memory_context": {...}
            }
        """
        pass

    def record_search_session(
        self,
        query: str,
        doc_ids: List[str],
        ranking_scores: List[float],
        user_id: Optional[str] = None,
    ) -> str:
        """
        Record search session to episodic memory.
        Returns: session_id for feedback linkage.
        """
        pass

    def record_feedback(
        self,
        session_id: str,
        rating: int,  # 1-5
    ) -> Dict[str, Any]:
        """
        Record user feedback (rating) and trigger LTM storage.
        Returns: {memory_id, importance, consolidation_status}
        """
        pass

4. API Specification

4.1 Search API

Endpoint: POST /api/v2/rag/search

Request:

{
  "query": "Project EV-2024-001 progress report",
  "top_k": 5,
  "language": "auto",
  "enable_query_expansion": true,
  "enable_entity_boosting": true,
  "return_debug_info": false
}

Response:

{
  "status": "success",
  "query": "Project EV-2024-001 progress report",
  "language": "en",
  "results": [
    {
      "id": "DOC_001",
      "rank": 1,
      "score": 0.8234,
      "source": "rrf",
      "text": "Project EV-2024-001 is...",
      "metadata": {
        "project_id": "EV-2024-001",
        "updated_at": "2026-05-20"
      }
    }
  ],
  "execution_time_ms": 123,
  "debug_info": null
}

Error Response:

{
  "status": "error",
  "code": "INVALID_QUERY",
  "message": "Query is too short (min 2 characters)",
  "details": {}
}
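
The INVALID_QUERY error above implies a length check on the query before the pipeline runs. A minimal sketch of that check follows; the function name is hypothetical, and trimming whitespace before counting characters is an assumption.

```python
from typing import Any, Dict, Optional


def validate_search_request(body: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return an error payload for an invalid query, or None when it is valid."""
    query = body.get("query")
    # Assumption: surrounding whitespace does not count toward the minimum.
    if not isinstance(query, str) or len(query.strip()) < 2:
        return {
            "status": "error",
            "code": "INVALID_QUERY",
            "message": "Query is too short (min 2 characters)",
            "details": {},
        }
    return None
```

Counting characters rather than bytes matters for Japanese input: a two-character query such as "会議" passes the check.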

4.2 Feedback API

Endpoint: POST /api/v2/rag/feedback

Request:

{
  "session_id": "rag_20260520_143022_001",
  "query": "Project EV-2024-001 progress report",
  "doc_id": "DOC_001",
  "rating": 5,
  "comment": "Exactly what I expected",
  "user_id": "user_123"
}

Response:

{
  "status": "success",
  "feedback_id": "fb_12345",
  "memory_id": "ep_20260520_143155",
  "importance": 0.72,
  "recorded_at": "2026-05-20T14:31:55Z"
}
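
The feedback contract requires session_id and converts the 1-5 rating into a reward of rating / 5 before storage. The sketch below shows that conversion with basic validation; the function names and error messages are hypothetical.

```python
from typing import Any, Dict


def rating_to_reward(rating: Any) -> float:
    """Convert a 1-5 integer rating to a reward in (0, 1]."""
    if isinstance(rating, bool) or not isinstance(rating, int) or not 1 <= rating <= 5:
        raise ValueError("rating must be an integer from 1 to 5")
    return rating / 5


def build_feedback_record(body: Dict[str, Any]) -> Dict[str, Any]:
    """Validate a feedback request body and derive the reward to store."""
    session_id = body.get("session_id")
    if not session_id:
        raise ValueError("session_id is required")
    return {"session_id": session_id, "reward": rating_to_reward(body.get("rating"))}
```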


4.3 Memory-Integrated API Flow

The current implementation integrates memory in-process through RAGMemoryIntegrator; the /api/memory/* endpoints remain available for external memory-management clients. The RAG v2 fast path does not call those REST endpoints.

Sequence Diagram (v3.1: End-to-end with in-process memory integration):

sequenceDiagram
    participant CLI as Client
    participant RAG_API as RAG API
    participant INTEG as RAGMemoryIntegrator
    participant EM as EpisodicMemory
    participant LTM as LongTermMemoryModule

    CLI->>RAG_API: POST /api/v2/rag/search {query, top_k}
    RAG_API->>INTEG: enrich_query_context(query_context)
    INTEG->>EM: integrate_episodic_semantic() and retrieve_memories(top_k=3)
    EM-->>INTEG: semantic_concepts + past_sessions
    INTEG-->>RAG_API: enriched_boost + past_sessions

    RAG_API->>RAG_API: Retrieve each expansion and add RRF by doc_id
    RAG_API->>INTEG: compute_memory_boost(doc_id, past_sessions)
    INTEG-->>RAG_API: memory_boost
    RAG_API->>RAG_API: Sort by final_score and assign rank

    RAG_API-->>CLI: 200 OK {results, memory_context, session_id}

    CLI->>RAG_API: POST /api/v2/rag/feedback {session_id, rating}
    RAG_API->>INTEG: record_feedback(session_id, rating)
    INTEG->>LTM: store_with_snn_gating(reward=rating/5)
    LTM-->>INTEG: RetentionSummary(memory_id, importance)
    INTEG-->>RAG_API: memory_id + importance
    RAG_API-->>CLI: 200 OK {feedback_id, memory_id, importance}

API Endpoint Integration Matrix:

| RAG phase | Implementation call | Purpose |
| --- | --- | --- |
| Query enrichment | RAGMemoryIntegrator.enrich_query_context() | Fetch semantic concepts and past sessions |
| Expansion aggregation | RRF accumulation in rag_v2_api | Add score when the same doc_id appears in multiple expansions |
| Memory reranking | RAGMemoryIntegrator.compute_memory_boost() | Add memory_boost into final_score |
| Feedback storage | RAGMemoryIntegrator.record_feedback() | Store reward = rating/5 through the long-term memory path |
| External memory management | /api/memory/* | REST API for non-RAG clients and operations |

Extended Response Format:

{
  "status": "success",
  "query": "Project EV-2024-001 progress report",
  "language": "en",
  "session_id": "rag_20260520_143022_001",
  "results": [
    {
      "id": "DOC_A",
      "rank": 1,
      "score": 0.259,
      "memory_boost": 0.146,
      "source": "rrf+memory",
      "text": "Project EV-2024-001 is...",
      "metadata": {"project_id": "EV-2024-001"}
    }
  ],
  "memory_context": {
    "semantic_concepts": ["EV-2024-001", "Q-PFC Loop"],
    "past_sessions_used": 2,
    "entity_boost_enriched": {"EV-2024-001": 7.05}
  },
  "execution_time_ms": 187
}

Feedback Response:

{
  "status": "success",
  "feedback_id": "fb_12345",
  "memory_id": "ep_20260520_143155",
  "importance": 0.72,
  "recorded_at": "2026-05-20T14:31:55Z"
}

5. Search Algorithm

5.1 Language-Specific Search Strategy

English Query Case

Query: "project report"
       ↓
[Language Detect] → "en"
       ↓
[Normalize] → No Japanese processing
       ↓
[Tokenize] → ["project", "report"]
       ↓
[Dual Search with English Analyzer]
  - Elasticsearch: Standard analyzer
  - Milvus: Same as above
       ↓
[RRF Fusion] → Merged ranking
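
The language-detection step that routes a query to the English or Japanese path can be approximated with a Unicode-range heuristic. This is only an illustrative stand-in: the actual detector is specified in the unchanged sections 3.1 to 3.4, and the function names here are hypothetical.

```python
def contains_japanese(text: str) -> bool:
    """True if the text has any hiragana, katakana, or CJK unified ideograph."""
    for char in text:
        code_point = ord(char)
        # U+3040-U+30FF covers hiragana and katakana; U+4E00-U+9FFF covers kanji.
        if 0x3040 <= code_point <= 0x30FF or 0x4E00 <= code_point <= 0x9FFF:
            return True
    return False


def detect_language(query: str) -> str:
    """Return 'ja' for queries containing Japanese script, otherwise 'en'."""
    return "ja" if contains_japanese(query) else "en"
```

Because kanji are shared with Chinese, this heuristic is only adequate while Chinese remains out of scope, as stated in section 1.2.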

5.2 Ranking Algorithm

Score Computation:

rrf_score(doc) = Σ 1 / (60 + rank_i)
base_score = rrf_score * entity_multiplier

Where:
    rank_i is the rank from each BM25 / vector / query-expansion path
    entity_multiplier is the maximum SemanticMemory-enriched entity boost

5.3 Memory-Enhanced Ranking

v3.0 adds a memory layer to RRF scoring for dynamic boosting of previously high-rated documents.

Scoring Formula (v3.0)

final_score = rrf_score * entity_boost_enriched + memory_boost

| Variable | Description | Range |
| --- | --- | --- |
| rrf_score | RRF fusion score (k=60) | 0 to 1 |
| entity_boost_enriched | Boost factor after SemanticMemory enrichment | 1.0 to 10.0 |
| memory_boost | Score correction from past sessions; alpha=0.2 is already applied internally | 0 to 0.3 |

memory_boost Calculation

from typing import Any, Dict, List


def compute_memory_boost(
    doc_id: str,
    past_sessions: List[Dict[str, Any]],
    alpha: float = 0.2,
) -> float:
    """
    Add score from high-rated documents in past similar sessions.

    Args:
        doc_id: Target document ID
        past_sessions: Results from retrieve_similar_sessions()
        alpha: Memory score weight

    Returns:
        memory_boost: Additive score (0.0 to 0.3)
    """
    boost = 0.0
    for session in past_sessions:
        if doc_id in session.get("doc_ids", []):
            # Weight by past rating (reward) × similarity (score)
            boost += session["reward"] * session["score"] * alpha
    return min(boost, 0.3)  # Cap at 0.3 to prevent over-amplification

Memory-Enhanced Ranking Flow

flowchart TD
    RRF_SCORES["RRF Scores\n{DOC_A: 0.016, DOC_B: 0.015, DOC_C: 0.012}"]
    SM_BOOST["SemanticMemory\nenriched_boost\n{EV-2024-001: 7.05}"]
    EP_SESSIONS["EpisodicMemory\npast_sessions\n[{doc_ids:[DOC_A,DOC_B], reward:0.8, score:0.91}]"]

    RRF_SCORES --> APPLY_BOOST["Apply entity_boost\nDOC_A: 0.016 × 7.05 = 0.113"]
    SM_BOOST --> APPLY_BOOST

    APPLY_BOOST --> APPLY_MEM["Add memory_boost\nDOC_A: 0.113 + (0.8×0.91×0.2) = 0.259\nDOC_B: 0.106 + (0.8×0.91×0.2) = 0.252"]
    EP_SESSIONS --> APPLY_MEM

    APPLY_MEM --> FINAL["Final Ranking\n1st: DOC_A (0.259)\n2nd: DOC_B (0.252)\n3rd: DOC_C (0.085)"]
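
The flow above can be reproduced end to end with the same inputs. The sketch restates compute_memory_boost so that it is self-contained; `rank_with_memory` is a hypothetical helper, and a single entity boost is applied to every document, as in the flowchart.

```python
from typing import Any, Dict, List, Tuple


def compute_memory_boost(doc_id: str, past_sessions: List[Dict[str, Any]], alpha: float = 0.2) -> float:
    """Additive boost from past similar sessions, capped at 0.3."""
    boost = 0.0
    for session in past_sessions:
        if doc_id in session.get("doc_ids", []):
            boost += session["reward"] * session["score"] * alpha
    return min(boost, 0.3)


def rank_with_memory(
    rrf_scores: Dict[str, float],
    entity_boost: float,
    past_sessions: List[Dict[str, Any]],
) -> List[Tuple[int, str, float]]:
    """Compute final_score per document, sort descending, then assign ranks."""
    scored = [
        (doc_id, rrf * entity_boost + compute_memory_boost(doc_id, past_sessions))
        for doc_id, rrf in rrf_scores.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [(rank, doc_id, score) for rank, (doc_id, score) in enumerate(scored, start=1)]


ranking = rank_with_memory(
    {"DOC_A": 0.016, "DOC_B": 0.015, "DOC_C": 0.012},
    entity_boost=7.05,
    past_sessions=[{"doc_ids": ["DOC_A", "DOC_B"], "reward": 0.8, "score": 0.91}],
)
for rank, doc_id, score in ranking:
    print(rank, doc_id, round(score, 3))
```

Computed without intermediate rounding, the final scores are approximately 0.258, 0.251, and 0.085. The flowchart rounds each step, so its totals differ in the third decimal place, while the resulting order is the same.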

6. Evaluation & Testing

6.1 Test Cases

Test Set Composition:

| Category | Cases | Examples |
| --- | --- | --- |
| Text Variations | 15 | "盆休み", "お盆休み" |
| Named Entities | 30+ | "EV-2024-001", "John Doe" |
| Total | 50+ | - |

6.2 Evaluation Metrics

| Metric | Target | Definition |
| --- | --- | --- |
| MRR | > 0.7 | Mean Reciprocal Rank for variations |
| Recall | > 0.8 | Entity recall rate |
| Precision | > 0.8 | Entity precision rate |
| F1 Score | > 0.75 | Harmonic mean of recall/precision |
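
The metrics above follow standard definitions, sketched here for reference. The function names and input shapes are assumptions and do not describe the project's evaluation script.

```python
from typing import List, Set, Tuple


def mean_reciprocal_rank(rankings: List[List[str]], relevant: List[Set[str]]) -> float:
    """Average of 1 / rank of the first relevant document per query (0 if none)."""
    if not rankings:
        return 0.0
    total = 0.0
    for ranked_doc_ids, relevant_ids in zip(rankings, relevant):
        for rank, doc_id in enumerate(ranked_doc_ids, start=1):
            if doc_id in relevant_ids:
                total += 1.0 / rank
                break
    return total / len(rankings)


def precision_recall_f1(predicted: Set[str], expected: Set[str]) -> Tuple[float, float, float]:
    """Entity-level precision, recall, and their harmonic mean (F1)."""
    true_positives = len(predicted & expected)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)
```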

7. Deployment

7.1 Environment Requirements

Server:

CPU: 8 cores
Memory: 32GB
Storage: 500GB
GPU: Optional (CUDA 11.8+)

Software:

Python: 3.10+
Elasticsearch: 8.0+
Milvus: 2.3+

7.2 Installation

pip install -r requirements.txt
docker build -t rag-v2:latest .
docker push registry/rag-v2:latest

8. Troubleshooting

Q: MRR < 0.5

Solution: Upgrade Sudachi, adjust weights, enable Query Expansion

Q: Entity Recall < 0.8

Solution: Verify NER model, check Elasticsearch mapping, increase entity boost

Q: Query latency > 500ms

Solution: Disable Query Expansion, reduce Milvus probe count


9. Appendix

9.1 Dependencies List

RAG Pipeline Dependencies:

sudachipy==0.7.5
sudachipy-dict-small==20240716
tner==1.8.0
elasticsearch==8.6.2
pymilvus==2.4.1
sentence-transformers==2.2.2
transformers==4.34.0
pandas==2.0.0
numpy==1.23.0
pyyaml==6.0
torch>=2.0.0

EvoSpikeNet Core SDK Dependencies (v3.0 Added):

evospikenet/episodic_memory.py     # EpisodicMemory, SemanticMemoryEntry
evospikenet/long_term_memory.py    # LongTermMemoryModule
evospikenet/forgetting_controller.py  # ForgettingController
evospikenet/snn_memory_extension.py   # LargeScaleSpikeReservoir
evospikenet/api_modules/memory_api.py  # REST API endpoints

Configuration Files (New):

config/memory_config.yaml    # RAGMemoryIntegrator configuration
config/domain_terms.yaml     # Domain vocabulary for SemanticMemory loading

Core SDK Source References:

  • evospikenet/episodic_memory.py - EpisodicMemory, SemanticMemoryEntry, EpisodicMemoryEntry classes
  • evospikenet/long_term_memory.py - LongTermMemoryModule, RetentionSummary classes
  • evospikenet/api_modules/memory_api.py - REST endpoint implementations


Document Version: 3.1
Last Updated: 2026-05-23
Approved By: RAG Development Team
Status: Implemented with Operational Hardening
v3.0 Changes: EvoSpikeNet Core memory system (SemanticMemory / EpisodicMemory / LongTermMemoryModule) integration added. Transition to Memory-Augmented RAG architecture.
2026-05-22 Update: Reflected the RAG v2 preprocessing health endpoint, fail-closed runtime policy, QueryExpander quality guards, and observability updates.