Evaluation SDK design/implementation guide
[!NOTE] For the latest implementation status, please refer to Functional Implementation Status (Remaining Functionality).
Overview
EvoSpikeNet's evaluation system (Evaluator) is an interface for quantifying the performance of individuals and teams in optimization processes such as evolutionary algorithms and reinforcement learning.
- BaseEvaluator is an abstract class; subclasses must implement evaluate_competitive (competition between individuals) and evaluate_team (team evaluation).
- For production operation, implement at least one concrete class such as DefaultEvaluator, then add your own evaluation strategy classes depending on the purpose.
Implementation requirements
- All Evaluators should inherit from BaseEvaluator and implement two methods:
  - evaluate_competitive(genome1, genome2) -> Dict[str, float]
  - evaluate_team(team: List[Any]) -> Dict[str, Any]
- The evaluation methods can contain whatever logic the purpose requires, such as brain instantiation, forward computation, and score calculation.
- Providing multiple Evaluator implementations as SDK samples is recommended, so that evaluation strategies can be switched and extended.
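The two-method contract above can be sketched with Python's standard abc module. This is a minimal illustration, not the actual evospikenet class. BaseEvaluatorSketch and IncompleteEvaluator are hypothetical names, and only the two method signatures are taken from this guide.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class BaseEvaluatorSketch(ABC):
    """Hypothetical stand-in for evospikenet.evaluators.BaseEvaluator."""

    @abstractmethod
    def evaluate_competitive(self, genome1: Any, genome2: Any) -> Dict[str, float]:
        """Compare two individuals and return named float scores."""

    @abstractmethod
    def evaluate_team(self, team: List[Any]) -> Dict[str, Any]:
        """Evaluate a whole team and return named results."""


class IncompleteEvaluator(BaseEvaluatorSketch):
    # Only one of the two required methods is implemented here.
    def evaluate_team(self, team: List[Any]) -> Dict[str, Any]:
        return {"team_size": len(team)}


try:
    IncompleteEvaluator()
    instantiated = True
except TypeError:
    # An ABC refuses to instantiate while abstract methods remain unimplemented.
    instantiated = False
```

If the real base class is declared this way, forgetting either method fails when the evaluator is constructed rather than in the middle of an optimization run.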
Sample implementation
- DefaultEvaluator: Score with the L2 norm of the brain forward output
- CooperativeEvaluator: Evaluate based on the average score of the entire team
- CustomEvaluator: Use any external metric or complex reward function
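To make the L2-norm idea concrete, here is a rough sketch of an evaluator that scores a genome by the L2 norm of a forward pass. The brain is an assumption made for illustration: a toy forward function with one tanh unit per genome weight. L2NormEvaluator is a hypothetical name, not the shipped DefaultEvaluator, which would inherit from BaseEvaluator and use the real brain.

```python
import math
from typing import Any, Dict, List, Sequence


def forward(genome: Sequence[float], inputs: Sequence[float]) -> List[float]:
    """Toy brain (assumption): one tanh unit per weight, fed the summed input."""
    total = sum(inputs)
    return [math.tanh(w * total) for w in genome]


def l2_norm(values: Sequence[float]) -> float:
    """Euclidean norm of a vector of outputs."""
    return math.sqrt(sum(v * v for v in values))


class L2NormEvaluator:
    """Illustrative only; scores a genome by the L2 norm of its forward output."""

    def __init__(self, inputs: Sequence[float]):
        self.inputs = list(inputs)

    def _score(self, genome: Sequence[float]) -> float:
        return l2_norm(forward(genome, self.inputs))

    def evaluate_competitive(self, genome1, genome2) -> Dict[str, float]:
        return {"score1": self._score(genome1), "score2": self._score(genome2)}

    def evaluate_team(self, team) -> Dict[str, Any]:
        scores = [self._score(g) for g in team]
        return {"individual_scores": scores, "team_score": sum(scores)}


evaluator = L2NormEvaluator(inputs=[1.0])
duel = evaluator.evaluate_competitive([1.0], [0.0])
```

The score here is a deterministic function of the genome and the inputs.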
Implementation example
from evospikenet.evaluators import BaseEvaluator


class CooperativeEvaluator(BaseEvaluator):
    def evaluate_competitive(self, genome1, genome2):
        # Competitive evaluation (e.g. winner determined by total score)
        ...

    def evaluate_team(self, team):
        # Average score for the whole team
        scores = [self._score(g) for g in team]
        mean_score = sum(scores) / len(scores)
        return {"team_score": mean_score, "individual_scores": scores}

    def _score(self, genome):
        # Score calculation using brain forward etc.
        ...
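The example leaves evaluate_competitive and _score elided. One possible way to fill them in is sketched below so that the class runs end to end. These parts are assumptions made for illustration: the local BaseEvaluator stand-in (so the sketch runs without evospikenet), the sum-of-values metric in _score, the "margin" key, and the guard against an empty team, which would otherwise raise ZeroDivisionError in the averaging step.

```python
from typing import Any, Dict, Sequence


class BaseEvaluator:
    """Local stand-in so this sketch runs without evospikenet installed."""


class CooperativeEvaluator(BaseEvaluator):
    def evaluate_competitive(self, genome1, genome2) -> Dict[str, float]:
        s1, s2 = self._score(genome1), self._score(genome2)
        # A positive margin means genome1 wins; zero means a tie.
        return {"score1": s1, "score2": s2, "margin": s1 - s2}

    def evaluate_team(self, team) -> Dict[str, Any]:
        if not team:
            # Averaging over an empty team is undefined, so fail loudly.
            raise ValueError("team must contain at least one genome")
        scores = [self._score(g) for g in team]
        mean_score = sum(scores) / len(scores)
        return {"team_score": mean_score, "individual_scores": scores}

    def _score(self, genome: Sequence[float]) -> float:
        # Placeholder metric (assumption): sum of genome values.
        # A real evaluator would run the brain forward pass here.
        return float(sum(genome))


evaluator = CooperativeEvaluator()
team_result = evaluator.evaluate_team([[1.0, 2.0], [3.0, 4.0]])
duel_result = evaluator.evaluate_competitive([1.0, 2.0], [3.0, 4.0])
```

Returning a signed margin keeps the result within Dict[str, float] while still encoding the winner.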
Notes
- Actual Evaluator implementations must not return only random numbers or dummy values; they must perform a meaningful score calculation.
- SDK users can add or replace their own Evaluators according to their needs.
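One way to let users add or swap Evaluators is a small name-to-class registry, sketched below. Nothing here is part of the EvoSpikeNet API. The registry, the register decorator, make_evaluator, both evaluator classes, and the sum-of-squares metric are all hypothetical, and _Base stands in for BaseEvaluator so that the sketch runs on its own.

```python
from typing import Any, Callable, Dict, List, Sequence

# Hypothetical registry mapping a strategy name to an evaluator class.
EVALUATORS: Dict[str, Callable[[], Any]] = {}


def register(name: str):
    """Class decorator that records an evaluator class under `name`."""
    def decorator(cls):
        EVALUATORS[name] = cls
        return cls
    return decorator


def score(genome: Sequence[float]) -> float:
    """Placeholder metric (assumption): sum of squared genome values."""
    return float(sum(v * v for v in genome))


class _Base:
    """Shared behaviour; stands in for BaseEvaluator in this sketch."""

    def aggregate(self, scores: List[float]) -> float:
        raise NotImplementedError

    def evaluate_competitive(self, genome1, genome2) -> Dict[str, float]:
        return {"score1": score(genome1), "score2": score(genome2)}

    def evaluate_team(self, team) -> Dict[str, Any]:
        scores = [score(g) for g in team]
        return {"team_score": self.aggregate(scores), "individual_scores": scores}


@register("mean")
class MeanTeamEvaluator(_Base):
    def aggregate(self, scores):
        return sum(scores) / len(scores)


@register("weakest")
class WeakestLinkEvaluator(_Base):
    def aggregate(self, scores):
        # The team is only as good as its lowest-scoring member.
        return min(scores)


def make_evaluator(name: str):
    if name not in EVALUATORS:
        raise ValueError(f"unknown evaluator {name!r}; choose from {sorted(EVALUATORS)}")
    return EVALUATORS[name]()


team = [[1.0, 2.0], [3.0]]
mean_result = make_evaluator("mean").evaluate_team(team)
weakest_result = make_evaluator("weakest").evaluate_team(team)
```

With this layout, switching strategy is a one-string configuration change, and a new Evaluator needs only a decorated class.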