Recommend Arena.
Twelve recommendation-system designs benchmarked head-to-head on the same dataset and ground-truth queries.
Python/ChromaDB/NetworkX/Claude Code
What it is
Twelve recommender architectures — knowledge graph, pure embedding, LLM-as-judge, hybrid, TF-IDF, Bayesian, and more — implemented against a single shared protocol and evaluated on the same dataset of sports gear. Same queries in, same ground truth out, NDCG and MRR decide.