Feedback-Based RAG: Self-Improving Retrieval with User Signals
1 de noviembre de 202513 min
Use explicit and implicit user feedback to rerank, retrain, and continuously improve RAG quality. Expect steady NDCG gains month over month.
TL;DR
Collect explicit (ratings) and implicit (clicks, dwell) signals, convert them to rewards, rerank in real time, and batch‑retrain weekly for durable gains.
What is Feedback-Based RAG?
A RAG system that learns from usage. User feedback updates retrieval scores/rerankers online and via periodic retraining.
Feedback Signals
- Explicit: thumbs, stars, helpful
- Implicit: clicks, dwell >30s, copy/share
- Negative: immediate back/ refine query
- Combined score: 0.6×explicit + 0.4×implicit
Update Mechanisms
- Online: EMA update of retrieval scores (fast, noisy)
- Offline: daily aggregation + weekly reranker retrain
- Cadence: A/B test and roll back if metrics regress
Data Requirements
- Log: user_id, query, doc_ids, scores, rating, timestamp
- Thresholds: ≥100 feedback samples before retraining
- Temporal decay: weight recent feedback higher (exp. decay)
Retrieval Configuration
- Base: BM25 + Dense (e.g., Contriever)
- Feedback layer: fine‑tuned reranker on feedback data
- Cold start: content similarity until ≥50 ratings
- Ensemble: 0.5×base + 0.3×feedback + 0.2×recency
Environment Variables
FEEDBACK_DB_URL=postgresql://… | MIN_FEEDBACK_COUNT=50 | DECAY_RATE=0.95 | UPDATE_SCHEDULE="0 2 * * *"
Guardrails
- Outlier detection and velocity limits
- Honeypots to catch bots; rate limiting
- Temporal decay so recent feedback counts more
Cost Model (100K monthly queries)
- Storage: ~100MB/month ≈ $0.01
- Daily batch: ≈ $3/month
- Weekly reranker retrain: ≈ $20/month
- Total: ≈ $23/month
Benchmarks (Illustrative)
- NDCG@10: +8–16% over 4–8 weeks
- Coverage: % results with ratings ↑
- Conversion metrics (positive action) ↑
NDCG Weekly Deltas (Example)
- Week 0: 0.41
- Week 1: 0.45 (+0.04)
- Week 2: 0.48 (+0.03)
- Week 4: 0.51 (+0.03)
Security & Privacy
- Strip PII; store aggregates where possible
- GDPR: support delete requests
- Audit logs: retain original ratings separate from aggregates
Ablation (Illustrative)
- Explicit only: +8.2% (low noise)
- Implicit only: +5.1% (med noise)
- Combined: +12.4% (med noise)
- + Temporal decay: +14.7% (low noise)
Minimal n8n Workflow
Webhook → DB insert (feedback) → Daily aggregate → Weekly retrain reranker → Update vector metadata → A/B test.
Real‑World Fits
- E‑commerce search: rank by purchases/engagement
- Docs/support: surface articles that resolve issues
- Academic: rank by saves/citations
Artículos relacionados
¿Te gustó este artículo?
Sígueme para más recursos sobre RAG y N8N workflows.
Contáctame