Volver al Blog

Feedback-Based RAG: Self-Improving Retrieval with User Signals

1 de noviembre de 202513 min

Use explicit and implicit user feedback to rerank, retrain, and continuously improve RAG quality. Expect steady NDCG gains month over month.

TL;DR

Collect explicit (ratings) and implicit (clicks, dwell) signals, convert them to rewards, rerank in real time, and batch‑retrain weekly for durable gains.

What is Feedback-Based RAG?

A RAG system that learns from usage. User feedback updates retrieval scores/rerankers online and via periodic retraining.

Feedback Signals

  • Explicit: thumbs, stars, helpful
  • Implicit: clicks, dwell >30s, copy/share
  • Negative: immediate back/ refine query
  • Combined score: 0.6×explicit + 0.4×implicit

Update Mechanisms

  • Online: EMA update of retrieval scores (fast, noisy)
  • Offline: daily aggregation + weekly reranker retrain
  • Cadence: A/B test and roll back if metrics regress

Data Requirements

  • Log: user_id, query, doc_ids, scores, rating, timestamp
  • Thresholds: ≥100 feedback samples before retraining
  • Temporal decay: weight recent feedback higher (exp. decay)

Retrieval Configuration

  • Base: BM25 + Dense (e.g., Contriever)
  • Feedback layer: fine‑tuned reranker on feedback data
  • Cold start: content similarity until ≥50 ratings
  • Ensemble: 0.5×base + 0.3×feedback + 0.2×recency

Environment Variables

FEEDBACK_DB_URL=postgresql://… | MIN_FEEDBACK_COUNT=50 | DECAY_RATE=0.95 | UPDATE_SCHEDULE="0 2 * * *"

Guardrails

  • Outlier detection and velocity limits
  • Honeypots to catch bots; rate limiting
  • Temporal decay so recent feedback counts more

Cost Model (100K monthly queries)

  • Storage: ~100MB/month ≈ $0.01
  • Daily batch: ≈ $3/month
  • Weekly reranker retrain: ≈ $20/month
  • Total: ≈ $23/month

Benchmarks (Illustrative)

  • NDCG@10: +8–16% over 4–8 weeks
  • Coverage: % results with ratings ↑
  • Conversion metrics (positive action) ↑

NDCG Weekly Deltas (Example)

  • Week 0: 0.41
  • Week 1: 0.45 (+0.04)
  • Week 2: 0.48 (+0.03)
  • Week 4: 0.51 (+0.03)

Security & Privacy

  • Strip PII; store aggregates where possible
  • GDPR: support delete requests
  • Audit logs: retain original ratings separate from aggregates

Ablation (Illustrative)

  • Explicit only: +8.2% (low noise)
  • Implicit only: +5.1% (med noise)
  • Combined: +12.4% (med noise)
  • + Temporal decay: +14.7% (low noise)

Minimal n8n Workflow

Webhook → DB insert (feedback) → Daily aggregate → Weekly retrain reranker → Update vector metadata → A/B test.

Real‑World Fits

  • E‑commerce search: rank by purchases/engagement
  • Docs/support: surface articles that resolve issues
  • Academic: rank by saves/citations

Artículos relacionados

¿Te gustó este artículo?

Sígueme para más recursos sobre RAG y N8N workflows.

Contáctame