Alt6 Scoring Pipeline — Execution Sequence
Exact order of operations. Each numbered step completes before the next begins unless noted.
━━ PHASE A: SESSION-WIDE CACHES (startup only, locked after first run) ━━
A1. Compute altDecay raw values per source (needed by cache)
A2. buildGlobalPercentileCache()
Collect ALL sources (first 100 per question) across all questions.
Build sorted arrays of raw scores: cross, bm25, sem, decay, dr, altDecay1–5.
These arrays are the reference pool for all percentile computations.
→ LOCKED: globalPercentileCacheBuilt = true (never rebuilt)
A3. buildRelevanceDistributionCache()
For every source in the pool, compute Relevance via computeRelevance():
Relevance = wCross × pCross + wBM25 × pBM25 + wSem × pSem
Sort all Relevance values → used for RelevancePct percentile lookups.
→ LOCKED: relevanceDistributionCacheBuilt = true (never rebuilt)
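The percentile lookups that Phase A prepares can be sketched as follows. This is a minimal illustration, not the pipeline's code: the function names `build_sorted_pool` and `midrank_percentile` are hypothetical, and the 0 to 1 percentile scale is an assumption.

```python
from bisect import bisect_left, bisect_right

def build_sorted_pool(raw_scores):
    """Drop nulls and sort once; the result is treated as frozen afterwards."""
    return sorted(s for s in raw_scores if s is not None)

def midrank_percentile(sorted_pool, value):
    """Empirical CDF with midrank ties: (count below + half of count equal) / n."""
    if value is None or not sorted_pool:
        return None
    below = bisect_left(sorted_pool, value)
    equal = bisect_right(sorted_pool, value) - below
    return (below + 0.5 * equal) / len(sorted_pool)

# Example pool with one null score and one tie.
pool = build_sorted_pool([0.2, 0.5, 0.5, None, 0.9])
```

With this pool, a raw score of 0.5 has one value below it and two equal to it, so its midrank percentile is (1 + 1) / 4 = 0.5. Because the pool is sorted once and never rebuilt, later config changes cannot shift these percentiles.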
━━ PHASE B: PER-QUESTION SETUP (runs for each question) ━━
B1. getEffectiveDecayParams() → determine temporal subclass + cascade fallback
B2. checkTahRecencyOverride() → if TAH anchor/window is recent → use BRT or RWR decay
B3. extractTemporalCues() → extract target year(s) from question text / time window
B4. checkOpenEndedRange() → detect start-only ER → compute synthetic window end
B5. checkTahFallback() → no dates or end-only → route to fallback subtype
B5b. Reference-Aware Reroute (if mode enabled):
shouldRerouteToCrbn() → checks category, classification, anchor age, keywords
Applies to: EA, ER, Comparison, and Fallback (TahFbFresh/Topical only)
If triggered: → RefAware decay (no anchor/window, age-based decay t½=180d floor=0.70)
Investigative questions: only rerouted when anchor age > Inv threshold (default 100yr)
Per-tile override can force standard or force reroute regardless of detection.
B6. classifyRelationshipQuery() → detect relationship entities + patterns (once per question)
B7. Resolve effective subclass: override → reroute → fallback → cascade (in priority order)
B8. computeQRanksForQuestion() → rank each source by Relevance within this question
━━ PHASE C: PER-SOURCE SCORING (runs for each source within each question) ━━
C1. Relevance signal percentiles (from global cache A2):
pCross = getCrossPct(source) ← cross-encoder raw → global percentile
(falls back to semantic with penalty if cross is null)
pBM25 = getBm25Pct(source) ← BM25 raw → global percentile (0 if null)
pSem = getSemPct(source) ← semantic raw → global percentile (0 if null)
C2. Relevance = wCross × pCross + wBM25 × pBM25 + wSem × pSem
If pCross is null (no cross-encoder AND no semantic) → source excluded (null).
C3. RelevancePct = percentile_rank(Relevance) over session pool (from cache A3)
Empirical CDF with midrank tie handling. Frozen at startup values.
C4. Age & Decay:
ageInDays = (questionAskedAt − published_at) in days
DecayFactor = max(floor, e^(−ln2 × λ × ageInDays / halfLife))
Uses effective subclass curve from B7.
RefAware reroute: uses RefAware curve (default t½=180d, floor=0.70) — gentle freshness preference.
Standard TAH EA/ER: DecayFactor=1.0 (age decay disabled, anchor/window factors used instead).
C5. TAH Temporal Factors (at most one of AnchorFactor / WindowFactor is non-1.0 per question):
AnchorFactor = anchor-centered decay (EvtAnch only, else 1.0)
WindowFactor = window compliance (ExpRng only, else 1.0)
TemporalCompat = year-presence check (TAH only, else 1.0)
RefAware reroute: AnchorFactor=1.0, WindowFactor=1.0 (both disabled — relevance dominates).
C6. RelBoost = relationship evidence multiplier (1.00 or 1.13)
C6b. EntityPresence = boost/penalty based on question entity presence in source text (1.0 if disabled/no entities)
C7. ASSEMBLY:
Alt6 = RelevancePct × DecayFactor × AnchorFactor × WindowFactor × TemporalCompat × RelBoost × EntityPresence
C8. Store all computed values on source object (pCross, pBM25, pSem, relevance,
relevancePct, ageInDays, decayFactor, atFloor, relEvidence, temporalCompat, etc.)
━━ PHASE D: RANKING (per question, after all sources scored) ━━
D1. Sort all sources by Alt6 descending → assign altScore6_rank (1-based).
Sources with null Alt6 get null rank.
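The ranking step in D1 can be sketched as below. The dictionary key `alt6` and the tie behaviour (a stable sort that keeps input order) are assumptions made for illustration.

```python
def assign_ranks(sources):
    """Sort by Alt6 descending and assign 1-based ranks; null scores get a null rank."""
    scored = [s for s in sources if s.get("alt6") is not None]
    # Python's sort is stable, so tied scores keep their input order (assumption).
    scored.sort(key=lambda s: s["alt6"], reverse=True)
    for rank, s in enumerate(scored, start=1):
        s["altScore6_rank"] = rank
    for s in sources:
        if s.get("alt6") is None:
            s["altScore6_rank"] = None
    return sources

ranked = assign_ranks([
    {"id": "a", "alt6": 0.40},
    {"id": "b", "alt6": None},
    {"id": "c", "alt6": 0.75},
])
```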
━━ PHASE E: WEAK-CLUSTER RESCUE (per question, after D1 ranking) ━━
E1. Identify benchmark group = top-8 by Alt6 rank (from D1).
E2. Compute semantic ranks across ALL sources (by raw semantic, descending).
E3. Compute benchmark semantic stats: mean, P75, min, count.
E4. Stage 1 — Weak Cluster Detection:
For each non-benchmark source, check 3 gates:
Gate 1: semantic > benchP75 (safety net — rarely binding)
Gate 2: semantic > semFloor (safety net — 0.50 absolute floor)
Gate 3: semantic > benchMean + lift (primary gate — default 0.18)
If any non-benchmark source passes all three gates → weak cluster detected.
E5. Stage 2 — Candidate Qualification:
Stage 1 passers must also have semRankAll ≤ rankCap (default 5).
Top candidates (up to poolCap=4) by semantic → rescue pool.
E6. If rescue_active:
Pool = benchmark(8) + qualified candidates (up to 4) = up to 12 members.
Rescue rerank with semantic-led weights:
wcrAlt6 = (0.75×pSemLocal + 0.25×pBM25 + 0.00×pCross) × decay × anchor × window × compat × relBoost
Sort pool by wcrAlt6 → top sendCount (default 6) are "sent".
E7. OVERWRITE: Pool members' altScore6 replaced with wcrAlt6.
RERANK: Pool members get ranks 1..poolSize by wcrAlt6.
Non-pool members get ranks poolSize+1.. by original order.
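Steps E1 to E5 can be sketched as follows. This is an illustration under stated assumptions: the function name `detect_candidates` is hypothetical, the P75 interpolation method is a guess, and returning `no_weak_cluster` when the benchmark has too few non-null semantic values is an assumption (the source only states that a minimum of 4 is required).

```python
from statistics import mean, quantiles

def detect_candidates(sources, bench_size=8, lift=0.18, sem_floor=0.50,
                      rank_cap=5, pool_cap=4, min_bench=4):
    """sources: dicts in base Alt6 rank order, each with 'id' and 'semantic' (may be None)."""
    bench, buried = sources[:bench_size], sources[bench_size:]
    bench_sem = [s["semantic"] for s in bench if s["semantic"] is not None]
    if len(bench_sem) < min_bench:
        return "no_weak_cluster", []  # assumption: too few values means no rescue
    bench_mean = mean(bench_sem)
    bench_p75 = quantiles(bench_sem, n=4, method="inclusive")[2]

    # E2: semantic rank across ALL sources with a non-null semantic score.
    by_sem = sorted((s for s in sources if s["semantic"] is not None),
                    key=lambda s: s["semantic"], reverse=True)
    sem_rank = {s["id"]: i for i, s in enumerate(by_sem, start=1)}

    # E4: a buried source must clear all three gates.
    passers = [s for s in buried
               if s["semantic"] is not None
               and s["semantic"] > bench_p75
               and s["semantic"] > sem_floor
               and s["semantic"] > bench_mean + lift]
    if not passers:
        return "no_weak_cluster", []

    # E5: rank cap, then keep the top pool_cap candidates by semantic.
    qualified = [s for s in passers if sem_rank[s["id"]] <= rank_cap]
    if not qualified:
        return "weak_cluster_no_candidates", []
    qualified.sort(key=lambda s: s["semantic"], reverse=True)
    return "rescue_active", qualified[:pool_cap]

example = [{"id": f"b{i}", "semantic": 0.30} for i in range(8)]
example += [{"id": "x", "semantic": 0.90}, {"id": "y", "semantic": 0.40}]
state, candidates = detect_candidates(example)
```

In the example, source `x` clears every gate and ranks first by semantic score, so the state is `rescue_active`; source `y` fails the 0.50 absolute floor.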
━━ PHASE F: RECALCULATION (config change — differs from startup) ━━
Same as Phases B–E, BUT:
• Phase A caches are NOT rebuilt (locked at startup values).
• pCross, pBM25, pSem global percentiles do not change.
• RelevancePct distribution does not change.
• Only the weights (wCross/wBM25/wSem), decay params, and WCR config take effect.
• Temporal grid is NOT recomputed — only expanded source sections re-render.
Alt6 Composite Score Formula
Alt6 = RelevancePct × DecayFactor × AnchorFactor × WindowFactor × TemporalCompat × RelBoost × EntityPresence
At most one of AnchorFactor and WindowFactor is non-1.0 for a given question (they apply to different TAH subclasses). RelBoost applies only to relationship questions and takes one of two values (1.00 or 1.13). EntityPresence is 1.0 when disabled or when no entities are extracted.
Relevance (3-signal blend)
Relevance = wCross × pCross + wBM25 × pBM25 + wSem × pSem
- pCross = cross-encoder percentile (session-pool). Falls back to semantic with penalty if cross-encoder is null.
- pBM25 = BM25 keyword percentile (session-pool). 0 if null.
- pSem = semantic bi-encoder percentile (session-pool). 0 if null.
- Defaults: wCross=0.75, wBM25=0.125, wSem=0.125. Configurable in Global Controls (weights should sum to 1.0).
- RelevancePct = percentile rank of Relevance over the session pool (empirical CDF, cached at startup).
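A minimal sketch of the 3-signal blend, assuming the cross-encoder fallback is a multiplicative penalty on the semantic percentile. The parameter `sem_fallback_penalty` and its value 0.9 are hypothetical; the source does not state the penalty size.

```python
def compute_relevance(p_cross, p_bm25, p_sem,
                      w_cross=0.75, w_bm25=0.125, w_sem=0.125,
                      sem_fallback_penalty=0.9):
    """Blend three percentiles; return None when the source must be excluded."""
    if p_cross is None:
        if p_sem is None:
            return None  # no cross-encoder AND no semantic: source excluded
        # Assumption: fall back to the semantic percentile scaled by a penalty.
        p_cross = p_sem * sem_fallback_penalty
    # Null BM25 / semantic percentiles count as 0.
    return w_cross * p_cross + w_bm25 * (p_bm25 or 0.0) + w_sem * (p_sem or 0.0)

blend = compute_relevance(0.8, 0.4, 0.6)  # 0.75*0.8 + 0.125*0.4 + 0.125*0.6
```

With the default weights the example evaluates to 0.6 + 0.05 + 0.075 = 0.725, which shows how strongly the cross-encoder dominates the blend.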
Exponential Half-Life Decay (BRT, RWR, CRBN, UNKNW)
For standard recency-based subclasses, decay measures how old a source is relative to when the question was asked:
DecayFactor = max(floor, e^(−ln2 × age_days / halfLife))
- age_days = days between source publication and question ask time
- halfLife = the number of days at which the factor drops to exactly 0.50 (50%)
- floor = minimum factor — even very old sources get at least this weight
- Example: With halfLife=7d, floor=0.20: a 7-day-old source gets factor=0.50, a 14-day-old source gets factor=0.25, a 21-day-old source gets factor=0.20 (floor)
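The worked example above can be reproduced with a few lines. The function name `decay_factor` is illustrative; handling of negative ages (a source published after the question was asked) is not covered by the source and is left out here.

```python
import math

def decay_factor(age_days, half_life, floor):
    """Exponential half-life decay, clamped from below by the floor."""
    raw = math.exp(-math.log(2) * age_days / half_life)
    return max(floor, raw)

# halfLife=7d, floor=0.20, as in the example.
factors = [decay_factor(age, half_life=7, floor=0.20) for age in (7, 14, 21)]
```

At 21 days the unclamped value is 0.125, which is below the floor, so the factor is held at 0.20.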
EvtAnch/TAH — Anchor-Centered Scoring
For event-anchored questions (e.g., "What happened in the 2024 election?"), sources are scored by proximity to the event date, not by recency:
AnchorFactor = max(floor, e^(−ln2 × |pub_date − anchor_date| / halfLife))
- anchor_date = the event date (event_date preferred, fallback to tw_start)
- |pub_date - anchor_date| = distance in days between source and event (absolute value — before or after)
- DecayFactor is set to 1.0 (no recency decay) for TAH questions
- Example: With halfLife=10d, floor=0.30: a source published on the event date gets 1.0, 10 days away gets 0.50, 20 days away gets 0.30 (floor)
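A short sketch of anchor-centered scoring, showing the anchor fallback and the symmetry around the event date. The helper names `resolve_anchor` and `anchor_factor` are hypothetical, and the event date is only an example value.

```python
from datetime import date, timedelta

def resolve_anchor(event_date, tw_start):
    """event_date is preferred; tw_start is the fallback."""
    return event_date if event_date is not None else tw_start

def anchor_factor(pub_date, anchor_date, half_life, floor):
    """Decay by absolute calendar-day distance from the anchor, before or after."""
    distance_days = abs((pub_date - anchor_date).days)
    return max(floor, 0.5 ** (distance_days / half_life))

anchor = resolve_anchor(None, date(2024, 11, 5))  # falls back to tw_start
before = anchor_factor(anchor - timedelta(days=10), anchor, half_life=10, floor=0.30)
after = anchor_factor(anchor + timedelta(days=10), anchor, half_life=10, floor=0.30)
far = anchor_factor(anchor + timedelta(days=20), anchor, half_life=10, floor=0.30)
```

Sources 10 days before and 10 days after the anchor receive the same factor (0.50), and the 20-day source is held at the 0.30 floor.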
ExpRng/TAH — Window Compliance Scoring
For explicit-range questions (e.g., "African footballers August 2025"), sources are scored by whether they fall within the requested time window:
WindowFactor = 1.0 (if pub_date is within [tw_start, tw_end])
WindowFactor = max(floor, e^(−ln2 × d_boundary / halfLife)) (if outside)
- tw_start, tw_end = time window boundaries from classifier
- d_boundary = distance in days to the nearest window boundary (not the midpoint)
- Sources inside the window always get factor=1.0 — no penalty for position within the window
- Boundary-inclusive: sources published on tw_start or tw_end are considered in-window
- Position labels: IN = inside window, BEF = before window, AFT = after window, UNK = unknown (estimated date)
- Example: Window [2025-08-01 to 2025-08-31], halfLife=180d, floor=0.27: a source from Aug 15 gets 1.0 (in-window), one 180d before start gets ~0.50, one 2yr out gets ~0.27 (floor)
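Window compliance and the position labels can be sketched together. The function name `window_factor` and the tuple return shape are assumptions; the UNK label for estimated dates is not handled in this sketch.

```python
from datetime import date, timedelta

def window_factor(pub_date, tw_start, tw_end, half_life, floor):
    """Return (factor, position label). Boundaries are inclusive."""
    if tw_start <= pub_date <= tw_end:
        return 1.0, "IN"
    if pub_date < tw_start:
        d_boundary, label = (tw_start - pub_date).days, "BEF"
    else:
        d_boundary, label = (pub_date - tw_end).days, "AFT"
    # Distance is measured to the nearest boundary, not the window midpoint.
    return max(floor, 0.5 ** (d_boundary / half_life)), label

start, end = date(2025, 8, 1), date(2025, 8, 31)
inside = window_factor(date(2025, 8, 15), start, end, half_life=180, floor=0.27)
half = window_factor(start - timedelta(days=180), start, end, half_life=180, floor=0.27)
old = window_factor(end + timedelta(days=730), start, end, half_life=180, floor=0.27)
```

The three calls mirror the example: an in-window source, one exactly one half-life before the start, and one roughly two years past the end that lands on the floor.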
Estimated-Date Penalty (EvtAnch + ExpRng)
Some sources have no real published_at date. The backend falls back to article_inserted_at (the crawl/insert date) and marks published_at_estimated = true. These crawl dates have no meaningful temporal relationship to the content.
For TAH temporal scoring (EvtAnch and ExpRng), estimated-date sources still go through normal distance-based computation (using the crawl date), then receive a multiplicative uncertainty penalty on top:
Factor = normalFactor × (1.0 - estimatedDatePenalty)
- Default penalty: 0.20 (multiplier = ×0.80) — configurable per TAH type in the "Est. Penalty" fields above
- Tooltips show × EstPenalty(0.80) alongside distance-based factors
- ExpRng position badge: UNK instead of IN/BEF/AFT
- Audit panes track unknown-date sources as a separate bucket — the "Unknown" position in ExpRng charts and "A6 Unk" / "P Unk" columns in both audit grids
- Anchor distances and boundary distances exclude estimated-date sources (they would be meaningless)
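The penalty formula above amounts to one multiplication applied after the normal anchor or window factor. In this sketch the function name is hypothetical, and whether the floor is re-applied after the penalty is not stated in the source, so it is not re-applied here.

```python
def apply_estimated_penalty(normal_factor, published_at_estimated,
                            estimated_date_penalty=0.20):
    """Scale the distance-based factor when the publication date is only estimated."""
    if not published_at_estimated:
        return normal_factor
    return normal_factor * (1.0 - estimated_date_penalty)

real_date = apply_estimated_penalty(0.50, published_at_estimated=False)
crawl_date = apply_estimated_penalty(0.50, published_at_estimated=True)
```

A source with a distance-based factor of 0.50 keeps that value when its date is real and drops to 0.40 when the date is a crawl-date estimate.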
TAH Recency Override (BRT/RWR)
When a TAH question's anchor/window is very recent, BRT or RWR decay is used instead of anchor/window scoring:
- For EvtAnch: recency measured from anchor date (tw_start) to ask time
- For ExpRng: recency measured from window end date (tw_end) to ask time
- If recency ≤ BRT age-to-floor → BRT override (labels: EA→BRT, ER→BRT)
- If recency ≤ RWR age-to-floor → RWR override (labels: EA→RWR, ER→RWR)
- Override disables anchor/window factor — standard recency decay applies instead
- Thresholds derived from BRT/RWR config (change half-life/floor → thresholds update automatically)
- Purpose: protect recent-news questions while allowing more forgiving TAH baseline parameters for genuinely historical events
- Shown as blue badges in the UI
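One plausible reading of "age-to-floor" is the age at which the exponential curve first reaches its floor, which follows from solving floor = 2^(−age / halfLife) for age. The sketch below uses that reading; it is an assumption, as are the function names and the BRT/RWR parameter values, which are placeholders rather than real defaults.

```python
import math

def age_to_floor(half_life, floor):
    """Age in days at which 2^(-age/half_life) equals the floor."""
    return half_life * math.log2(1.0 / floor)

def recency_override(recency_days, brt, rwr):
    """Return 'BRT', 'RWR', or None; BRT is checked first (assumed shorter threshold)."""
    if recency_days <= age_to_floor(**brt):
        return "BRT"
    if recency_days <= age_to_floor(**rwr):
        return "RWR"
    return None

# Placeholder configs, not the pipeline's defaults.
brt_cfg = {"half_life": 7, "floor": 0.25}   # threshold: 14 days
rwr_cfg = {"half_life": 30, "floor": 0.25}  # threshold: 60 days
```

Because the thresholds are computed from half-life and floor, changing either config value moves the override boundary without a separate setting.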
Relationship Boost v2 (All Subclasses)
When a question asks about the relationship between entities (influence, involvement, evolution, comparison, causality), sources are scored based on evidence of that relationship:
- Entity extraction (v2 — precision-first): Three extraction sources combined: (1) comparison templates ("X vs Y", "between X and Y"), (2) proper-noun phrases (up to 6 raw candidates with mixed-span splitting), (3) alias/gazetteer scan (~100-entry map of teams, universities, events, agencies). Candidates validated against reject patterns (durations, verb phrases, fragments). At most 2 entities selected.
- Alias canonicalization: Known aliases (e.g., "Pats" → "New England Patriots", "Oscars" → "Oscars") resolved via longest-prefix match. Alias hits prioritized during entity selection and always pass validation.
- Best-pair matching: Evaluates all entity pairs per source — picks the pair with best co-occurrence + relationship evidence. Alias-aware matching: checks all aliases resolving to the same canonical name. Tail-word matching for 3+ word entities.
- Scoring — neutral/boost only: Co-mention-only → neutral (1.00). Both entities + any relationship evidence term → boost (1.13). No penalty tier, no compatibility matrix, no weak/strong distinction.
- Only activates for questions with 1+ extractable entity and a detected relationship pattern
- Diagnostic: diagnoseRelationshipBoost() in browser console — reports extraction modes (pair/single/none), boost/neutral tier counts, top-8 impact
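The neutral/boost rule for the pair case can be sketched as below. This deliberately ignores alias resolution, tail-word matching, and single-entity mode; the plain substring matching and the evidence term list are assumptions for illustration only.

```python
def relationship_boost(source_text, entity_pair, evidence_terms, boost=1.13):
    """Boost only when both entities AND at least one evidence term are present."""
    text = source_text.lower()
    both_present = all(entity.lower() in text for entity in entity_pair)
    has_evidence = any(term.lower() in text for term in evidence_terms)
    return boost if (both_present and has_evidence) else 1.0

pair = ("Apple", "Samsung")
terms = ["influenced", "partnership"]  # hypothetical evidence terms
boosted = relationship_boost("How Apple influenced Samsung's design", pair, terms)
co_mention = relationship_boost("Apple and Samsung both released phones", pair, terms)
```

The second call is a co-mention without relationship evidence, so it stays neutral at 1.00; there is no penalty tier.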
Weak-Cluster Semantic Rescue (WCR)
Two-stage rescue system that detects when the benchmark top-8 contains a semantically weak cluster, then promotes high-semantic buried candidates via rescue-state reranking.
- Stage 1 — Weak Cluster Detection (2 active gates + safety nets): For each source NOT in the benchmark top-8, checks: (1) semantic > benchmark mean + lift (default 0.18 — primary gate), (2) semantic rank across all sources ≤ rank cap (default 5). Two hardcoded safety nets (P75 gate at 75th percentile, absolute floor at 0.50) provide backstop filtering but are set permissively and rarely bind. All checks must pass. If any buried source passes → weak cluster detected.
- Stage 2 — Rescue Pool & Reranking: Rescue pool = benchmark top-8 + qualified candidates (up to pool cap, default 4). Pool members are re-scored with semantic-led weights: wcrWtSemantic × pSemanticQueryLocal + wcrWtBM25 × pBM25 + wcrWtCross × pCross (defaults: 0.75/0.25/0.00). Top sources by rescue score become the new ranking.
- Three outcomes: no_weak_cluster (no buried source passes gates), weak_cluster_no_candidates (gates passed but rank cap filtered all), rescue_active (pool formed and reranked)
- Integrated ranking: When rescue is active, pool members' altScore6 values are overwritten with rescue scores. Non-pool sources keep their base scores. This is the default view — a "base view" toggle is available for diagnostics.
- Send count: Rescue-active questions send top-6 (configurable) to LLM instead of top-8
- Null-safe: Sources with null semantic are excluded from benchmark stats computation (mean, P75) and never qualify as candidates. Requires minimum 4 non-null semantic values in benchmark (internal threshold).
- Diagnostics: CSV columns wcr_state, wcr_in_benchmark, wcr_weak_cluster, wcr_is_candidate, wcr_in_pool, wcr_pool_rank, wcr_sent_to_llm, wcr_gate_result, wcr_semantic, wcr_sem_rank_all, wcr_bench_mean_sem, wcr_bench_p75_sem, wcr_bench_sem_count, wcr_sem_minus_mean, wcr_sem_minus_p75, wcr_rel_pct, wcr_alt6, wcr_base_alt6, wcr_base_rank
- Toggle on/off via the "Weak-Cluster Rescue" panel in the Alt6 config section. Default: ON (as of 2026-03-20)
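The rescore, overwrite, and rerank behaviour when rescue is active can be sketched as follows. The field names `pSemLocal` and `factors` (the product of the non-relevance multipliers) and the function name `rescue_rerank` are assumptions made for this illustration.

```python
def rescue_rerank(sources, pool_ids, w_sem=0.75, w_bm25=0.25, w_cross=0.0,
                  send_count=6):
    """sources: dicts in base rank order. Pool members are rescored and ranked first."""
    pool = [s for s in sources if s["id"] in pool_ids]
    rest = [s for s in sources if s["id"] not in pool_ids]  # keep original order
    for s in pool:
        blend = (w_sem * (s.get("pSemLocal") or 0.0)
                 + w_bm25 * (s.get("pBM25") or 0.0)
                 + w_cross * (s.get("pCross") or 0.0))
        s["altScore6"] = blend * s["factors"]  # overwrite the base score
    pool.sort(key=lambda s: s["altScore6"], reverse=True)
    ordered = pool + rest  # pool gets ranks 1..poolSize, the rest follow
    for rank, s in enumerate(ordered, start=1):
        s["altScore6_rank"] = rank
    sent = [s["id"] for s in pool[:send_count]]
    return ordered, sent

demo = [
    {"id": "a", "pSemLocal": 0.2, "pBM25": 0.4, "factors": 1.0, "altScore6": 0.9},
    {"id": "b", "pSemLocal": 0.5, "pBM25": 0.5, "factors": 1.0, "altScore6": 0.8},
    {"id": "c", "pSemLocal": 0.9, "pBM25": 0.2, "factors": 1.0, "altScore6": 0.1},
]
ordered, sent = rescue_rerank(demo, pool_ids={"a", "c"}, send_count=1)
```

Here `c` overtakes `a` inside the pool because the rescue weights favour the semantic signal, while `b` keeps its base score and is ranked after the pool.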
Key Concepts
- Half-life controls decay speed: larger = slower decay, smaller = faster. At exactly one half-life distance, the factor equals 0.50.
- Floor prevents total suppression: even sources far from the target time contribute at minimum this weight.
- Estimated-Date Penalty handles sources with no real publication date: they remain eligible, and a mild fixed multiplicative penalty is applied on top of the distance-based factor computed from the crawl date.
- Temporal Compatibility (TAH only) — checks whether target year(s) appear in article title/description. Year source priority: time window (tw_start/tw_end) is primary; query text regex is fallback. Cross-year windows (e.g. 2019–2020) produce multiple valid target years. Three tiers: any target year found → Match Boost (default 1.15); different year found → Mismatch Penalty (default 0.80); no year → 1.0 neutral. Only applies to TAH subclasses (event_anchored, explicit_range, comparison) — BRT/RWR/CRBN already have recency decay. Both values tunable in config row above.
- TAH Fallback Subtypes — triggers when no dates are extracted (both tw_start and tw_end null), or when only tw_end is present (end-only ER). Classified into 4 subtypes with subtype-specific decay: FRESH (t½=18d), TOPICAL (t½=50d), EVERGREEN (t½=90d), COMPARATIVE (t½=120d). Labels combine prefix + subtype: EA→FB/Fresh, ER→FB/Topical, etc. Each appears as its own row in the Temporal Subclass Breakdown stats table.
- Synthetic Window / Since — for start-only ER questions (tw_start present, tw_end missing). Two modes:
- SynWin — synthetic end date at tw_start + (today − tw_start) × windowPct. Default 20%. Configurable. Label: ER→SynWin.
- SinceNow — if question contains "since" + a year/date, window extends to today. Label: ER→SinceNow.
In-window = 1.0, outside-window penalty applies. Shown as indigo badge.
- All date comparisons use calendar-day granularity (hour/minute differences are ignored).
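Two of the concepts above, the Temporal Compatibility tiers and the SynWin end date, lend themselves to a short sketch. The year regex (four-digit years starting with 19 or 20) and the rounding of the synthetic end date to a whole day are assumptions; both function names are hypothetical.

```python
import re
from datetime import date, timedelta

def temporal_compat(text, target_years, match_boost=1.15, mismatch_penalty=0.80):
    """Three tiers: target year found, a different year found, or no year at all."""
    years_in_text = {int(y) for y in re.findall(r"\b(?:19|20)\d{2}\b", text)}
    if not years_in_text:
        return 1.0
    if years_in_text & set(target_years):
        return match_boost
    return mismatch_penalty

def synthetic_window_end(tw_start, today, window_pct=0.20):
    """SynWin: end = tw_start + (today - tw_start) * windowPct, in whole days."""
    span_days = (today - tw_start).days
    return tw_start + timedelta(days=round(span_days * window_pct))

# A 100-day span with the default 20% gives a 20-day synthetic window.
syn_end = synthetic_window_end(date(2025, 1, 1), date(2025, 4, 11))
```

A cross-year window such as 2019–2020 would pass `target_years={2019, 2020}`, so a title mentioning either year reaches the match tier.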
Entity Presence Boost (All Subclasses)
When a question contains extractable person names (full names like "Trey Anastasio"), sources are scored based on whether those entities appear in the source text:
- Entity extraction: Sequences of 2+ capitalized words extracted from question text. Common stop words excluded. Full name match required (not partial).
- Boost tiers: All entities found in title → ×1.20, in description → ×1.12, in article content → ×1.10.
- Graduated penalties by entity count:
- 1 entity, none found → ×0.60
- 2 entities, 1 of 2 found → ×0.90, none found → ×0.50
- 3+ entities, majority found → ×0.95, minority → ×0.70, none → ×0.40
- Partial matches: Some but not all entities found → graduated neutral/mild penalty.
- Content check: Title/description checked first (fast). If no match found, article content fetched asynchronously from article_contents for degraded conditions. In prod, content check runs synchronously for all sources.
- Toggleable: Can be enabled/disabled via the Entity Presence Boost config checkbox. Applies to both standard Alt6 and Alt6-R (WCR rescue) scoring.
- Formula integration: Alt6 = RelevancePct × DecayFactor × ... × EntityPresence
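The boost and penalty tiers listed above can be collected into one lookup. The function name, the `best_location` argument, and the treatment of an exact half (for example 2 of 4 entities found) as "minority" are assumptions; the source does not define that boundary.

```python
def entity_presence(n_entities, n_found, best_location=None):
    """Multiplier from entity counts; best_location is where ALL entities were found."""
    if n_entities == 0:
        return 1.0  # nothing extracted: neutral
    if n_found == n_entities:
        return {"title": 1.20, "description": 1.12, "content": 1.10}[best_location]
    if n_entities == 1:
        return 0.60
    if n_entities == 2:
        return 0.90 if n_found == 1 else 0.50
    # 3+ entities
    if n_found == 0:
        return 0.40
    # Assumption: "majority" means strictly more than half.
    return 0.95 if n_found * 2 > n_entities else 0.70

tiers = [entity_presence(3, k) for k in (2, 1, 0)]
```

For a three-entity question the partial tiers come out as 0.95, 0.70, and 0.40, which shows how the penalty steepens as fewer entities are found.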
Complete Scoring Formula
Alt6 = RelevancePct × DecayFactor × AnchorFactor × WindowFactor × TemporalCompat × RelBoost × EntityPresence
Where exactly one temporal mechanism is active (Decay, Anchor, or Window), and EntityPresence is the entity name-match multiplier (1.0 when disabled or no entities extracted).
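The full assembly is a plain product, which the following sketch makes explicit. The function name `alt6_score` and the keyword defaults of 1.0 (meaning "factor inactive") are illustrative choices.

```python
def alt6_score(relevance_pct, decay=1.0, anchor=1.0, window=1.0,
               temporal_compat=1.0, rel_boost=1.0, entity_presence=1.0):
    """Multiply all factors; a null RelevancePct yields a null score."""
    if relevance_pct is None:
        return None
    return (relevance_pct * decay * anchor * window
            * temporal_compat * rel_boost * entity_presence)

# Event-anchored example: decay disabled, anchor active, year match, relationship boost.
score = alt6_score(0.80, anchor=0.50, temporal_compat=1.15, rel_boost=1.13)
```

Since every factor multiplies the rest, a single low factor (for example an anchor factor at its floor) scales the whole score, while inactive factors at 1.0 leave it unchanged.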