Killed Signals

Hypotheses the Observatory tested and formally retired.

A research programme that never kills anything is not doing science. It is collecting confirmations.

The Observatory has formally evaluated over 400 independent hypotheses. Of these, over 120 have been permanently retired: killed by a failed primary test, by a fatal confound, or by a formal multi-test battery that caught what single-pass validation missed. What follows is a selection of the more scientifically instructive kills.

Not all kills are equal. The cases here are tagged by failure mode:

  • PURE NOISE — no signal at any threshold, mechanism physically incoherent
  • DIRECTIONAL INVERSION — real signal, but the effect runs opposite to the original hypothesis
  • CONFOUNDED — real correlation, but explained by a third variable, not the proposed driver
  • ADAPTIVE RESPONSE — mechanism real, market or biological system neutralised the effect
  • ERA-SPECIFIC — held in one historical regime, did not generalise
  • UNDERPOWERED — plausible direction, insufficient n to confirm

The kill rate across all evaluated hypotheses is approximately 30%. The trading-signal battery applied in April 2026 produced a higher rate — 44% — because it applies a stricter multi-test standard than single-pass confirmation. We consider both rates a quality signal, not a problem.

A selection note: the cases shown here are not the easiest kills. We highlight the hard cases — plausible mechanisms, coherent hypotheses, null results — because they better illustrate what the process is actually doing.


Geomagnetic activity → financial market volatility CONFOUNDED

Hypothesis: Elevated geomagnetic storm activity (Kp index, Dst index) drives measurable increases in equity market volatility (VIX) through physiological stress pathways in traders.

Why it seemed plausible: Laboratory studies show geomagnetic storms affect melatonin secretion and autonomic nervous system function. If this influences decision-making under stress, a market signal might follow.

What the tests showed:

  • Dst → VIX: r = 0.024, p = 0.62. No relationship.
  • Bz component → VIX: r = 0.025, p = 0.61. No relationship.
  • Solar wind velocity (Vsw) partially mediates Kp → VIX, but the effect is temporally unstable and the solar/business-cycle confound cannot be separated.
  • Five independent tests (Dst, Kp, Bz, solar wind, sunspot count) all returned null against financial outcomes across monthly data from 1990 to 2023.
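As a rough cross-check on correlations this small, the p-value can be approximated from r and the sample size alone. The sketch below uses the Fisher z-transformation and assumes n = 408, that is, one observation per month from 1990 to 2023. That sample size is an assumption, since the exact n is not stated here, and the function name is illustrative rather than part of any Observatory tooling.

```python
import math

def fisher_p_value(r: float, n: int) -> float:
    """Two-sided p-value for H0: rho = 0, using the Fisher z approximation."""
    z = math.atanh(r) * math.sqrt(n - 3)
    # erfc(|z| / sqrt(2)) is the two-sided tail probability of a standard normal.
    return math.erfc(abs(z) / math.sqrt(2))

# Assumption: 34 years x 12 months = 408 monthly observations.
N_MONTHS = 408

p_dst = fisher_p_value(0.024, N_MONTHS)  # Dst vs VIX
p_bz = fisher_p_value(0.025, N_MONTHS)   # Bz vs VIX

print(f"Dst vs VIX: p ~ {p_dst:.2f}")
print(f"Bz  vs VIX: p ~ {p_bz:.2f}")
```

Under that assumption both p-values land close to 0.6, in line with the reported figures: a correlation of about 0.02 on roughly four hundred points is nowhere near significance.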

A note on the literature: Some published papers - including a widely cited 2003 study - report geomagnetic effects on equity returns. The Observatory’s verdict diverges from these because our tests controlled explicitly for the solar/business-cycle confound (geomagnetic activity and economic volatility share a common driver in the 11-year solar cycle) and applied surrogate significance testing. The positive papers did not. We are not dismissing that literature; we are reporting what happens when you add those controls. The effect disappears.

Verdict: KILLED. Where geomagnetic activity has a confirmed effect, the mechanism operates through physical climate systems (cloud cover, temperature, precipitation). It does not operate through trader psychology. The EM-to-financial pathway is one of the clearest cognitive-bias artifacts in the Observatory: humans intuitively reach for a “cosmic forces affect human behaviour” explanation. The data consistently refuse it.

What survives: The physical climate pathway (EM → cloud → agriculture) is confirmed. The psychological market pathway is not.


Moon phase → equity returns PURE NOISE

Hypothesis: Lunar phase cycles produce statistically significant patterns in equity market returns, mediated by circadian rhythm disruption or investor sentiment effects.

Why it seemed plausible: Numerous academic papers claimed lunar effects on stock markets. The mechanism had biological plausibility (melatonin, sleep quality). The 29.5-day cycle is short enough to accumulate many observations.

What the tests showed:

  • Surrogate control p-value: 0.531. The apparent lunar signal is indistinguishable from randomised surrogate series with the same autocorrelation structure.
  • The published academic results failed to survive Bonferroni correction for the number of markets and time windows tested.
  • No mechanism connecting lunar phase to investor cognition survived scrutiny: modern indoor lighting decouples melatonin production from actual moonlight.
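The Bonferroni point can be made concrete with a small sketch. The p-values below are hypothetical, chosen only to show how a handful of nominally significant results across many markets and windows disappear once the threshold is divided by the number of tests. They are not taken from the published lunar papers, and the function names are illustrative.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests

def survives_bonferroni(p_values, alpha=0.05):
    """Flag which p-values remain significant at the corrected threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p <= threshold for p in p_values]

# Hypothetical p-values for ten market/time-window combinations.
hypothetical_p = [0.012, 0.030, 0.048, 0.20, 0.41, 0.55, 0.63, 0.72, 0.88, 0.95]

nominal_hits = sum(p <= 0.05 for p in hypothetical_p)
corrected_hits = sum(survives_bonferroni(hypothetical_p))

print(f"nominally significant: {nominal_hits}")
print(f"significant after correction: {corrected_hits}")
```

With ten tests the corrected threshold is 0.005, so three results that look significant at 0.05 leave none standing.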

Verdict: KILLED. The published literature on lunar-equity effects is a multiple comparisons artifact. The surrogate control is the decisive test.


Pollinator decline → crop yield reduction ADAPTIVE RESPONSE

Hypothesis: Documented declines in wild pollinator populations produce measurable crop yield reductions across insect-pollinated commodity crops.

Why it seemed plausible: The biological mechanism is unambiguous - roughly 75% of leading food crops depend at least in part on animal pollination. Wild bee decline is well-documented. The causal pathway from pollinator loss to yield reduction should be detectable.

What the tests showed:

  • Managed honeybee colony numbers have increased 45% globally since 1961, directly substituting for wild pollinator decline.
  • Crop yield data shows no detectable signal attributable to pollinator decline after controlling for managed hive availability.
  • The mechanism is real; the market signal is neutralised by commercial beekeeping’s adaptive response.
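How the control for managed hive availability was implemented is not stated here. One common option is a first-order partial correlation, sketched below. The three input correlations are purely illustrative numbers, not Observatory data, and the variable roles (x = wild pollinator abundance, y = crop yield, z = managed hive availability) are assumptions made for the example.

```python
import math

def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """Correlation between x and y after removing the linear effect of z."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator

# Hypothetical raw correlations, for illustration only.
r_xy = 0.30  # x vs y
r_xz = 0.60  # x vs z
r_yz = 0.50  # y vs z

r_xy_given_z = partial_correlation(r_xy, r_xz, r_yz)
print(f"raw r = {r_xy:.2f}, partial r given z = {r_xy_given_z:.2f}")
```

In this constructed case the raw correlation of 0.30 is fully accounted for by z, so the partial correlation is zero. That is the shape of result described above: a real association that leaves no residual signal once the third variable is held fixed.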

Verdict: KILLED as a commodity signal. The ecology is correct; the economics neutralises it. This is a useful lesson in the difference between a confirmed biological mechanism and a tradeable signal: the market can adapt faster than the mechanism propagates.


Schumann resonance → biological effects PURE NOISE

Hypothesis: Earth’s Schumann resonance frequencies (7.83 Hz fundamental) affect human neurological function through electromagnetic coupling, producing measurable physiological or cognitive effects.

Why it seemed plausible: The Schumann resonance frequency overlaps with the human alpha brain wave range. Some researchers proposed resonant entrainment as a mechanism.

What the tests showed:

  • Atmospheric Schumann fields at ground level are approximately 0.3 picotesla. The Earth’s static geomagnetic field is approximately 50,000,000 picotesla - a ratio of 1 to 167 million.
  • No biophysical mechanism can explain selective coupling to the Schumann frequency at this field strength when the static field is eight orders of magnitude stronger.
  • All published positive findings failed to survive blinded replication under controlled shielding conditions.
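The field-strength comparison is simple arithmetic and can be checked directly. The sketch below uses only the two figures quoted above; the variable names are illustrative.

```python
import math

SCHUMANN_FIELD_PT = 0.3        # ground-level Schumann field, picotesla (from the text)
STATIC_FIELD_PT = 50_000_000   # Earth's static field, picotesla (about 50 microtesla)

ratio = STATIC_FIELD_PT / SCHUMANN_FIELD_PT
orders_of_magnitude = math.log10(ratio)

print(f"static / Schumann ratio: {ratio:.3g}")
print(f"orders of magnitude: {orders_of_magnitude:.2f}")
```

The ratio comes out at roughly 1.67 x 10^8, a little over eight orders of magnitude, which matches the "1 to 167 million" figure.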

Verdict: KILLED on mechanism grounds. The field strength is physically insufficient by many orders of magnitude. This is a case where the mechanism review (Validation Dimension 8) is decisive before statistical testing is even necessary.


Gleissberg solar cycle → epidemic periodicity CONFOUNDED

Hypothesis: The 87-year Gleissberg solar cycle modulates epidemic outbreak frequency in Chinese historical records through a solar → climate → immune pathway.

Why it seemed plausible: Chinese dynastic records contain unusually detailed epidemic documentation spanning multiple centuries. The Gleissberg cycle has documented climate effects. An immune system link had theoretical support.

What the tests showed:

  • Spectral analysis of the Chinese epidemic record returned no significant Gleissberg periodicity after controlling for population density and documentation intensity biases.
  • The apparent periodicity dissolved under surrogate testing.
  • The documentation intensity bias is severe: epidemic recording correlates strongly with dynastic administrative capacity, not epidemic frequency.

Verdict: KILLED (NULL). The source data has a systematic bias that mimics cycles. Any periodicity in the record reflects the cycles of Chinese bureaucratic capacity as much as epidemic biology.


What the kill list tells you

These five kills share a common structure: plausible mechanism, coherent hypothesis, null result. In each case, something specific ended the inquiry - a surrogate control, a field strength calculation, an adaptive market response, a data provenance problem.

The Observatory’s 8-step validation framework is designed to surface these failure modes before publication, not after. The devil’s advocate pass (Dimension 6) and the surrogate significance test (Dimension 4) between them account for four of the five kills above.

One thing this list cannot show: how we would have reported a case where the test returned a weak positive and we chose to proceed anyway. The kill list documents the clear failures. The harder editorial judgement - what to do with an r = 0.18, p = 0.03, surrogate-passing but small-effect signal - is not captured here. That is where the effect size weighting and consilience upgrade requirements do most of their work, and where the remaining epistemic risk lives.

The confirmed signals survived this process. The killed signals did not. Both facts matter.


April 2026 battery retest - trading signals

In April 2026 we applied a formal five-test finding-validator battery to every active trading signal on the public dashboard. The battery comprises a Monte Carlo null, blind replication with an era split, specificity against negative controls, and tolerance sensitivity. Of sixteen signals tested, seven failed all four pass-fail criteria. Three were directional inversions, where the observed effect ran opposite to the published claim. We document them here in the same spirit as the five cases above.

COT commercial extreme — corn, wheat, crude oil PURE NOISE · EUR/USD DIRECTIONAL INVERSION

Hypothesis: Extreme commercial hedger short positioning (z-score below minus two) signals contrarian buy opportunities.

Why it seemed plausible: Commercial hedgers are producers and end-users; when they are maximally short, the market prices in the most bearish scenario physical participants expect. Contrarian trades at these extremes have been reported as profitable in the academic literature.

What the tests showed:

  • Corn: 63.6% hit vs 53.1% null, specificity 1.2x, held-out z = 0.06.
  • Wheat: 42.9% hit vs 46.7% null - worse than random. Post-2010 held-out: 20% on five events.
  • Crude oil: 66.7% hit but null is 60.2% (oil has been in a secular bull phase). Specificity 1.11x.
  • EUR/USD: Directional kill. At extreme speculator-long positioning, EUR/USD rises rather than falls 73% of the time at the 8-week horizon.
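The specificity figures above are consistent with a simple ratio of the signal's hit rate to the null hit rate. That reading is an inference from the numbers, since the definition is not spelled out here, and the sketch below should be taken as an assumption rather than the battery's exact formula.

```python
# Hit rate and null hit rate in per cent, as quoted in the bullets above.
results = {
    "corn": (63.6, 53.1),
    "wheat": (42.9, 46.7),
    "crude oil": (66.7, 60.2),
}

def specificity(hit_pct: float, null_pct: float) -> float:
    """Assumed definition: signal hit rate divided by the null hit rate."""
    return hit_pct / null_pct

for market, (hit, null) in results.items():
    ratio = specificity(hit, null)
    edge = hit - null  # percentage points above or below the null
    print(f"{market}: specificity {ratio:.2f}x, edge {edge:+.1f} pts")
```

On this definition corn and crude oil reproduce the quoted 1.2x and 1.11x, and wheat falls below 1.0, which is what "worse than random" means in ratio terms. The crude oil case shows why the null matters: a 66.7% hit rate looks strong until it is set against a 60.2% baseline.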

Verdict: KILLED for corn, wheat, crude. EUR/USD directionally inverted. COT positioning as a contrarian signal survives only in soybeans (88% hit at z below minus two, but n = 17 over 39 years, 5 post-2010). We continue to track soybeans; the others leave the dashboard.

Sahm Rule → commodity demand destruction DIRECTIONAL INVERSION

Hypothesis: When the Sahm Rule triggers, commodity demand destruction follows and commodity prices decline over six to twelve months.

Why it seemed plausible: The Sahm Rule is a reliable recession indicator; recessions reduce industrial activity which drives commodity demand.

What the tests showed:

  • At Sahm >= 0.5: observed +1.1% vs null +2.1%, z-direction minus 0.5. No signal.
  • At Sahm >= 1.0: observed +7.2%, z in hypothesis direction minus 2.03.
  • At Sahm >= 1.5: observed +20.0% at 6m, z in hypothesis direction minus 5.61.

The more extreme the recession signal, the larger the subsequent commodity rally. The 2008 and 2020 Sahm triggers both produced massive commodity rallies - China stimulus after 2008, Federal Reserve quantitative easing and reflation after 2020.

Verdict: DIRECTIONALLY INVERTED. The published claim is directionally backwards. The real pattern appears to be that extreme recession signals precede policy response which precedes a commodity-friendly reflationary regime. Whether a reframed version of the claim survives its own devil’s advocate pass is a separate question we have not completed.

SIPRI military spending → commodity bull phases DIRECTIONAL INVERSION

Hypothesis: Global military spending surges (year-over-year growth above five per cent in real terms) lead commodity price booms by two to four years.

Why it seemed plausible: World wars produced commodity booms; Cold War build-ups drove sustained metals demand. NATO commitments of 2023 and 2024 looked like the start of another cycle.

What the tests showed:

  • Milex surge years (seventeen events, 1950-2024): +6.64% commodity returns at two-year lag.
  • Background: +6.35%.
  • Milex decline years: +8.05%.

The military surge effect is indistinguishable from baseline. Military declines actually precede higher commodity returns. The hypothesis does not survive at any tolerance threshold.

Verdict: KILLED. Historical wartime booms were driven by specific supply disruptions (oilfields occupied, minerals in combat zones, shipping interdicted), not generic industrial-mobilisation demand. Peacetime military surges do not reproduce the effect. The 2025-2027 bullish-commodities prediction attached to NATO 2023-2024 surges is not supported by the seventy-four-year record.

Market microstructure Kitchin — spectral transmission PURE NOISE

Hypothesis: The 3.4-year Kitchin inventory cycle propagates through financial microstructure, producing detectable spectral peaks in the 2.5-4.5 year band of credit spreads (BAA-10Y) and VIX.

Why it seemed plausible: The Kitchin cycle is well-documented in inventory data. If inventory drives real activity it should leave a spectral signature in credit and volatility series sensitive to that activity.

What the tests showed:

  • BAA-10Y Lomb-Scargle peak in Kitchin band (2.5-4.5y): 0.079.
  • Juglar band (7-11y): 0.142.
  • Kuznets band (15-25y): 0.224.

The Kitchin-band peak is the smallest of the three. It is also indistinguishable from phase-randomised surrogates (surrogate null 0.085 plus or minus 0.021; observed 0.079 is below the null mean). Both BAA and VIX fail the peak-significance test.
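The comparison with the surrogate null reduces to a z-score. The sketch below assumes that "plus or minus 0.021" is one standard deviation of the surrogate distribution, which the text does not state explicitly.

```python
def surrogate_z(observed: float, null_mean: float, null_sd: float) -> float:
    """Standardised distance of the observed statistic from the surrogate mean."""
    return (observed - null_mean) / null_sd

# Kitchin-band Lomb-Scargle peak for BAA-10Y versus its surrogate null.
z_kitchin = surrogate_z(observed=0.079, null_mean=0.085, null_sd=0.021)
print(f"z = {z_kitchin:.2f}")
```

The result is about minus 0.29: the observed peak sits slightly below the surrogate mean and well inside its spread.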

Verdict: KILLED as a spectral-transmission claim. This is importantly different from the Kitchin phase-clock signal, which operates on ISRATIO and maps instantaneous phase to equity-regime forward returns. The phase-clock survives a 5-test battery with held-out z = 8.58 on post-2010 data; the spectral-transmission claim on financial series does not. The mechanism operates through real-economic phase, not credit-spread spectral power.


Updated kill rate

As of April 2026 the Observatory’s trading-signal kill rate after formal battery testing is approximately 44 per cent of tested signals. This is higher than the overall Observatory kill rate of 28 per cent because the battery applies a stricter multi-test standard than the original confirmation process. We consider the higher number a feature. The signals that survive battery testing - VIX term structure, gold-silver ratio, Kitchin phase clock, EBP equity stress, VIX regime - carry correspondingly higher confidence.

The main list above illustrates classical kill modes (mechanism, surrogate, adaptive response). The April 2026 set illustrates a different lesson: a signal passing single-test validation does not imply the signal survives held-out replication, era stability, threshold sensitivity, and specificity against controls simultaneously. The five-test battery is how we separate the two.
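The "simultaneously" requirement can be expressed as a short sketch. The test names are taken from the paragraph above, the all-must-pass rule is an assumption about how a surviving signal is defined, and the example signal is hypothetical.

```python
from typing import Dict

def survives_battery(results: Dict[str, bool]) -> bool:
    """A signal survives only if every test in the battery passes."""
    return all(results.values())

# Hypothetical signal that passed single-test validation but not the battery.
example_signal = {
    "held_out_replication": False,
    "era_stability": True,
    "threshold_sensitivity": True,
    "specificity_vs_controls": False,
}

passed = sum(example_signal.values())
print(f"{passed}/{len(example_signal)} passed, "
      f"survives: {survives_battery(example_signal)}")

# Kill rate from the April 2026 trading-signal battery: 7 of 16 signals.
signals_tested = 16
signals_killed = 7
kill_rate = signals_killed / signals_tested
print(f"kill rate: {kill_rate:.1%}")
```

Seven kills out of sixteen is 43.75 per cent, which is where the approximately 44 per cent figure comes from.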


April 2026 long-wave battery retest

Extending the battery to long-wave and transmission-chain hypotheses produced two more consequential kills.

Kondratieff 55-year wave in Bank of England 800-year CPI data PURE NOISE

Hypothesis: A ~55-year Kondratieff long wave in inflation and prices shows a statistically significant spectral peak in the Bank of England Millennium-scale CPI series (1270-2016), reflecting a cycle of technological-innovation clusters (steam/textiles, rail/steel, electricity/chemicals, automobiles/petrochemicals, ICT/digital) driving macro regimes.

Why it seemed plausible: The Kondratieff claim has a long pedigree in heterodox economics. The BoE millennium dataset is one of the longest price series available anywhere. Prior analyses reported spectral significance across multiple long-run datasets. Our own earlier validation gave the signal a CONSILIENCE verdict on the basis of cross-dataset replication.

What the tests showed:

  • Surrogate null (phase-randomised, 500 iterations): observed spectral peak in the 40-65 year band = 0.0035. Surrogate null mean = 0.0034. Z-score 0.12. P = 0.36. The observed peak is indistinguishable from random data of equivalent amplitude spectrum.
  • Era split: pre-1900 p = 0.13 (marginal), post-1900 p = 0.67 (no signal).
  • Band sensitivity: only the 45-55 year band reaches p < 0.05; all other candidate bands (50-60, 55-65, 60-70, 40-80) give p > 0.15.
  • Alternative-dataset replication: Shiller S&P 500 annual returns show a higher peak than BoE CPI, but this is the only additional series tested.
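For readers unfamiliar with phase-randomised surrogates, the sketch below shows how one can be generated: take the Fourier transform, keep every amplitude, replace the phases with random ones, and transform back. It uses a naive DFT from the standard library so that it runs without dependencies. The synthetic input series and all function names are assumptions for illustration; this is not the Observatory's pipeline.

```python
import cmath
import math
import random

def dft(x):
    """Naive discrete Fourier transform (O(n^2), fine for short series)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def idft_real(coeffs):
    """Inverse DFT, returning the real part of each sample."""
    n = len(coeffs)
    return [
        (sum(coeffs[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]

def phase_randomise(x, rng):
    """Surrogate with the same amplitude spectrum as x but random phases."""
    n = len(x)
    original = dft(x)
    shuffled = list(original)  # k = 0 (mean) and the Nyquist term stay untouched
    for k in range(1, (n + 1) // 2):
        phase = rng.uniform(0, 2 * math.pi)
        shuffled[k] = abs(original[k]) * cmath.exp(1j * phase)
        shuffled[n - k] = shuffled[k].conjugate()  # keeps the output real-valued
    return idft_real(shuffled)

# Synthetic series: a 16-step cycle plus noise (illustrative only).
rng = random.Random(42)
series = [math.sin(2 * math.pi * t / 16) + 0.5 * rng.gauss(0, 1) for t in range(64)]
surrogate = phase_randomise(series, rng)

amp_original = [abs(c) for c in dft(series)]
amp_surrogate = [abs(c) for c in dft(surrogate)]
max_amp_gap = max(abs(a - b) for a, b in zip(amp_original, amp_surrogate))
max_value_gap = max(abs(a - b) for a, b in zip(series, surrogate))

print(f"largest amplitude difference: {max_amp_gap:.2e}")
print(f"largest sample difference:    {max_value_gap:.2f}")
```

The surrogate is a different time series with the same amplitude spectrum. One consequence is worth stating: a plain whole-series periodogram is identical for the original and every such surrogate, so the statistic compared against the null has to be one that phase randomisation can change. Which statistic was used for the figures above is not specified here.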

Verdict: KILLED on surrogate grounds. The previously reported 55-year cycle in 800-year price history appears to be an artifact that phase-randomised surrogates reproduce, not a true periodic signal. The earlier CONSILIENCE verdict was overstated because surrogate testing had not been applied at this scale. We are not claiming the cycle cannot exist - only that with the best 800-year dataset available, there is no evidence to distinguish it from noise.

This is an important methodological lesson. A spectral peak observed in a single dataset is not a sufficient basis for the Kondratieff claim. The test that separates a real cycle from a coincidental peak is phase-randomised surrogate comparison, and when we apply it, the 55-year claim does not survive.

Kitchin causal chain: EBP as mediator CONFOUNDED

Hypothesis: The four-step causal chain ISRATIO → EBP → NFCI → VIX mediates the propagation of the Kitchin inventory cycle through to equity volatility. EBP specifically (the Gilchrist-Zakrajsek Excess Bond Premium) captures credit-sentiment shifts before they show up in financial-conditions indices.

Why it seemed plausible: Each pairwise link is statistically significant in VAR(4) on 385 monthly observations. The chain has a coherent economic story.

What the tests showed:

  • Scrambled-EBP null: shuffling the EBP time series breaks the chain decisively (p < 0.0001 for chain preservation). Real signal at the chain level. PASS.
  • Era split (pre-2007 vs post-2010): ISRATIO → EBP link p = 0.044 in train, 0.204 in test. EBP → NFCI link p = 0.17 in train, 0.23 in test. NFCI → VIX link robust in both eras (p < 0.001). The end-point link is solid; the intermediary links are fragile on era splits.
  • Alternative intermediary: replacing EBP with UNRATE (unemployment rate) produced a chain with p < 0.001 - stronger than the EBP-mediated chain at p = 0.033.
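A scrambled-series null of the kind used in the first bullet can be sketched as a shuffle test on a single link. The example below is a minimal illustration on synthetic data, not the VAR-based chain test itself. The series, the seed, and the function names are assumptions, and a plain shuffle ignores autocorrelation, which a real time-series test would need to handle.

```python
import random
import statistics

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def shuffle_null_p(x, y, n_shuffles=500, seed=0):
    """Empirical p-value: how often a shuffled x correlates with y as strongly."""
    rng = random.Random(seed)
    observed = abs(pearson(x, y))
    shuffled = list(x)
    exceed = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if abs(pearson(shuffled, y)) >= observed:
            exceed += 1
    # Add-one form: the estimate can never be exactly zero.
    return (exceed + 1) / (n_shuffles + 1)

# Synthetic linked series (illustrative only): response depends on driver.
rng = random.Random(1)
driver = [rng.gauss(0, 1) for _ in range(200)]
response = [d + 0.5 * rng.gauss(0, 1) for d in driver]

p_link = shuffle_null_p(driver, response)
print(f"empirical p = {p_link:.4f}")
```

With 500 shuffles the smallest value this estimator can return is 1/501, so a result in which no shuffle matches the observed statistic is best reported as a bound (here p < 0.002) rather than as exactly zero.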

Verdict: PARTIAL. The ISRATIO → macro-stress → NFCI → VIX transmission is real. EBP is NOT specifically mediating it - unemployment mediates equally well or better. The claim should be reframed. The forward-bet thesis based on NFCI → VIX (the robust link) is preserved; the EBP-specific-mediator framing is not.


Updated kill/caveat state

The April 2026 long-wave retest adds one full kill (Kondratieff 55-year wave) and one major caveat (EBP not specifically Kitchin-mediating). The total Observatory kill rate rises marginally; more importantly, two widely-cited claims have been reframed.

Taking the trading-signal battery and long-wave battery together, the revised pattern is clear: claims that survive single-test validation often do not survive a multi-test battery, and the distinction is decisive for commercial application.


April 2026 ancient-knowledge battery

One more battery worth publishing here, because the result runs against the narrative we had grown comfortable with.

Ancient knowledge as a privileged validation source CONFOUNDED

Hypothesis: Traditional-knowledge systems (Moerman ethnobotany, Ayurveda, TCM, Ifa divination pharmacology, waru waru Andean engineering, Nubian tetracycline, Vedic constitutional frameworks, Mesoamerican calendrical astronomy, and similar ancestral knowledge traditions) selected practices that hold up under modern science more often than chance would produce. The claim is that the accumulated pattern-recognition of pre-modern empirical cultures surfaces truths that modern science later confirms, and that this makes traditional knowledge a privileged source of research-worthy hypotheses.

Why it seemed plausible: The Observatory has genuine confirmed cases - Nubian tetracycline (bone labels demonstrate therapeutic levels 1,598 years before Western discovery), Ifa pharmacological hit rate (with publication bias caveat), aboriginal fire management, waru waru, berberine for diabetes in TCM, artemisinin from qinghao, and the phylogenetic convergence finding (Saslis-Lagoudakis 2012 PNAS p<0.001 across seven zero-contact traditions). The narrative that ancient cultures encoded durable truths in their practices appeared to fit the data.

What the tests showed:

  • Twenty-six ancient-source signals in the Observatory.
  • Confirmed rate within the ancient subset: 57.7 per cent.
  • Confirmed rate across all other signals: 81.0 per cent.
  • Difference: minus 23.3 percentage points (z = minus 2.98 against a permutation null).
  • Kill rate in the ancient subset: 34.6 per cent.
  • Kill rate across all other signals: 10.4 per cent.
  • Ancient-source signals are killed at 3.3 times the background rate and confirmed at about 71 per cent of the background rate.
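The ratios in these bullets follow directly from the quoted percentages, as the sketch below shows. The implied counts within the 26-signal subset are back-calculated from the percentages and are an inference, not figures stated in the text.

```python
ANCIENT_N = 26  # ancient-source signals in the Observatory

ancient_confirmed_pct = 57.7
other_confirmed_pct = 81.0
ancient_kill_pct = 34.6
other_kill_pct = 10.4

confirm_gap = ancient_confirmed_pct - other_confirmed_pct    # percentage points
kill_ratio = ancient_kill_pct / other_kill_pct               # relative kill rate
confirm_ratio = ancient_confirmed_pct / other_confirmed_pct  # relative confirm rate

# Inferred counts: which whole numbers out of 26 give these percentages?
implied_confirmed = round(ANCIENT_N * ancient_confirmed_pct / 100)
implied_killed = round(ANCIENT_N * ancient_kill_pct / 100)

print(f"confirmation gap: {confirm_gap:+.1f} points")
print(f"kill ratio: {kill_ratio:.1f}x, confirm ratio: {confirm_ratio:.2f}x")
print(f"implied counts: {implied_confirmed} confirmed, {implied_killed} killed of {ANCIENT_N}")
```

The percentages are consistent with 15 confirmed and 9 killed out of 26. With a subset this small, each individual signal moves the rate by almost four percentage points, which is worth keeping in mind when reading the gap.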

Verdict: KILLED as a general claim. The position that ancient knowledge serves as a privileged source of research hypotheses is not supported by the Observatory’s data. Ancient-source signals underperform on both confirmation and kill rate relative to the rest of the Observatory.

What survives: The phylogenetic convergence anchor (Saslis-Lagoudakis 2012) remains a real cross-cultural finding at p less than 0.001. Specific named signals that did pass (Nubian tetracycline, aboriginal fire management, berberine-for-diabetes, waru waru agriculture) are real and remain in the corpus. The phenomenon of a few ancient traditions encoding durable empirical truths is not in question. What fails is the inference from “some ancient claims validate” to “ancient knowledge is a privileged source of validated knowledge.”

Revised narrative: Traditional-knowledge sources produce hypotheses that validate at below-average rates by Observatory standards. The residual positives are real and worth studying case-by-case. The framing of ancient wisdom as a reliable corpus of truths deserves to be retired.

This is an uncomfortable kill for a research programme that has valorised the frequency-convergence and traditional-knowledge themes. We include it here because that is what the evidence says.


April 2026 cosmic ray chain retest

Neutron monitor flux leading corn prices UNDERPOWERED

Hypothesis: Galactic cosmic ray flux, indexed by neutron monitor counts or the inverted sunspot number, leads corn prices by approximately 24 months. A previously-cited figure of r=+0.475 at the 24-month lag had entered the Observatory forecast scenarios as support for the 2031-2032 agricultural-price prediction.

Why it seemed plausible: The Svensmark cosmic-ray-cloud hypothesis has a coherent mechanism (GCR flux modulates low cloud nucleation, which modulates temperature and precipitation, which modulates crop yields). The 2029-2030 solar minimum is predictable astronomy, so if the chain held, the forward prediction would be crisp.

What the tests showed:

  • 64 years of annual data (1961-2024), inverted SILSO sunspot number as GCR proxy, CBOT corn monthly prices resampled to annual.
  • Permutation null: best observed correlation r=0.11 at 1-year lag. Permutation null mean r=0.16. Observed is below null mean. Z-score minus 0.68.
  • Lag-by-lag: r(0)=0.10, r(1)=0.11, r(2)=0.09, r(3)=0.02, r(4)=-0.09, r(5)=-0.13. No peak near the claimed 24-month lag.
  • Era split: pre-1990 r=0.04, post-1990 r=0.30. The recent era shows a meaningful correlation that the full record does not.
  • Direction check: solar-minimum years do precede larger corn price changes two years later (+10.3% vs +2.8% after solar-maximum years). Direction correct, magnitude small.
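The lag-by-lag figures can be compared with the earlier claim in a few lines. The sketch below uses only the values quoted above and takes the signed maximum as "best", on the assumption that the hypothesis predicts a positive correlation. The variable names are illustrative.

```python
# Correlation by lag in years, as quoted in the lag-by-lag bullet.
lag_r = {0: 0.10, 1: 0.11, 2: 0.09, 3: 0.02, 4: -0.09, 5: -0.13}

CLAIMED_R = 0.475       # previously cited correlation
CLAIMED_LAG_YEARS = 2   # the claimed 24-month lag

# Signed maximum: the hypothesis predicts a positive correlation.
best_lag = max(lag_r, key=lag_r.get)
best_r = lag_r[best_lag]

shortfall_vs_best = CLAIMED_R / best_r
shortfall_at_claimed_lag = CLAIMED_R / lag_r[CLAIMED_LAG_YEARS]

print(f"best lag: {best_lag} year(s), r = {best_r:.2f}")
print(f"claimed r is {shortfall_vs_best:.1f}x the best observed value")
print(f"claimed r is {shortfall_at_claimed_lag:.1f}x the value at the claimed lag")
```

The claimed figure is a little over four times the best observed correlation and more than five times the value at the claimed lag itself.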

Verdict: WEAK, 1/4 PASS. The previously-cited r=+0.475 at 24 months is not supported by the data. The best observed correlation is roughly a quarter of the claimed value, and it falls at a different lag. Direction is correct, but the magnitude does not support the 2031-2032 agricultural-price prediction at the strength originally attached to it. The post-1990 subsample showing r=0.30 is worth watching as a possible recent-era regime but is not a substitute for the stronger long-run claim.

Revised narrative: The GCR-climate-crop mechanism remains biophysically plausible; the Observatory’s own data provides only weak empirical support for the commodity-price signal at the claimed strength.