Timeline

Chronological record of research activity — papers reviewed, forecasts published, projects completed. Updated as new work is done.

Entry types: Paper, Forecast, Essay, Note, Hypothesis, Project

May 2026

C++20 engine with a lock-free MPSC ring buffer, a 3-state HMM regime classifier, and GARCH(1,1) volatility-scaled position sizing. Completed.
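For readers unfamiliar with GARCH(1,1) volatility-scaled sizing, the idea can be sketched as follows. This is an illustrative Python sketch, not the engine's C++ code: the function names, the `target_vol` and `max_leverage` parameters, and the choice to start the recursion at the unconditional variance are all assumptions made here. The recursion itself is the standard one, sigma2[t] = omega + alpha * r[t-1]^2 + beta * sigma2[t-1].

```python
import math


def garch_variance_path(returns, omega, alpha, beta):
    """Conditional variance forecasts from the standard GARCH(1,1) recursion.

    path[t] is the variance forecast for period t using returns up to t-1;
    the final element is the one-step-ahead forecast after the last return.
    """
    if alpha + beta >= 1.0:
        raise ValueError("alpha + beta must be < 1 for a finite long-run variance")
    # Assumption: initialise at the unconditional (long-run) variance.
    sigma2 = omega / (1.0 - alpha - beta)
    path = []
    for r in returns:
        path.append(sigma2)
        sigma2 = omega + alpha * r * r + beta * sigma2
    path.append(sigma2)
    return path


def vol_scaled_size(sigma2, target_vol, max_leverage=1.0):
    """Position size as a fraction of capital: target volatility divided by
    forecast volatility, capped at max_leverage (both hypothetical knobs)."""
    sigma = math.sqrt(sigma2)
    if sigma == 0.0:
        return max_leverage
    return min(target_vol / sigma, max_leverage)


if __name__ == "__main__":
    forecasts = garch_variance_path([0.0, 0.02], omega=1e-6, alpha=0.1, beta=0.8)
    for s2 in forecasts:
        print(s2, vol_scaled_size(s2, target_vol=0.01, max_leverage=2.0))
```

Sizing shrinks when the forecast variance rises after a large return and grows, up to the cap, when the market is quiet.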

Neural network for market regime classification across Bear/Sideways/Bull states. Used to gate signal execution within TQC.

Initial estimate. Based on reference class forecasting and current scaling trajectory.

Initial estimate. EU AI Act in force; US legislative path remains the critical uncertain variable.

Initial estimate. Causal validation at frontier scale remains out of reach; research trajectory is promising.

Initial estimate. Stargate infrastructure implies capability; energy and construction constraints are the binding variables.

Initial estimate. Primary pathways: critical infrastructure cyberattack and autonomous weapons in active conflict zones.

Universality claim is the most consequential bet in mechanistic interpretability. Evidence so far supports it as more than a convenient assumption.

Capability phase transitions may be traceable to specific structural thresholds — forecastable from mechanistic analysis rather than only discoverable post-hoc.

Residual stream framing converts opaque forward pass into compositional operations. Key enabling condition for meaningful third-party auditing.

HMM belief-state geometry in the residual stream connects directly to the TQC regime classification work. Raises the question of what a misspecified belief state would look like inside a model.

The framework's real contribution is methodological — formalising how to update timeline forecasts as compute costs fall and model efficiency improves.

Apparent discontinuity may be an artefact of evaluation metric design. If Schaeffer et al. are correct, governance frameworks calibrated for capability cliffs are the wrong design target.
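The metric-artefact argument can be illustrated with a toy calculation. The sketch below assumes a hypothetical model whose per-token accuracy improves smoothly, and scores it with an all-or-nothing exact-match metric over a 30-token answer, treating tokens as independent. The numbers and function names are invented for illustration and are not taken from the paper.

```python
def exact_match_rate(per_token_accuracy, answer_length):
    """Probability that every token of an answer is correct, assuming
    (for illustration only) that token errors are independent."""
    return per_token_accuracy ** answer_length


if __name__ == "__main__":
    answer_length = 30  # hypothetical answer length in tokens
    # Hypothetical smooth improvement in per-token accuracy across model sizes.
    for p in (0.80, 0.85, 0.90, 0.95, 0.99):
        print(f"per-token {p:.2f} -> exact match {exact_match_rate(p, answer_length):.3f}")
```

Under these assumptions, a gain of nine points in per-token accuracy (0.90 to 0.99) moves exact match from under 5% to over 70%, so the same underlying improvement looks gradual on one metric and cliff-like on the other.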

Framework needs updating for the post-GPT-4 environment. The politics and governance of AI have largely collapsed into a single problem.

Third-party auditing is the correct structural intervention, but the audit methodology is critically underdeveloped. Interpretability research is the enabling condition.

Specification gaming is not a pathology of poorly designed systems — it is the default behaviour of optimisers operating near the boundary of their specification.

CIRL is correct as a theoretical frame and insufficient as a governance blueprint. The gap between those two assessments is where the most interesting work is happening.

Governance frameworks that rely on benchmark-based compliance verification are structurally inadequate. Alignment cannot be verified on a fixed evaluation set.

Models the co-evolution of capability growth and deployment incentives. The governance lag it describes is the central variable that policy researchers should be trying to compress.

Research began · 2025