Screwcap Games · Research Note · v1.0
Earned Conviction
A decision-science framework for sequenced, experience-first training in commodities prediction markets.
Most retail traders form opinions about markets long before they ever test those opinions against consequence. Gold Digger is built to invert that order. This note sets out the reasoning behind the product's central design choice: that a player should not begin in the live present, but should first earn a position through structured, experience-first training. We ground the choice in three literatures: the description–experience gap in risky choice, the distinction between kind and wicked learning environments, and John Boyd's OODA loop as a model of decision-making under time pressure. We argue that naïve historical replay can quietly reproduce the very pathology it aims to cure, and we specify the three design safeguards (anonymisation, calibration-based scoring, and teaching the distribution of outcomes rather than a single path) that turn practice in a wicked domain into a kind learning environment. We also state plainly what a simulation cannot teach.
§1 The problem of the unearned position
A newcomer dropped into a live market does not arrive empty-handed. They arrive with opinions — assembled from headlines, forum sentiment, and the confident voice of whichever commentator they last heard. What they lack is any record of whether those opinions survive contact with outcomes. The premise of Gold Digger is that this gap between holding a view and having tested it is the single most expensive thing in a trader's early career, and that it can be closed cheaply, before real capital is ever at risk.
The intuition we set out to formalise is simple: starting in the present feels natural and agreeable, but it is not an earned position. The familiarity of today's headlines flatters the player into believing they understand a situation they have merely recognised. Learning to read the past, and then learning to act in the present, produces a more durable kind of skill than either alone. The literatures below explain why that intuition is right — and, just as importantly, where it goes wrong if implemented carelessly.
§2 Description versus experience
Decision researchers have long known that people make systematically different choices about risk depending on how they came to know the odds. When probabilities are handed over as descriptions — a number, a forecast, a brochure — people tend to overweight rare events. When the same probabilities must instead be learned by living through outcomes one at a time, people tend to underweight those rare events. The foundational demonstration is Hertwig, Barron, Weber & Erev (2004), with the broader synthesis in Hertwig & Erev (2009). The phenomenon is robust enough to be called, plainly, the description–experience gap.
A player reasoning from today's headlines is making a decision from description. A player who has traded through a regime has made a decision from experience. The two produce different traders.
This is the academic skeleton of the "unearned position." The new player in the live present is, definitionally, deciding from description. The player who has sat inside a market regime — felt a thesis form, watched it tested, taken the loss when it failed — has converted that same information into experience. Designing the onboarding so that a player's first encounters with risk are experiential is therefore not a stylistic preference; it changes the cognitive representation the player builds of how markets behave.
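One commonly cited contributor to the experience side of the gap is plain sampling: in a short run of outcomes, a rare event often never appears at all, so there is nothing for the learner to weight. The sketch below is illustrative arithmetic only. The 5% probability and the sample sizes are assumptions chosen for the example, not figures from the cited studies, and the function name is hypothetical.

```python
def prob_never_seen(p_rare: float, n_draws: int) -> float:
    """Chance that an event with per-draw probability p_rare
    never occurs in n_draws independent draws."""
    return (1.0 - p_rare) ** n_draws


# Assumed example: a 5% event sampled over short and long runs.
for n in (5, 10, 20, 50):
    share = prob_never_seen(0.05, n)
    print(f"{n:>3} draws: {share:.1%} chance the rare event never shows up")
```

Under these assumptions, a player who lives through ten outcomes has roughly a 60% chance of never meeting the 5% event, which is one reason short experience tends to underweight it.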
§3 Kind and wicked learning environments
Experience, however, is not automatically educational. Whether practice produces genuine skill or merely confident error depends on the structure of the feedback. Robin Hogarth's distinction between kind and wicked learning environments (Hogarth, 2001; Hogarth, Lejarraga & Soyer, 2015) is the pivotal idea. In a kind environment, feedback is accurate, plentiful, and tightly linked to the action that produced it — chess is the canonical example. In a wicked environment, the rules drift, patterns do not cleanly repeat, and feedback may be delayed, noisy, or actively misleading.
Markets are the textbook wicked environment, which is precisely why experience in them so often teaches the wrong lesson. Hogarth's favourite illustration is a celebrated physician who could "diagnose" typhoid by palpating patients' tongues and was in fact spreading it with his hands; each apparent confirmation taught him the worst possible lesson. Kahneman & Klein (2009), writing from opposing research traditions, converge on the same conclusion: the quality of an intuitive judgment depends on the predictability of the environment and the learner's opportunity to grasp its regularities. Give someone repetitions in a wicked environment and you do not manufacture an expert; you manufacture someone who is wrong with conviction.
Naïve historical replay can look kind while staying wicked. If the player recognises the period, they recall an answer rather than make a call. A single replayed path teaches one realised outcome, not the distribution of outcomes that could have occurred. And paper money never bleeds, so the emotional half of the lesson is absent. Built carelessly, history-first onboarding becomes a confidence factory.
§4 Manufacturing a kind environment
The encouraging finding is that kindness can be engineered. Hogarth & Soyer (2011) showed that letting people experience sequentially simulated outcomes converts an opaque, easily misread description into kind experience, which is close to a literal specification for the mechanic Gold Digger is built around. The later book by the same authors, The Myth of Experience (Soyer & Hogarth, 2020), supplies the cautionary bookend: unstructured experience misleads at least as often as it instructs.
Our design therefore treats three safeguards as non-negotiable:
1 · Anonymise the scenario
During play, the historical period's identity — its date and, where feasible, its instrument labels — is withheld and revealed only after the verdict. A player who knows they are inside the 2008 unwind is testing memory, not judgment. Revealing the market one observation at a time is the same device that makes professional bar-replay tools effective: it removes hindsight by letting the past unfold as if it were live.
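A minimal sketch of how such a scenario might be prepared, assuming bars arrive as date and closing-price pairs. The `Bar` type, the rebasing to 100 (so the raw price level does not give the period away), and the `warmup` parameter are all hypothetical choices made for illustration; the note itself only specifies that date and labels are withheld and that the market is revealed one observation at a time.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass(frozen=True)
class Bar:
    date: str     # kept server-side, never shown during play
    close: float  # raw closing price


def anonymise(bars: List[Bar], base: float = 100.0) -> List[float]:
    """Drop the dates and rebase prices so the first bar equals `base`."""
    if not bars:
        return []
    first = bars[0].close
    return [base * b.close / first for b in bars]


def reveal(series: List[float], warmup: int) -> Iterator[List[float]]:
    """Yield a growing window: the player only sees bars up to 'now'."""
    for end in range(warmup, len(series) + 1):
        yield series[:end]


# Placeholder data, not a real market period.
bars = [Bar("day-1", 50.0), Bar("day-2", 55.0), Bar("day-3", 52.5)]
for visible in reveal(anonymise(bars), warmup=2):
    print(visible)
```

Rebasing is only a partial disguise: a distinctive price shape can still be recognised, which is why the identity is confirmed to the player only after the verdict.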
2 · Score calibration, not luck
Because a wicked single-outcome path can reward bad decisions and punish good ones, the headline metric is not profit-and-loss but calibration: did the player's stated confidence match their realised hit-rate? This is operationalised with a Brier-style score (Brier, 1950), a standard scoring rule for probabilistic forecasts: the mean squared difference between the stated probability and the outcome, where lower is better. Rewarding well-justified decisions rather than lucky ones is exactly how a wicked domain is made kind.
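The scoring idea can be sketched as follows, assuming each call is recorded as a stated probability that the chosen direction is right and a 0/1 outcome. The function names and the five-bin table are illustrative assumptions, not the product's actual scoring code.

```python
from typing import List, Sequence, Tuple


def brier_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared gap between stated probability and outcome
    (1 = the call was right, 0 = it was wrong). Lower is better."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("need equal-length, non-empty inputs")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def calibration_table(
    probs: Sequence[float], outcomes: Sequence[int], n_bins: int = 5
) -> List[Tuple[float, float, int]]:
    """Per confidence bin: (mean stated confidence, realised hit-rate, count)."""
    bins: List[List[Tuple[float, int]]] = [[] for _ in range(n_bins)]
    for p, o in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the top bin
        bins[idx].append((p, o))
    rows = []
    for b in bins:
        if b:  # skip empty bins
            mean_conf = sum(p for p, _ in b) / len(b)
            hit_rate = sum(o for _, o in b) / len(b)
            rows.append((mean_conf, hit_rate, len(b)))
    return rows


# Assumed example: four calls, two at 90% confidence and two at 60%.
probs = [0.9, 0.9, 0.6, 0.6]
outcomes = [1, 0, 1, 0]
print(brier_score(probs, outcomes))
print(calibration_table(probs, outcomes))
```

A player who always states 50% scores 0.25 whatever happens, so beating 0.25 requires confidence that actually tracks outcomes. The table makes the same point visually: in a well-calibrated record, the hit-rate in each row sits close to the mean stated confidence.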
3 · Teach the distribution, not the path
Where possible, the same setup is presented with more than one plausible continuation, and every debrief frames the result as "what happened, and what could reasonably have happened." It is the cheapest available insurance against hindsight-fuelled overconfidence.
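The note does not specify how alternative continuations are generated. One simple possibility, shown here purely as an assumption, is to resample the returns observed before the decision point with replacement. This treats returns as independent draws and therefore ignores volatility clustering and trends, so it should be read as a teaching device rather than a market model. All names are hypothetical.

```python
import random
from typing import List


def continuations(
    past_returns: List[float],
    last_price: float,
    horizon: int,
    n_paths: int,
    seed: int = 0,
) -> List[List[float]]:
    """Build several plausible price paths by resampling past returns
    with replacement (a plain i.i.d. bootstrap)."""
    rng = random.Random(seed)  # seeded so a debrief can be reproduced
    paths = []
    for _ in range(n_paths):
        price, path = last_price, []
        for _ in range(horizon):
            price *= 1.0 + rng.choice(past_returns)
            path.append(price)
        paths.append(path)
    return paths


def share_above(paths: List[List[float]], level: float) -> float:
    """Fraction of paths that finish above `level`."""
    return sum(path[-1] > level for path in paths) / len(paths)


# Placeholder returns, not real market data.
paths = continuations([0.01, -0.02, 0.005, -0.004], 100.0, horizon=10, n_paths=200)
print(f"{share_above(paths, 100.0):.0%} of resampled paths finished higher")
```

A debrief could then set the realised path against this spread: a long call that paid off reads differently when only a minority of plausible continuations finished higher.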
§5 The OODA loop as the interaction model
The moment-to-moment loop the player runs is borrowed, deliberately, from military aviation. The OODA loop — Observe, Orient, Decide, Act — was developed by USAF Colonel John Boyd to describe how a pilot prevails not through superior hardware but through cycling decisions faster and more coherently than an adversary (Boyd, 1996; for the authoritative treatment, Osinga, 2007). Crucially, Boyd's own model is not a tidy four-box circle: Orientation is the load-bearing step, the point at which observations are weighed against everything the decision-maker already knows. Without orientation, data means nothing.
The first scenario is paced for orientation, not speed. You watch the market move before you are asked to move with it.
This shapes the onboarding directly. A player's first session is a synthetic, fictional scenario — no real history to recognise, no capital to lose — paced slowly so the loop can be learned one stage at a time. They observe the feed and the price action; they orient with scaffolding that explains how a move in the dollar or in real yields bears on the metal; they decide on a direction and a confidence; they act, and the clock runs. Then the loop repeats, tightening with each pass as the scaffolding falls away and the verdict window shrinks. Only once the loop is internalised does the player reach the fork: historical rounds (anonymised) or the live present.
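The tightening described above can be sketched as a small schedule, assuming the scaffolding is a count of visible hints and the verdict window is measured in seconds. Every number here (three hints, a 120-second starting window, a 0.75 shrink factor, a 15-second floor) is a placeholder invented for illustration, as are the type and function names.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Stage(Enum):
    OBSERVE = "observe"
    ORIENT = "orient"
    DECIDE = "decide"
    ACT = "act"


def next_stage(stage: Stage) -> Stage:
    """Advance through the loop; ACT wraps back to OBSERVE."""
    order = list(Stage)
    return order[(order.index(stage) + 1) % len(order)]


@dataclass(frozen=True)
class PassConfig:
    pass_number: int
    hints_shown: int         # orientation scaffolding still visible
    verdict_window_s: float  # seconds before the call is judged


def schedule(
    n_passes: int,
    start_hints: int = 3,
    start_window_s: float = 120.0,
    shrink: float = 0.75,
    min_window_s: float = 15.0,
) -> List[PassConfig]:
    """One config per pass: one fewer hint and a shorter window each time."""
    configs = []
    window = start_window_s
    for i in range(n_passes):
        hints = max(start_hints - i, 0)
        configs.append(PassConfig(i + 1, hints, max(window, min_window_s)))
        window *= shrink
    return configs


for cfg in schedule(5):
    print(cfg)
```

The floor on the window reflects the note's emphasis on orientation: the loop speeds up, but never to the point where the player has no time to weigh what they have observed.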
§6 The training ladder
The progression (synthetic → anonymised historical → live present) mirrors how serious practitioners actually train, and why. Deliberate practice theory (Ericsson, Krampe & Tesch-Römer, 1993) holds that skill is built through focused repetition at the edge of current ability with immediate feedback; historical replay suits this well because it compresses reps, letting a player run dozens of regime-snippets in a sitting rather than waiting months for them to occur. We note the literature is contested: a direct replication found the effect of practice real but smaller than originally claimed (Macnamara & Maitra, 2019). That is itself a reason to pair reps with the calibration scoring above rather than rewarding volume alone.
The final rung is treated with deliberate honesty. Graduating to the present does not merely add recency; it adds consequence — the one ingredient a simulation structurally cannot supply. We say so plainly, because a training ladder that pretends the sim is the real thing teaches its own unearned confidence.
§7 Measurement & the public record
Every call a player makes is logged immutably — no edits, no quiet deletions — because the integrity of the record is the product. From that record we surface accuracy by market and by regime, confidence calibration over time, and the conditions under which a player's judgment holds or breaks. The leaderboard built on these records is deliberately public and free, not gated behind the subscription. A visible, competitive, shareable record is the engine of return play and word-of-mouth; the paid tier instead unlocks analytical depth — the laboratory, not the scoreboard. Gating the social hook would suppress the very behaviour the product depends on.
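The note does not say how immutability is enforced. One common pattern, shown here as an assumption rather than a description of the product, is an append-only log in which each entry stores a hash of the previous one, so that any edit or mid-log deletion breaks the chain. The field names in the example calls are hypothetical.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, call: Dict) -> str:
    """SHA-256 over the previous hash plus a canonical JSON form of the call."""
    payload = json.dumps(call, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_call(log: List[Dict], call: Dict) -> None:
    """Append a call, chaining it to the entry before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append(
        {"call": dict(call), "prev": prev_hash, "hash": _digest(prev_hash, call)}
    )


def verify(log: List[Dict]) -> bool:
    """Recompute the chain; False if any entry was edited or removed mid-log."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["call"]):
            return False
        prev_hash = entry["hash"]
    return True


log: List[Dict] = []
append_call(log, {"round": 1, "direction": "long", "confidence": 0.7})
append_call(log, {"round": 2, "direction": "short", "confidence": 0.6})
print(verify(log))

log[0]["call"]["confidence"] = 0.99  # a quiet edit after the fact
print(verify(log))
```

A chain like this detects edits and deletions inside the log, but not truncation from the end; catching that would require the latest hash to be stored or published somewhere the log's owner cannot rewrite.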
Design principles, condensed
| Principle | Why |
|---|---|
| Experience before description | Closes the description–experience gap; builds an experiential model of risk. |
| Anonymise history | Removes hindsight so the player makes a call, not a recollection. |
| Calibration over P&L | Makes a wicked single-path domain kind by rewarding good decisions, not luck. |
| Orientation-first pacing | Honours orientation as Boyd's Schwerpunkt, the focal point of the loop; the first loop is slow on purpose. |
| Honest graduation | The present adds consequence the sim cannot; saying so protects against false confidence. |
| Free public leaderboard | The record is the product; the social hook drives retention and reach. |
§8 Limitations & commitments
We are candid about the ceiling of this approach. A simulation cannot reproduce slippage, real liquidity, fees, or — most of all — the behaviour-altering pressure of real money, and a strong record inside the trainer is not evidence of a live edge. Gold Digger is an educational, simulated environment; it is not financial advice and is not a brokerage. Market data shown in the live mode is delayed. Where the trainer later links out to real platforms, those links exist to let a trained player take an earned next step, not to manufacture one. The point of the whole apparatus is to make a player's conviction earned — and to be honest about exactly how far that earning goes.
§9 Data & community resources
The trainer draws on public data and public discourse. Sources are listed here in full and linked directly, in the spirit of attribution.
Market & macro data
Community intelligence
From the Screwcap Games studio
§ References
- Boyd, J. R. (1996). The Essence of Winning and Losing [briefing].
- Brier, G. W. (1950). Verification of forecasts expressed in terms of probability. Monthly Weather Review, 78(1), 1–3.
- Ericsson, K. A., Krampe, R. T., & Tesch-Römer, C. (1993). The role of deliberate practice in the acquisition of expert performance. Psychological Review, 100(3), 363–406.
- Hertwig, R., Barron, G., Weber, E. U., & Erev, I. (2004). Decisions from experience and the effect of rare events in risky choice. Psychological Science, 15(8), 534–539.
- Hertwig, R., & Erev, I. (2009). The description–experience gap in risky choice. Trends in Cognitive Sciences, 13(12), 517–523.
- Hogarth, R. M. (2001). Educating Intuition. University of Chicago Press.
- Hogarth, R. M., & Soyer, E. (2011). Sequentially simulated outcomes: Kind experience versus nontransparent description. Journal of Experimental Psychology: General, 140(3), 434–463.
- Hogarth, R. M., Lejarraga, T., & Soyer, E. (2015). The two settings of kind and wicked learning environments. Current Directions in Psychological Science, 24(5), 379–385.
- Kahneman, D., & Klein, G. (2009). Conditions for intuitive expertise: A failure to disagree. American Psychologist, 64(6), 515–526.
- Macnamara, B. N., & Maitra, M. (2019). The role of deliberate practice in expert performance: revisiting Ericsson, Krampe & Tesch-Römer (1993). Royal Society Open Science, 6(8), 190327.
- Osinga, F. P. B. (2007). Science, Strategy and War: The Strategic Theory of John Boyd. Routledge.
- Soyer, E., & Hogarth, R. M. (2020). The Myth of Experience. PublicAffairs.