Virtual Trading Platform

Experiment settings

Trade settings (N = 10)

Fundamentalists N(F) 0

Trend followers N(T) 0

Risk preferences

Risk-loving · ρ ∈ (−1, 0) · 33%

Risk-neutral · ρ = 0 · 34%

Risk-averse · ρ ∈ (0, 1) · 33%

AI endpoint Plan II · LLM + utility forms

Required for Plan II. Every period boundary, each Utility agent calls the LLM for a direct trading action (BUY_NOW, SELL_NOW, BID, ASK_1, HOLD) plus a self-drawn x ∈ [1%, 10%] that scales best_bid / best_ask for the BID and ASK_1 quotes. The structured prompt includes market rules, decision-making principles, role-specific guidance, and the explicit closed-form utility function $U(w; \rho_i) = w^{1-\rho_i}/(1 - \rho_i)$ with the agent's sampled $\rho_i$ substituted in. Fields are read fresh from the DOM on every run (no localStorage); if the key is empty the Start button will refuse to launch.

Provider

API key

Endpoint (optional)

Model

Bounded Rationality · K=3, N=5, T=3, σ=10, p=0.10

AI endpoint Plan III · LLM + risk label only

Required for Plan III. Every period boundary, each Utility agent calls the LLM for a direct trading action. The structured prompt supplies market rules, decision-making principles, and only a natural-language risk-preference label (risk-loving, risk-neutral, or risk-averse) — no closed-form utility function is provided. Fields are read fresh from the DOM on every run (no localStorage); if the key is empty the Start button will refuse to launch.

Provider

API key

Endpoint (optional)

Model

Bounded Rationality · K=3, N=5, T=3, σ=10, p=0.10

Advanced settings population scale · treatment labels

Population 10 agents

Rounds per session 4 rounds

Replacement round r = 4

Session Replacement Rate(%), Pre/Post Asset & FV Correlation —

fundamental weight 1.00

noise 5.0

self (non-peer) weight 0.60

growth 0.15

decay 0.30

anchor 0.50

trend 0.20

dividend 0.20

narrative 0.10

Σβ total 1.00

target = 1.00

Prior Bias Prior Noise

Regulator off

Paper constants Dufwenberg, Lindqvist & Moore (2005), §I. Design

N = 10: Subjects per session. DLM 2005 §I pins the original design at six subjects: “At each session, six subjects participated in a sequence of four consecutive markets for an experimental asset.” This simulator scales the population to N = 100 for a thicker order book while preserving the four-round session structure. [§I, p. 1733 · scaled in main.js · switch in Advanced settings]
rounds / session = 4: Consecutive markets played by the same subjects. “A session involved four consecutive markets. In the following, we shall talk in terms of four different rounds. Note the distinction between rounds and periods; a round (being a market) consists of ten periods.” [§I, p. 1733 · slider in Advanced settings]
T = 20: Asset life, in periods (per round). “An asset's life span is ten periods.” The original Dufwenberg–Lindqvist–Moore (2005) bubble experiment fixes T = 10; this simulator doubles the horizon to T = 20 periods for a finer-grained staircase while preserving FV₁ = 100¢ (paired with the halved dividend below). [§I, p. 1732 · scaled in main.js]
dividend ∈ {0, 10}¢: Per-period draw, equiprobable. “In each period, it pays a dividend of 0 or 20 U.S. cents, with equal probability.” This simulator halves the support to {0, 10}¢ so the per-period expected dividend drops to 5¢, keeping FV₁ = 100¢ under the doubled T = 20. [§I, p. 1732 · scaled in assets.js]
E[d] = 5¢: Expected dividend per period. “The expected dividend in each period is 10 cents (= ½ × 0 cents + ½ × 20 cents).” Under the simulator's halved dividend support this becomes E[d] = ½ × 0 + ½ × 10 = 5¢. [§I, footnote 5 · scaled in assets.js]
FV_t: Fundamental value, by backward induction. “With k periods remaining, the fundamental value is k × 10 cents.” Under the simulator's E[d] = 5¢ this becomes FV = k × 5¢, so FV₁ = 20 × 5 = 100¢ still holds at the round boundary. [§I, p. 1732 · scaled in assets.js]
endowment cash ~ U[800, 1200]¢ · inv ~ U{2, 3, 4}: Per-agent starting bundle (simulator replaces the paper's two discrete types). “Before a market opened, half of the traders started with 200 cents and six assets, while each of the other traders started with 600 cents and two assets.” The original Dufwenberg–Lindqvist–Moore (2005) bubble experiment pins two discrete bundles (A = 200¢ + 6 shares, B = 600¢ + 2 shares) with identical 800¢ buy-and-hold value under the risk-neutral fundamental. This simulator replaces the two-type design with an independent per-agent draw: cash uniform on [800, 1200]¢ and inventory uniform on {2, 3, 4} shares, sampled in sampleEndowment() (js/agents.js, ENDOWMENT_DEFAULT). Each draw is independent across agents and editable before Start via the Agents panel. [§I, p. 1733 · js/agents.js, ENDOWMENT_DEFAULT]
round-4 replacement R4-⅔ or R4-⅓: Two-treatment design. “In the fourth round, depending on treatment, two or four experienced subjects who had participated in the first three rounds were randomly selected, removed, and replaced by the same number of inexperienced subjects.” The paper labels these conditions by the fraction of experienced subjects remaining in round 4: R4-⅔ (four veterans + two fresh, shorthand T2) and R4-⅓ (two veterans + four fresh, shorthand T4); the R4-⅔ / R4-⅓ notation appears in the hypothesis row of Table 2. [§I, p. 1733; Table 2, p. 1735]
sessions = 10: Five per treatment (R4-⅔ and R4-⅓). The multi-session batch runner in the DLM panel reproduces DLM's 10-session design (scaled to N = 100) by sequencing 5 × T20 (R4-⅔) then 5 × T40 (R4-⅓) through the simulator in one click; each session uses a fresh engine seed and a fresh endowment draw. [§I, Table 1]
payoff Σ final cash + 500¢: Session payoff per subject. “Subjects were privately paid, in cash, the amount of their final cash holdings from each round. They were also paid a show-up fee of $5.” All four rounds count; shares held at the end of a round are worth nothing (the asset’s life span has ended). [§I, p. 1735]
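
The staircase implied by the constants above can be checked with a few lines of JavaScript. This is a minimal sketch, not the code in assets.js: the function name `fundamentalValue` and its parameter names are hypothetical, and only the numbers (T = 20, dividend support {0, 10}¢, and the paper's T = 10 with {0, 20}¢) come from the text.

```javascript
// Hypothetical helper (not from assets.js): FV_t = E[d] * (T - t + 1).
function fundamentalValue(t, T = 20, dividends = [0, 10]) {
  if (!Number.isInteger(t) || t < 1 || t > T) {
    throw new RangeError(`period t must be an integer in [1, ${T}]`);
  }
  // The support is equiprobable, so the expected dividend is the plain mean.
  const expectedDividend =
    dividends.reduce((sum, d) => sum + d, 0) / dividends.length;
  return expectedDividend * (T - t + 1);
}

// Simulator scaling: 20 periods, dividends in {0, 10} cents.
const simulatorPath = Array.from({ length: 20 }, (_, i) =>
  fundamentalValue(i + 1)
);

// Original paper scaling: 10 periods, dividends in {0, 20} cents.
const paperFirstPeriod = fundamentalValue(1, 10, [0, 20]);

console.log(simulatorPath[0], simulatorPath[19], paperFirstPeriod);
```

Both scalings start at 100¢ in period 1, which is the invariant the halved dividend is meant to preserve.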

Hidden Constants

ticks / period = 18: Agent decision rounds inside one period. DLM 2005 runs a continuous 2-minute z-Tree double auction per period; this simulator discretizes that window into 18 decision rounds (≈ one agent turn every 6.7 real-time seconds) so the engine loop can step deterministically. 18 is dense enough to reproduce the bubble-crash pattern while keeping the replay buffer compact. [engine.js · period-boundary trigger]
naive prior weight = 0.60: Belief blend for naive Utility agents. Weight on the agent's own prior when blending incoming peer messages: $V_i^{\text{post}} = 0.60 \cdot V_i^{\text{prior}} + 0.40 \cdot \bar{m}$, i.e. $w = 0.60$ in the Plan I formula (see Architecture Figure 3). Not specified by DLM 2005, which studies human subjects and has no belief-update model. Chosen so naive agents move noticeably toward peers without collapsing onto them. [agents.js · UTILITY_DEFAULTS.naivePriorWeight]
skeptical prior weight = 0.90: Belief blend for skeptical Utility agents. Same convex combination as the naive weight but $w = 0.90$: $V_i^{\text{post}} = 0.90 \cdot V_i^{\text{prior}} + 0.10 \cdot \bar{m}$, so a skeptical agent hears messages but is barely moved by them. Not in DLM 2005; introduced so the strategy cube contains a "listen but don't trust" archetype. [agents.js · UTILITY_DEFAULTS.skepticalPriorWeight]
adaptive weight cap = 0.50: Max one-period belief shift toward peers. Upper bound on the fraction of belief an adaptive agent can shift toward the trust-weighted message mean $\bar{m}$ in a single period: even with fully-trusted senders, $w \geq 0.50$ so $V_i^{\text{post}}$ is at most 50% $\bar{m}$ + 50% $V_i^{\text{prior}}$. Not in DLM 2005; guards against runaway over-update from a single high-trust period. [agents.js · UTILITY_DEFAULTS.adaptiveWeightCap]
valuation noise = ±3% (legacy): Per-tick uniform noise on the Utility-agent prior. Superseded: now replaced by the v3 §2 Gaussian noise term $\varepsilon_i \sim \mathcal{N}(0, \sigma_i^2)$, scaled by the experience-indexed $\sigma_i$ (see the novice valuation noise entry below). The legacy uniform draw $\varepsilon \sim \mathcal{U}[-n, n]$, $n = 0.03$ is still carried on each agent spec for backwards compatibility with older replays but no longer feeds updateBelief. Kept here to document the pre-v3 behaviour. [agents.js · UTILITY_DEFAULTS.valuationNoise (inert)]
trust λ = 0.30: EMA learning rate for the pairwise trust update. Pairwise trust is updated as $\tau_{r \to s} \leftarrow (1 - \lambda)\,\tau_{r \to s} + \lambda \cdot \text{closeness}$, where closeness $= \max(0,\, 1 - |\hat{v}_s - \mathrm{VWAP}_t| / \mathrm{VWAP}_t)$. $\lambda = 0.30$ weights each new observation at 30%. Not in DLM 2005, which has no messaging layer; chosen for a balance between responsiveness and stability. [engine.js · TrustTracker period close-out]
passive fill probability = 0.30: $p_{\text{fill}}$ heuristic for scoring non-crossing quotes. Expected-utility score for a passive quote is $\mathrm{EU}(\alpha) = p_{\text{fill}} \cdot U(w_1) + (1 - p_{\text{fill}}) \cdot U(w_0)$ with $p_{\text{fill}} = 0.30$ (see Architecture Figure 2). A full model would estimate $p_{\text{fill}}$ from order-book state; this is a deliberate constant placeholder and is not proposed by DLM 2005. [agents.js · UtilityAgent scoring loop]
bias magnitude = 15%: Persistent over/under-valuation of biased Utility agents. Applied as $b_i = \delta_i \cdot \beta$ with $\beta = 0.15$; sign set by the per-slot bias direction $\delta_i \in \{-1, 0, +1\}$ (see Architecture Figure 2). Drives the biased U-agent slots in the default strategy cube (U2, U4, U5). Not in DLM 2005; chosen large enough to perturb the market without dominating the risk-preference split. [agents.js · UTILITY_DEFAULTS.biasAmount]
self (non-peer) weight step Δ_ω = 0.10, k_max = 3: Per-round increment and saturation horizon for ω_i. The anchor ω_0 has been promoted to Advanced settings → Experience anchors alongside α_0 / γ_α / σ_0 / γ_σ; only the step size Δ_ω = 0.10 and saturation horizon k_max = 3 stay fixed here, so at the default anchor ω_0 = 0.60 the ramp walks 0.60 → 0.70 → 0.80 → 0.90 for k_i ∈ {0, 1, 2, ≥3} and saturates at 0.90. ω_i is the convex weight on the agent's own prior when it blends peer opinion: V^post_i,t = ω_i·V^prior_i,t + (1 − ω_i)·m̄_t. Asset-swap blend from round r on: ω_new = |corr|·ω_trained + (1 − |corr|)·ω_0. [utility.js · ExperienceConfig.omegaStep, omegaKmax; ui.js · UI._blendExperience]
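
The self-weight ramp and the asset-swap blend in the last entry can be sketched as follows. The function names `selfWeight` and `blendTowardAnchor` are hypothetical and are not claimed to match utility.js or ui.js; the defaults (anchor 0.60, step 0.10, saturation after three rounds) are the ones quoted above.

```javascript
// Hypothetical sketch: omega_i = omega_0 + step * min(kMax, k_i).
function selfWeight(k, omega0 = 0.60, step = 0.10, kMax = 3) {
  if (k < 0) throw new RangeError("experience counter k must be >= 0");
  return omega0 + step * Math.min(kMax, k);
}

// Asset-swap blend: x_new = |corr| * x_trained + (1 - |corr|) * x_anchor.
// |corr| = 1 keeps the trained value; |corr| = 0 resets to the anchor.
function blendTowardAnchor(trained, anchor, corr) {
  const a = Math.abs(corr);
  if (a > 1) throw new RangeError("corr must lie in [-1, 1]");
  return a * trained + (1 - a) * anchor;
}

// Ramp for k = 0..4 at the default anchor (values are approximate floats).
const ramp = [0, 1, 2, 3, 4].map((k) => selfWeight(k).toFixed(2));
console.log(ramp.join(" -> "));

// A three-round veteran after a swap with |corr| = 0.5 lands halfway back.
console.log(blendTowardAnchor(selfWeight(3), 0.60, -0.5).toFixed(2));
```

Taking the absolute value of the correlation means a strongly negative correlation preserves training just as a strongly positive one does, which is how the blend formula in the text reads.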

Session— / 10

Round1 / 4

Period1 / 10

Tick0

Price—

Fundamental—

Mispricing—

Volume · period0

Agents Pre-run draft · editable before the simulation starts

Note

Cash: experimental-currency balance held by agent i at tick t, used to finance bids and grown by realized sales plus end-of-period dividend receipts. The pre-run editable value is the initial endowment; in Dufwenberg, Lindqvist & Moore (2005) subjects were seeded with either 200¢ or 600¢, while this simulator draws each slot uniformly from [800, 1200]¢.
Shares: holding of the finite-life asset at tick t (the pre-run editable value is the initial endowment). Each held share pays a random dividend at the end of every trading period, drawn from {0, 10}¢ in this simulator (DLM 2005 uses {0, 20}¢), so the theoretical risk-neutral fundamental value at the start of period t is $\mathrm{FV}_t = \mathbb{E}[d]\cdot(T - t + 1)$. DLM endowment classes held 6 or 2 shares; this simulator draws from {2, 3, 4}.
Wealth: mark-to-fundamental total wealth, defined as cash + shares · $\mathrm{FV}_t$, or for Utility agents as cash + shares · subjective valuation (Lopez-Lira 2025). The Normalized Agent Utility plot is the agent's utility at current wealth divided by its utility at initial wealth.
P&L: running change in total wealth relative to the initial endowment, reported in experimental cents. Positive values render in green, losses in red. Aggregated across all agents, P&L equals the cumulative dividends paid so the market is zero-sum up to the dividend stream, as in the Smith–Suchanek–Williams design replicated by DLM.
Subj V: the Utility agent's private subjective valuation per share — the posterior $V_{i,t}^{\text{post}}$ from the active plan (Architecture Figure 3), updated each tick from the v3 §2 prior $V_{i,t}^{\text{prior}} = [\alpha_i\!\cdot\!\widetilde{\mathrm{FV}}_{i,t} + (1-\alpha_i)\!\cdot\!H_{i,t}](1 + b_i) + \varepsilon_i$ via the Plan I/II/III belief-revision protocol. Corresponds to the valuation field in Lopez-Lira's (2025) TradeDecisionSchema.
Report: the valuation the Utility agent broadcasts to peers in its messages. Under communication strategy $\sigma_m = D$ (deceptive), $\hat{v}_m \neq V_m$ via the distortion multiplier $\phi_m$ (see Architecture Figure 3, Plan I card); the lie-gap magnitude drives the trust EMA update $\tau_{r \to s} \leftarrow (1 - \lambda)\,\tau + \lambda \cdot \text{closeness}$ and the mean-lie-magnitude statistic in the Experiment Metrics table.
Last action: the most recent decision taken by agent i at tick t, displayed as a coloured tag on the card. In Plan I the agent selects $\alpha^\star_{i,t} = \arg\max_\alpha \mathrm{EU}(\alpha)$ over $\alpha_{i,t} \in \{\text{hold},\, \text{buy@}A_t,\, \text{sell@}B_t,\, \text{bid},\, \text{ask}\}$ scored under the risk-typed utility functional (see Architecture Figure 2).
Subtitle: for classic agents, the strategy class (Fundamentalist, Trend follower, Random ZI, Experienced) together with its set membership. For Utility agents, the risk preference and the agent's sampled CRRA coefficient $\rho_i$ in the universal form $U(w; \rho_i) = w^{1-\rho_i}/(1 - \rho_i)$: Risk-loving draws $\rho_i \in (-1, 0)$ (convex, upside-seeking), Risk-neutral pins $\rho_i = 0$ (linear expected value), and Risk-averse draws $\rho_i \in (0, 1)$ (concave, downside-sensitive).

Agent card fields: Cash · Shares · Wealth & P&L · Subj V · Report V (lie gap ringed red) · Normalized utility

Trade & Dividend Feed

Figure 1

Transaction Price Trajectory versus Risk-Neutral Fundamental Value

fundamental value at the start of period t for the active asset (swaps to match the per-session asset selector)

Tick-level transaction prices (accent line, one dot per executed trade) plotted against the active asset's fundamental-value path (amber dashes). Alternating vertical bands delimit the trading periods of each round. Under the Dufwenberg, Lindqvist & Moore (2005) linear-declining asset a rational market should track the step line exactly; persistent excursions above it are the bubble and the crash toward $P_t = \mathrm{FV}_t$ in the final period is the collapse. The other five assets (constant perpetuity, linear growth, cyclical, random walk, jump/crash) replace the staircase with their own FV path, and the formula above updates accordingly.

Note

$\mathrm{FV}_t$: fundamental value at the start of period t (value a rational, risk-neutral holder assigns to one share)
$\mathrm{FV}_{t+1}$: fundamental value one period ahead; path-based assets (random walk, jump/crash) specify $\mathrm{FV}_{t+1}$ as a function of $\mathrm{FV}_t$
$P_t$: observed transaction price at tick t, drawn as individual trade dots on the chart
$t$, $s$: period indices, 1 ≤ t, s ≤ T; s is the summation index over future periods
$T$: terminal period of the round (default T = 20 so that FV₁ = 100 for every asset)
$T - t + 1$: remaining periods through terminal, used by the linear-declining asset
$\mathbb{E}[d_t]$, $\mathbb{E}[d_s]$: expected dividend paid at the end of period t (or s), the per-share cash flow under the asset's public rule
$\mathbb{E}[d]$: mean dividend for the linear-declining asset, $\mathbb{E}[d] = \tfrac{1}{2}(0) + \tfrac{1}{2}(10) = 5$¢ (drawn uniformly from {0, 10¢})
discount rate: risk-free discount rate (default 0.05) used by the perpetuity and linear-growth assets
intercept, slope: coefficients of the linear-growth dividend schedule, expected dividend in period s = intercept + slope·s (defaults: intercept = 2, slope = 0.3)
innovation: i.i.d. Gaussian innovation driving the random-walk asset, with noise scale 5
jump: binary jump magnitude for the jump/crash asset: +2 with probability 0.9 (calm) and −30 with probability 0.1 (crash)
max(·, ·): floor operator that clips the path-based assets above a positive minimum so FV never becomes non-positive

Order Book

BIDS

PriceQtyAgent

ASKS

PriceQtyAgent

Figure 2

Signed Mispricing and Price-to-Fundamental Ratio

$P_t - \mathrm{FV}_t$ and $P_t / \mathrm{FV}_t$ · signed and relative mispricing

Signed departure of the observed price from the theoretical fundamental value, drawn on a symmetric axis around a dashed zero baseline: positive (premium) fills blue, negative (discount) fills red. In Lopez-Lira (2025) the same information is expressed as the price-to-fundamental ratio $P_t / \mathrm{FV}_t$: values above one mark an overvaluation regime, values below one mark an undervaluation regime, and a ratio ≈ 1 is consistent with rational pricing. The Experiment Metrics panel reports the normalized-deviation and amplitude statistics derived from this series (those metrics retain the absolute-value wrapper because they aggregate magnitudes).

Note

$P_t - \mathrm{FV}_t$: signed mispricing at tick t, sign preserves premium vs. discount
$P_t / \mathrm{FV}_t$: price-to-fundamental ratio (Lopez-Lira 2025)
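
The two quantities plotted in Figure 2 are simple transforms of a trade price and the period's fundamental value. A minimal sketch, assuming a hypothetical helper name `mispricing` and a guard against non-positive fundamentals that the page does not specify:

```javascript
// Hypothetical helper: signed mispricing and price-to-fundamental ratio.
function mispricing(price, fv) {
  if (!(fv > 0)) {
    // Assumption: the ratio is undefined when FV is not positive.
    throw new RangeError("fundamental value must be positive");
  }
  return {
    signed: price - fv, // > 0 premium, < 0 discount
    ratio: price / fv,  // > 1 overvaluation, < 1 undervaluation
  };
}

const premium = mispricing(120, 100);
const discount = mispricing(45, 50);
console.log(premium.signed, premium.ratio, discount.signed, discount.ratio);
```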

Figure 3

Trade Volume per Period

shares transacted in period t

Sum of share quantities exchanged within each trading period. High and persistent bars indicate active speculation; the classic Smith–Suchanek–Williams bubble is typically associated with a volume peak in the inflation phase followed by a cliff as the asset approaches expiry.

Note

$\sum_j q_j$ over trades $j$ in period t: total share volume traded in period t
$q_j$: order quantity of a single executed trade
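
Per-period volume is a grouped sum over executed trades. The sketch below assumes a hypothetical trade record shape `{ period, quantity }`; the simulator's actual trade objects may carry different field names.

```javascript
// Hypothetical sketch: sum executed quantities by trading period.
function volumeByPeriod(trades) {
  const totals = new Map();
  for (const { period, quantity } of trades) {
    totals.set(period, (totals.get(period) ?? 0) + quantity);
  }
  return totals;
}

const sampleTrades = [
  { period: 1, quantity: 2 },
  { period: 1, quantity: 1 },
  { period: 3, quantity: 4 },
];
const volume = volumeByPeriod(sampleTrades);
console.log([...volume.entries()]);
```

Periods with no trades are simply absent from the map, so a chart layer would need to fill them with zero before drawing bars.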

Figure 4

Transaction Density over Price × Period

two-dimensional trade histogram

Two-dimensional histogram of share quantity binned by transaction price (vertical axis) and trading period (horizontal axis). Warm cells concentrate the market's liquidity. Comparing the heat cloud against the downward-sloping fundamental staircase reveals whether the market is trading near rational value or persistently above it.

Note

bin value: cumulative share volume in the (price, period) bin
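
The heat cloud is a two-dimensional histogram keyed by period and price bin. A minimal sketch follows; the bin width of 10¢, the `period:priceBin` key format, and the trade record shape are all assumptions made for illustration.

```javascript
// Hypothetical sketch: accumulate share quantity per (period, price-bin) cell.
function tradeDensity(trades, binWidth = 10) {
  if (!(binWidth > 0)) throw new RangeError("binWidth must be positive");
  const cells = new Map();
  for (const { period, price, quantity } of trades) {
    // Lower edge of the price bin that contains this trade.
    const priceBin = Math.floor(price / binWidth) * binWidth;
    const key = `${period}:${priceBin}`;
    cells.set(key, (cells.get(key) ?? 0) + quantity);
  }
  return cells;
}

const density = tradeDensity([
  { period: 1, price: 104, quantity: 2 },
  { period: 1, price: 109, quantity: 1 },
  { period: 2, price: 95, quantity: 3 },
]);
console.log([...density.entries()]);
```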

Figure 5

Agent Action Timeline

per-tick agent decision

buy@A_t bid sell@B_t ask hold executed

One row per agent, one mark per decision. Column colour encodes the five-element action set $\alpha \in \{\text{hold},\,\text{buy@}A_t,\,\text{sell@}B_t,\,\text{bid},\,\text{ask}\}$ used inside $\mathrm{EU}(\alpha)$: the two book-crossing actions (buy@A_t, sell@B_t) render solid, and the two passive posts (bid, ask) render in the softer dashed variant. A small accent dot below the mark records whether the submitted order was filled on the same tick.

Note

$\alpha_{i,t}$: action taken by agent i at tick t

Figure 6

Subjective Valuation: True versus Reported

$|\hat{v}_{i,t} - V_{i,t}|$ · lie gap = private belief versus broadcast claim

Solid lines trace each Utility agent's private belief $V_{i,t}$ over time. Filled dots mark broadcast messages carrying a reported valuation $\hat{v}_{i,t}$; deceptive reports are ringed red and connected to the sender's true belief by a dotted segment, so the vertical distance between ring and line is the lie gap. The amber step line is the fundamental value $\mathrm{FV}_t$ for reference.

Note

$V_{i,t}$: agent i's private (true) subjective valuation at tick t
$\hat{v}_{i,t}$: valuation reported in a broadcast message
$|\hat{v}_{i,t} - V_{i,t}|$: lie gap for deceptive messages
hover: header Rr_Ss · Pp · t=N names round · session · period · tick; rows summarise the per-tick cross-agent distribution as percentiles P10 / P25 / median / P75 / P90, with FV the fundamental baseline and N the number of agents in that bucket

Figure 7

Normalized Agent Utility over Time

risk-adjusted wealth, normalized to initial endowment

Per-agent expected utility evaluated at the running wealth $w_{i,t}$ (cash + shares · valuation), divided by the agent's own initial utility so every trajectory starts at 1.0. Lines above the dashed baseline indicate positive risk-adjusted PnL; lines below indicate loss. The risk preference attached to each agent (convex, linear, concave) determines how aggressively a given wealth change is penalised or rewarded.

Note

$U(w; \rho_i) = w^{1-\rho_i}/(1 - \rho_i)$: universal CRRA utility with per-agent $\rho_i$ sampled uniformly in (−1, 0) loving · {0} neutral · (0, 1) averse
$w_{i,t}$: mark-to-fundamental wealth at tick t
hover: header Rr_Ss · Pp · t=N names round · session · period · tick; rows summarise the per-tick cross-agent distribution as percentiles P10 / P25 / median / P75 / P90, with FV the fundamental baseline and N the number of agents in that bucket
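
The normalization behind Figure 7 can be illustrated with the CRRA form quoted on this page. This is a sketch under stated assumptions: the function names are hypothetical, wealth is assumed strictly positive, and ρ is assumed to stay inside the sampled range (−1, 1) so the denominator never vanishes.

```javascript
// CRRA utility as quoted in the text: U(w; rho) = w^(1 - rho) / (1 - rho).
function crraUtility(w, rho) {
  if (!(w > 0)) throw new RangeError("wealth must be positive");
  if (rho >= 1) throw new RangeError("rho must be below 1 for this form");
  return Math.pow(w, 1 - rho) / (1 - rho);
}

// Hypothetical helper: utility at current wealth over utility at initial wealth.
function normalizedUtility(w, w0, rho) {
  return crraUtility(w, rho) / crraUtility(w0, rho);
}

// The same 10% wealth gain is rewarded differently by risk type.
const loving = normalizedUtility(1100, 1000, -0.5);
const neutral = normalizedUtility(1100, 1000, 0);
const averse = normalizedUtility(1100, 1000, 0.5);
console.log(loving.toFixed(4), neutral.toFixed(4), averse.toFixed(4));
```

For a gain, the risk-loving trajectory rises furthest above 1.0 and the risk-averse one least, which matches the convex, linear, and concave shapes described for the three types.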

Figure 8

Asset Ownership over Time

shares held; total supply conserved

Stacked area of each agent's inventory across ticks. Because the double auction conserves shares, the total height is always the aggregate endowment $Q$. Widening bands identify agents who are accumulating, shrinking bands identify distributors, and any dramatic redistribution in the last few periods is typically the experienced trader liquidating before the asset expires worthless.

Note

$q_{i,t}$: shares held by agent i at tick t
$Q$: total shares outstanding (conserved across time)

Figure 9

Broadcast Message Log

per-tick public broadcast to all other agents

Buy signal Sell signal Hold signal Deceptive

One dot per broadcast message, placed on the sender's row at the tick the message was sent. Dot colour encodes the signal (buy/sell/hold) and a red ring flags messages whose reported valuation diverges sufficiently from the sender's private belief to be classified as deceptive by the logger. Reading a column shows the instantaneous rumour mill; reading a row shows each agent's rhetorical stance over time.

Note

message: broadcast from agent i to the population at tick t

Figure 10

Pairwise Trust Matrix

exponential-moving-average update

Heatmap of receiver-to-sender trust values in [0, 1]. The diagonal is masked. Each off-diagonal cell records how well sender s's recent valuation claims aligned with the period's volume-weighted average price, as seen by receiver r. Warm rows identify agents who tend to trust broadly; warm columns identify agents whose claims the population finds credible.

Note

$\tau_{r \to s}$: trust held by receiver r in sender s
$\lambda$: trust learning rate (exponential-moving-average weight)
closeness: 1 − |claim − VWAP| / VWAP, clipped to [0, 1]
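
The trust update described for Figure 10 is a one-line exponential moving average. The sketch below uses hypothetical function names and is not the TrustTracker code; only λ = 0.30 and the closeness formula come from the page.

```javascript
// closeness = max(0, 1 - |claim - VWAP| / VWAP); never exceeds 1.
function closeness(claim, vwap) {
  if (!(vwap > 0)) throw new RangeError("VWAP must be positive");
  return Math.max(0, 1 - Math.abs(claim - vwap) / vwap);
}

// tau <- (1 - lambda) * tau + lambda * closeness.
function updateTrust(trust, claim, vwap, lambda = 0.30) {
  return (1 - lambda) * trust + lambda * closeness(claim, vwap);
}

// A sender whose claim matches VWAP gains trust; a wildly wrong one loses it.
const afterAccurate = updateTrust(0.5, 100, 100);
const afterWild = updateTrust(0.5, 300, 100);
console.log(afterAccurate.toFixed(2), afterWild.toFixed(2));
```

Because each observation enters with weight 0.30, a single bad claim cannot drive trust to zero; it decays geometrically over repeated misses.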

Figure 11

Per-Agent Profit & Loss over Time

$\mathrm{P\&L}_{i,t} = w_{i,t} - w_{i,0}$ · raw cash-equivalent change relative to initial endowment

One line per agent showing running P&L in experimental cents: the agent's mark-to-market wealth $w_{i,t}$ (cash + inventory · subjective valuation) minus its initial wealth $w_{i,0}$. Lines above the dashed zero-baseline indicate gains; lines below indicate losses. Unlike Figure 7's risk-adjusted utility, this chart is in raw monetary units so the per-agent coefficient $\rho_i$ does not enter; every agent is graded on the same cash scale.

Note

$\mathrm{P\&L}_{i,t}$: running profit and loss in cents at tick t
$w_{i,t}$: mark-to-market wealth = cash + inventory · subjective valuation
$w_{i,0}$: initial wealth at the start of the agent's first round
hover — header: Rr_Ss · Pp · t=N names, left to right, the round (Rr, 1 … roundsPerSession), the session in the batch (Ss, 1 … 10), the trading period inside that round (Pp, 1 … T), and the global tick index t=N the cursor is snapped to
hover, rows: summarise the distribution of per-agent P&L across all agents at that tick (a cross-section, not a time series) as five percentiles of $\mathrm{P\&L}_{i,t} = w_{i,t} - w_{i,0}$
P90: 90th percentile: 10% of agents are doing better than this number, 90% worse (the top performers)
P75: upper-quartile boundary of the fan chart's shaded IQR band; the top quarter of agents lie above
median: 50th percentile, the line drawn through the centre of the fan; half the population is above, half below
P25: lower-quartile boundary of the IQR band; the bottom quarter of agents lie below
P10: 10th percentile: 90% of agents are doing better than this, 10% worse (the bottom performers)
N: number of agents contributing a P&L sample at this tick; drops when an agent is replaced at the round-3/4 boundary under T20/T40 and the fresh clone has not yet accumulated a data point
value format: signed cents, e.g. +12.4¢ (above initial endowment) or −7.1¢ (below); the dashed horizontal line at 0¢ on the chart is the break-even reference — anything above is net gain, anything below is net loss for that percentile
fan vs. lines: with N > 60 agents the chart renders as a percentile fan (shaded P10–P90 envelope, darker IQR band, solid median) so the hover is the only way to read exact numbers; with N ≤ 60 each agent is drawn as its own line and the same tooltip still reports the cross-sectional percentiles of those lines at the hovered tick
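
The tooltip rows above are cross-sectional percentiles of per-agent P&L at one tick. The page does not say which percentile estimator the chart uses, so the sketch below assumes linear interpolation between order statistics; the function names are hypothetical.

```javascript
// Assumption: linear interpolation between order statistics.
function percentile(sortedValues, p) {
  const n = sortedValues.length;
  if (n === 0) return NaN;
  const rank = (p / 100) * (n - 1);
  const lo = Math.floor(rank);
  const hi = Math.ceil(rank);
  return sortedValues[lo] + (rank - lo) * (sortedValues[hi] - sortedValues[lo]);
}

// Hypothetical helper: summarise one tick's per-agent P&L values.
function pnlCrossSection(pnls) {
  const sorted = [...pnls].sort((a, b) => a - b); // numeric sort, input untouched
  return {
    P10: percentile(sorted, 10),
    P25: percentile(sorted, 25),
    median: percentile(sorted, 50),
    P75: percentile(sorted, 75),
    P90: percentile(sorted, 90),
    N: sorted.length,
  };
}

const summary = pnlCrossSection([40, 0, 20, 10, 30]);
console.log(summary);
```

The explicit comparator matters: JavaScript's default `sort` compares as strings, which would misorder negative or multi-digit P&L values.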

Figure 12

Per-Agent Subjective Valuation over Time

posterior valuation per share, evaluated each tick

One line per agent showing the private subjective valuation $V_{i,t}$ the agent assigns to one share at tick t. Each line starts from the agent's prior, then drifts as the active plan's belief-update protocol blends in peer messages, regulator alerts, and (when complex dividends are on) the agent's own dividend sample. The amber dashed line is the risk-neutral fundamental $\mathrm{FV}_t$, included as a reference saw-tooth so over- and under-pricing are immediate to read off.

Note

$V_{i,t}$: subjective valuation of agent i at tick t
$\mathrm{FV}_t$: risk-neutral fundamental value at tick t
hover: header Rr_Ss · Pp · t=N names round · session · period · tick; rows summarise the per-tick cross-agent distribution as percentiles P10 / P25 / median / P75 / P90, with FV the fundamental baseline and N the number of agents in that bucket

Table 1

Market-Quality Statistics (Current Session)

Quantitative summary in the notation of Dufwenberg, Lindqvist & Moore (2005) and Lopez-Lira (2025). Haessel R² measures fit of the per-period mean price to fundamental value; the two normalized deviations capture total and average mispricing per share outstanding; amplitude is the peak-to-trough excursion of the mean-price residual normalized by the initial fundamental; turnover is the total shares traded divided by shares outstanding. The lower group reports allocative efficiency, aggregate welfare, and the deception statistics unique to the Utility population.

Table 2

10-Session Batch Results

Per-round market-quality metrics across the 10-session DLM batch (5 × first treatment + 5 × second treatment). Each row is labelled Rr_Ss (Round r of Session s). dev = mean absolute deviation |P − FV| in ¢; turn = shares traded / shares outstanding; vol = total shares exchanged; payoff = aggregate agent cash at round end.
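
The dev, turn, and vol columns can be sketched as a single pass over one round's trades. This is an illustration only: the trade record shape `{ price, fv, quantity }` is hypothetical, and taking dev as an unweighted mean over trades is an assumption, since the caption does not say whether the mean is trade-weighted or volume-weighted.

```javascript
// Hypothetical sketch of three Table 2 columns for one round.
function roundMetrics(trades, sharesOutstanding) {
  if (!(sharesOutstanding > 0)) {
    throw new RangeError("sharesOutstanding must be positive");
  }
  let vol = 0;
  let absDevSum = 0;
  for (const { price, fv, quantity } of trades) {
    vol += quantity;
    absDevSum += Math.abs(price - fv);
  }
  return {
    // Assumption: unweighted mean of |P - FV| across trades, in cents.
    dev: trades.length > 0 ? absDevSum / trades.length : NaN,
    vol,
    turn: vol / sharesOutstanding,
  };
}

const metrics = roundMetrics(
  [
    { price: 110, fv: 100, quantity: 2 },
    { price: 90, fv: 95, quantity: 1 },
  ],
  30
);
console.log(metrics);
```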

Replay & Trace Inspector

Live — tick 0

Term	Expansion	Meaning
Plan I	Algorithmic posterior	Deterministic baseline — $V_{i,t}^{\text{post}} = \omega_i\cdot V_{i,t}^{\text{prior}} + (1-\omega_i)\cdot\bar{m}_t$ with $\omega_i = 0.6 + 0.1\,\min(3, k_i)$ (the v3 §3 self (non-peer) weight, shared across all three plans).
Plan II	LLM posterior · utility form	One chat completion per Utility agent per period. LLM returns a discrete action from {BUY_NOW, SELL_NOW, BID, ASK_1, HOLD} together with a self-drawn $x \in [1\%, 10\%]$; BID posts $\text{best\_bid} \cdot x$ and ASK_1 posts $\text{best\_ask} / x$. Prompt includes the closed-form universal CRRA $U(w; \rho_i) = w^{1-\rho_i}/(1 - \rho_i)$ with the agent's actual sampled $\rho_i$ substituted.
Plan III	LLM posterior · risk label only	Same wiring as Plan II but the prompt only names the risk-preference category; no functional form is supplied. Same five-action output set.
DLM	Dufwenberg, Lindqvist & Moore (2005)	Source paper for the shared market substrate: $T$, $\mathbb{E}[d]$, $\mathrm{FV}_t$, and the four-round session loop.
U	Utility	EU-maximising agent — the sole agent class ($N = 100$). Per-period belief update is what Plans I, II, and III compare.
FV	Fundamental value	$\mathrm{FV}_t = \mathbb{E}[d] \cdot (T - t + 1)$ — risk-neutral value at the start of period $t$.
EU	Expected utility	$\mathrm{EU}(\alpha) = p_{\text{fill}} \cdot U(w_1) + (1-p_{\text{fill}})\cdot U(w_0)$ — the Utility agent's scoring functional over $\alpha \in \{\text{hold},\, \text{buy@}A_t,\, \text{sell@}B_t,\, \text{bid},\, \text{ask}\}$.
VWAP	Volume-weighted average price	Per-period average trade price weighted by quantity; baseline for the trust EMA update.
ND	Normalized deviation	Total absolute mispricing: $\mathrm{ND} = \sum_j \lvert p_j - \mathrm{FV}_{t(j)}\rvert \cdot q_j \,/\, Q$, where $j$ indexes trades, $q_j$ is trade quantity, and $Q$ is total shares outstanding.
R²	Haessel R²	Coefficient of determination of mean price against fundamental value.
TO	Turnover	Total shares traded divided by total shares outstanding — reports speculative intensity.
AE	Allocative efficiency	Realized aggregate valuation divided by the theoretical maximum: $\mathrm{AE} = \sum_i \hat{V}_i q_i \,/\, (\hat{V}_{\max} \cdot Q)$, where $\hat{V}_i = V_i^{\text{post}}$.
Session	10-session DLM batch	One click of Start runs 10 sessions (5 × first treatment + 5 × second treatment). Each session is a complete $R = 4$ round game; data is collected per round with labels $\texttt{R\{r\}\_S\{s\}}$.
Rr_Ss	Round–session label	Identifies Round $r$ of Session $s$ in the batch results table. Example: R3_S7 = round 3 of session 7.
T20 / T40	Treatment sizes (N = 100)	T20 (R4-⅔): 20 agents replaced in R4, 80 veterans remain. T40 (R4-⅓): 40 replaced, 60 veterans remain. First 5 sessions use the selected treatment, last 5 use the other.
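
The EU row above can be made concrete with a small scoring sketch. Everything beyond the formula $\mathrm{EU}(\alpha) = p_{\text{fill}} \cdot U(w_1) + (1-p_{\text{fill}})\cdot U(w_0)$ and the constant $p_{\text{fill}} = 0.30$ is an assumption: the candidate record shape, the function names, and the choice to give crossing actions and hold a fill probability of 1.

```javascript
// CRRA utility as quoted in the glossary; rho assumed in (-1, 1), w > 0.
function crra(w, rho) {
  if (!(w > 0)) throw new RangeError("wealth must be positive");
  return Math.pow(w, 1 - rho) / (1 - rho);
}

// EU = pFill * U(w1) + (1 - pFill) * U(w0).
function expectedUtility(w0, w1, rho, pFill = 0.30) {
  return pFill * crra(w1, rho) + (1 - pFill) * crra(w0, rho);
}

// Hypothetical argmax over candidate actions { name, w1, pFill }.
function bestAction(candidates, w0, rho) {
  let best = null;
  for (const c of candidates) {
    const eu = expectedUtility(w0, c.w1, rho, c.pFill);
    if (best === null || eu > best.eu) best = { name: c.name, eu };
  }
  return best;
}

const choice = bestAction(
  [
    { name: "hold", w1: 1000, pFill: 1 },    // nothing changes
    { name: "buy@A_t", w1: 1020, pFill: 1 }, // assumed to fill for certain
    { name: "bid", w1: 1100, pFill: 0.30 },  // passive quote
  ],
  1000,
  0 // risk-neutral agent
);
console.log(choice.name, choice.eu.toFixed(1));
```

With these illustrative numbers the risk-neutral agent prefers the passive bid, because 30% of a 100¢ gain exceeds a certain 20¢ gain.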

Symbol	Definition	Where it appears
$\mathrm{FV}_t$	Fundamental value at the start of period $t$. $\mathrm{FV}_t = \mathbb{E}[d]\cdot(T - t + 1)$, with $\mathbb{E}[d] = \tfrac{1}{2}(0) + \tfrac{1}{2}(10) = 5$¢ and $T = 20$. Yields a staircase from $\mathrm{FV}_1 = 100$¢ to $\mathrm{FV}_{20} = 5$¢, resetting at every round boundary.	Shared substrate — drives every agent's prior (Figures 1–4)
$V_{i,t}^{\text{prior}}$	Agent $i$'s pre-blend valuation at period $t$ — v3 §2 decomposition: $V_{i,t}^{\text{prior}} = \max\!\bigl(0,\,[\alpha_i\!\cdot\!\widetilde{\mathrm{FV}}_{i,t} + (1-\alpha_i)\!\cdot\!H_{i,t}](1 + b_i) + \varepsilon_i\bigr)$. Identical across all three plans; only the posterior update (Step 3) differs.	Prior Formation stage (Figures 1–2)
$\widetilde{\mathrm{FV}}_{i,t}$	Agent $i$'s model-based fundamental value at period $t$ — the asset-specific closed form from v5 §5.{4,10,16,22,28,34}. For the Linear-Declining (DLM) asset this is $\widetilde{\mathrm{FV}}_{i,t} = 5\!\cdot\!(T - t + 1)$, so a rational $\alpha_i = 1$ trader recovers the public $\mathrm{FV}_t$ exactly; every other asset reads its per-asset form off the Figure-2 card.	v3 §2 prior — model-based term (Figure 2)
$H_{i,t}$	Four-term heuristic mix (v3 §4): $H_{i,t} = \beta_1\!\cdot\!\text{Anchor} + \beta_2\!\cdot\!\text{Trend} + \beta_3\!\cdot\!\text{DividendSignal} + \beta_4\!\cdot\!\text{Narrative}$ with default weights $(\beta_1,\beta_2,\beta_3,\beta_4) = (0.50, 0.20, 0.20, 0.10)$ from §6.2 — live-tunable via the green β-row in Advanced settings; the Σβ tile at the end of the row flips amber when the weights no longer sum to 1. Trend drops out at $t = 1$ (no prior period to difference against) per §6.3. The heuristic value enters the prior with weight $1 - \alpha_i$, so novices lean on $H$ while veterans anchor to $\widetilde{\mathrm{FV}}$.	v3 §2 prior — heuristic term (Figure 2)
$\alpha_i$	Per-agent fundamental weight (v3 §2, $\alpha_i \in [0, 1]$): $\alpha_i$ represents the agent's fundamental weight, i.e., the weight placed on the model-based valuation $\widetilde{\mathrm{FV}}_{i,t}$ in the prior, with the complement $1 - \alpha_i$ going to the heuristic reading $H_{i,t}$. Experience raises it via the §3.2 / §6.1 rule $\alpha_i = \min\{1,\, 0.4 + 0.15\, k_i\}$, so $\alpha_0 = 0.40$ is the novice intercept and $\gamma_\alpha = 0.15$ is the per-round slope; both are tunable via the pink experience row in Advanced settings. Saturates at $1.00$ once $k_i \geq 4$. Blended toward $\alpha_0$ by $(1 - \lvert\mathrm{corr}\rvert)$ when the asset swaps post-replacement. Distinct from the paper's $\text{Anchor}$ term (first primitive of $H_{i,t}$). Rendered on agent cards as "Fundamental weight".	v3 §2 prior — $\widetilde{\mathrm{FV}}$/$H$ mixing weight
$\sigma_i$	Per-agent valuation-noise scale (v3 §3): $\sigma_i = \sigma_0\!\cdot\!\exp(-\gamma_\sigma\!\cdot\!k_i)$ with $\sigma_0 = 15$¢ and decay rate $\gamma_\sigma = 0.30$ (both tunable via the pink experience row in Advanced settings). Sets the standard deviation of the Gaussian $\varepsilon_i \sim \mathcal{N}(0, \sigma_i^2)$ added to the prior. Novices ($k_i = 0$) have $\sigma_0 = 15$¢; a three-round veteran has $\sigma_3 \approx 6.1$¢. Blended toward $\sigma_0$ post-asset-swap. Rendered on agent cards as "Valuation noise".	v3 §2 prior — Gaussian jitter scale
$\omega_i$	Per-agent self (non-peer) weight (v3 §3): $\omega_i = \omega_0 + \Delta_\omega\!\cdot\!\min(k_\omega, k_i)$ with $\omega_0 = 0.60$ (tunable via the pink experience row in Advanced settings), $\Delta_\omega = 0.10$, saturation horizon $k_\omega = 3$. So $\omega_i \in \{0.60, 0.70, 0.80, 0.90\}$ for $k_i \in \{0, 1, 2, \geq 3\}$ at the default anchor. Controls the Step-3 peer blend $V^{\text{post}} = \omega_i\!\cdot\!V^{\text{prior}} + (1-\omega_i)\!\cdot\!\bar{m}$ — $\omega_i$ is the weight on the agent's own prior, $1 - \omega_i$ is the weight on the peer-message mean. Blended toward $\omega_0$ post-asset-swap. Rendered on agent cards as "Self (non-peer) weight".	Plan I posterior; Plans II/III Step-3 blend
$\varepsilon_i \sim \mathcal{N}(0, \sigma_i^2)$	Per-tick Gaussian valuation noise drawn via Box–Muller over the seeded PRNG. The per-agent $\sigma_i$ shrinks exponentially in $k_i$, so novices hold noisy priors and veterans' priors sharpen. Gated by Advanced → Prior Noise; when the toggle is OFF, $\varepsilon_i = 0$.	v3 §2 prior: noise term (Figure 2)
$b_i = \delta_i \cdot \beta$	Persistent per-agent valuation bias. $\delta_i \in \{-1, 0, +1\}$ is the bias direction drawn at birth (pessimistic, unbiased, optimistic) and $\beta = 0.15$ is the bias magnitude. Applied multiplicatively on the $\alpha$-weighted $\widetilde{\mathrm{FV}}/H$ blend inside the v3 §2 prior. Gated by Advanced → Prior Bias.	Prior formation (Figure 2)
$k_i$	Agent $i$'s experience counter (v3 §3). Starts at $0$; incremented by $1$ at every round boundary for every surviving agent. Drives the triple $(\alpha_i, \sigma_i, \omega_i)$ — so $k_i$ controls fundamental weight, noise amplitude, and self (non-peer) weight simultaneously. Fresh R4 replacements restart at $k_i = 0$.	Experience-indexed modelling parameters (Figures 2–3)
$\|\mathrm{corr}\|$	Asset-swap experience-transfer weight. When a session pairs different pre- and post-assets, $\mathrm{corr}$ is the Pearson correlation between the two assets' expected $\mathrm{FV}$ paths (sampled from a single seeded pre-round simulation; flat-path pairings coerce to $0$). From round 4 onward the experienced triple is blended toward the novice anchors as $x_{\text{new}} = \|\mathrm{corr}\|\!\cdot\!x_{\text{trained}} + (1 - \|\mathrm{corr}\|)\!\cdot\!x_0$ for $x \in \{\alpha, \sigma, \omega\}$. $\|\mathrm{corr}\| = 1$ preserves training; $\|\mathrm{corr}\| = 0$ resets to novice anchors.	Session-level asset-swap experience blend
$\hat{v}_m$	Claimed valuation reported by peer agent $m$. Computed as $\hat{v}_m = \max(0,\, V_m \cdot \phi_m)$ where $\phi_m$ is a distortion multiplier determined by $m$'s communication strategy $\sigma_m \in \{H, B, D\}$ (see Figure 3). The peer-message mean is $\bar{m} = \tfrac{1}{\|M\|}\sum_{m \in M} \hat{v}_m$ where $M$ is the set of non-self messages received this period.	Plan I posterior: blended with the prior via weight $1 - \omega_i$ (Figure 3)
$\sigma_m \in \{H, B, D\}$	Communication strategy of agent $m$: $H$ = truthful (small uniform jitter), $B$ = biased (fixed-sign tilt), $D$ = strategic (inventory-dependent over/understatement). Assigned at birth and persistent across rounds.	Distortion multiplier $\phi_m$ in $\hat{v}_m$ (Figure 3)
$\phi_m$	Communication distortion multiplier. $\phi_m = 1 + \mathcal{U}[-h, h]$ if $\sigma_m = H$; $\phi_m = 1 + \delta_m \gamma$ if $\sigma_m = B$; $\phi_m = \kappa^+$ or $\kappa^-$ if $\sigma_m = D$ (depending on $q_m$ vs $q_m^0$), with a $1 + \mathcal{U}[-\gamma, \gamma]$ fallback at $q_m = q_m^0$. Parameters: $h = 0.01$, $\gamma = 0.10$, $\kappa^+ = 1.18$, $\kappa^- = 0.82$.	$\hat{v}_m = \max(0,\, V_m \cdot \phi_m)$ (Figure 3)
$V_{i,t}^{\text{post}}$	Agent $i$'s period-end valuation — output of the v3 §3 Step-3 peer blend: $V_{i,t}^{\text{post}} = \omega_i\!\cdot\!V_{i,t}^{\text{prior}} + (1 - \omega_i)\!\cdot\!\bar{m}_t$, or $V_{i,t}^{\text{prior}}$ when no foreign messages arrived this period. Plan I computes this blend directly; Plans II/III cache an LLM-delivered posterior that short-circuits the blend when available and falls back to the same formula otherwise.	Becomes next period's prior in all three plans (Figure 3)
$U(w; \rho_i)$	Universal CRRA utility shared by every agent: $U(w;\rho) = w^{1-\rho}/(1-\rho)$, evaluated in normalized form $(w/w_0)^{1-\rho}$ so $U(w_0) = 1$. The per-agent coefficient $\rho_i$ is drawn uniformly from $(-1, 0)$ (risk-loving, strictly convex), pinned at $0$ (risk-neutral, linear), or drawn uniformly from $(0, 1)$ (risk-averse, strictly concave).	EU scoring; the substituted $\rho_i$ appears explicitly in Plan II prompts (Figures 2–3)
$w_0, w_1$	Wealth states for EU evaluation. $w_0 = c_i + q_i \cdot \hat{V}_i$ (wealth if no trade); $w_1 = (c_i \mp p_{\text{order}}) + (q_i \pm 1) \cdot \hat{V}_i$ (wealth if the order fills at price $p_{\text{order}}$; upper signs for a buy, which pays cash and gains a share, lower signs for a sell), where $c_i$ is cash, $q_i$ is inventory, and $\hat{V}_i \equiv V_i^{\text{post}}$ is the agent's subjective valuation.	EU scoring: $\mathrm{EU}(\alpha) = p_{\text{fill}} \cdot U(w_1) + (1 - p_{\text{fill}}) \cdot U(w_0)$ (Figure 2)
$p_{\text{fill}} = 0.30$	Assumed fill probability for a non-crossing (passive) quote. Used in the EU functional: $\mathrm{EU}(\alpha) = p_{\text{fill}} \cdot U(w_1) + (1 - p_{\text{fill}}) \cdot U(w_0)$. For crossing actions (buy@$A_t$, sell@$B_t$), $p_{\text{fill}} = 1$ (deterministic); for passive actions (bid, ask), $p_{\text{fill}} = 0.30$ (tunable).	EU scoring — $\alpha^\star_{i,t}$ action evaluation (Figure 2)
$\alpha^\star_{i,t}$	Optimal action for agent $i$ at tick $t$. $\alpha^\star_{i,t} = \arg\max_\alpha \mathrm{EU}(\alpha)$ over the five-element set $\alpha \in \{\text{hold},\, \text{buy@}A_t,\, \text{sell@}B_t,\, \text{bid},\, \text{ask}\}$, where $A_t$ is the current best ask and $B_t$ is the current best bid. buy@$A_t$ crosses the book by lifting the resting ask (deterministic fill, $p_{\text{fill}} = 1$); sell@$B_t$ crosses by hitting the resting bid (deterministic fill, $p_{\text{fill}} = 1$); bid and ask post passive quotes ($p_{\text{fill}} = 0.30$). Plans II/III use a seven-element LLM action set: $\{\text{BUY\_NOW, SELL\_NOW, BID\_1, BID\_3, ASK\_1, ASK\_3, HOLD}\}$.	Action selection: output of EU maximization (Figures 2–3)
$\tau_{r \to s}$	Trust of receiver $r$ in sender $s$. Updated by exponential moving average: $\tau_{r \to s} \leftarrow (1 - \lambda)\,\tau_{r \to s} + \lambda \cdot \text{closeness}_{r,s}$, where $\lambda = 0.30$ is the EMA learning rate and $\text{closeness} = \max\!\bigl(0,\, 1 - \|\hat{v}_s - \text{VWAP}_t\|\,/\,\text{VWAP}_t\bigr)$. Initialized at $0.5$; self-trust fixed at $1.0$.	Messaging diagnostic; context for Plan II/III prompts (Figure 3)
$\pi_i^{\text{II}}, \pi_i^{\text{III}}$	Structured LLM prompts for Plans II and III. $\pi^{\text{II}}$ includes market rules, agent state, and the explicit universal CRRA formula $U(w; \rho_i) = w^{1-\rho_i}/(1 - \rho_i)$ with the agent's sampled $\rho_i$. $\pi^{\text{III}}$ omits the formula and supplies only the risk-preference label.	LLM posterior — input to $\alpha^\star_{i,t} \leftarrow \text{LLM}(\pi_i)$ (Figure 3)
$Q$	Total shares outstanding, $Q = \sum_i q_i$, conserved under double-auction trades (shares transfer, never created or destroyed).	Normalized deviation $\mathrm{ND}$, turnover $\mathrm{TO}$ (Figure 4)
$\bar{p}_t$	Mean trade price in global period $t$. $\bar{p}_t = \sum_{j \in \mathcal{T}_t} p_j \,/\, \|\mathcal{T}_t\|$ where $\mathcal{T}_t$ is the set of trades in period $t$. Used as the basis for Haessel $R^2$ and amplitude.	Market-quality diagnostics (Figure 4, Table 1)
$R^2_{\text{Haessel}}$	Haessel (1978) coefficient of determination. $R^2 = 1 - \sum_t (\bar{p}_t - \mathrm{FV}_t)^2 \,/\, \sum_t (\bar{p}_t - \overline{\bar{p}})^2$, where $\overline{\bar{p}}$ is the mean of $\bar{p}_t$ across periods. Measures how closely per-period mean prices track the fundamental staircase; it is negative whenever the summed squared mispricing exceeds the total sum of squares of $\bar{p}_t$ about that mean.	Market-quality diagnostics (Figure 4, Table 1)
$\mathrm{ND}$	Normalized absolute price deviation. $\mathrm{ND} = \sum_j \|p_j - \mathrm{FV}_{t(j)}\| \cdot q_j \,/\, Q$, summing over all trades $j$ weighted by quantity, divided by total shares outstanding.	Market-quality diagnostics (Figure 4, Table 1)
$A$	Price amplitude. $A = \bigl(\max_t (\bar{p}_t - \mathrm{FV}_t) - \min_t (\bar{p}_t - \mathrm{FV}_t)\bigr) \,/\, \mathrm{FV}_1$. Peak-to-trough excursion of the mean-price residual, normalized by the initial fundamental.	Market-quality diagnostics (Figure 4, Table 1)
$\mathrm{TO}$	Turnover. $\mathrm{TO} = \sum_j q_j \,/\, Q$ — total shares traded (summing quantity $q_j$ over all trades $j$) divided by total shares outstanding $Q$. A value of $1.0$ means every share changed hands once.	Market-quality diagnostics (Figure 4, Table 1)
$\rho_t$	Price-to-fundamental ratio. $\rho_t = p_t \,/\, \mathrm{FV}_t$ (Lopez-Lira 2025), where $p_t$ is the most recent trade price at tick $t$. Values $> 1$ indicate overpricing; persistent $\rho_t \gg 1$ signals a bubble.	Market-quality diagnostics (Table 1)
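
The market-quality rows above ($\bar{p}_t$, $R^2_{\text{Haessel}}$, $\mathrm{ND}$, $A$, $\mathrm{TO}$) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the simulator's own code: the `Trade` record, the function names, and the toy three-period numbers are all hypothetical, and $\mathrm{FV}_1$ is taken to be the fundamental value of the earliest period in the `fv` mapping.

```python
from dataclasses import dataclass


@dataclass
class Trade:
    period: int    # global period t(j) in which the trade occurred
    price: float   # trade price p_j
    qty: int       # traded quantity q_j


def mean_prices(trades):
    """Per-period mean trade price (unweighted over the trades in a period)."""
    by_period = {}
    for tr in trades:
        by_period.setdefault(tr.period, []).append(tr.price)
    return {t: sum(ps) / len(ps) for t, ps in by_period.items()}


def haessel_r2(pbar, fv):
    """1 - SSE/SST; raises ZeroDivisionError if all mean prices coincide."""
    periods = sorted(pbar)
    grand_mean = sum(pbar[t] for t in periods) / len(periods)
    sse = sum((pbar[t] - fv[t]) ** 2 for t in periods)
    sst = sum((pbar[t] - grand_mean) ** 2 for t in periods)
    return 1.0 - sse / sst


def normalized_deviation(trades, fv, total_shares):
    """Quantity-weighted absolute mispricing per share outstanding."""
    return sum(abs(tr.price - fv[tr.period]) * tr.qty for tr in trades) / total_shares


def amplitude(pbar, fv):
    """Peak-to-trough range of the mean-price residual, scaled by FV_1."""
    resid = [pbar[t] - fv[t] for t in sorted(pbar)]
    return (max(resid) - min(resid)) / fv[min(fv)]


def turnover(trades, total_shares):
    """Shares traded divided by shares outstanding."""
    return sum(tr.qty for tr in trades) / total_shares


# Toy data (illustrative only, not taken from any experiment).
fv = {1: 100.0, 2: 90.0, 3: 80.0}
trades = [Trade(1, 120.0, 2), Trade(2, 100.0, 1), Trade(2, 80.0, 1), Trade(3, 60.0, 1)]
Q = 10

pbar = mean_prices(trades)
r2 = haessel_r2(pbar, fv)
nd = normalized_deviation(trades, fv, Q)
amp = amplitude(pbar, fv)
to = turnover(trades, Q)
```

With these toy numbers the mean prices are 120, 90 and 60, so the residuals are +20, 0 and −20; that gives $R^2 = 1 - 800/1800 \approx 0.56$, $\mathrm{ND} = 8.0$, $A = 0.4$ and $\mathrm{TO} = 0.5$.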

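The experience-indexed triple $(\alpha_i, \sigma_i, \omega_i)$, the $\|\mathrm{corr}\|$ asset-swap blend, and the Step-3 peer blend defined in the notation table can be written out as a short Python sketch. The constants are the defaults quoted in the table; the function names are hypothetical and the code is an illustration of the stated formulas, not the simulator's implementation.

```python
import math

# Defaults quoted in the notation table (tunable in Advanced settings).
ALPHA_0, GAMMA_ALPHA = 0.40, 0.15            # novice intercept, per-round slope
SIGMA_0, GAMMA_SIGMA = 15.0, 0.30            # noise scale in cents, decay rate
OMEGA_0, DELTA_OMEGA, K_OMEGA = 0.60, 0.10, 3


def experience_triple(k):
    """Return (alpha, sigma, omega) for an agent with experience counter k."""
    alpha = min(1.0, ALPHA_0 + GAMMA_ALPHA * k)
    sigma = SIGMA_0 * math.exp(-GAMMA_SIGMA * k)
    omega = OMEGA_0 + DELTA_OMEGA * min(K_OMEGA, k)
    return alpha, sigma, omega


def blend_after_swap(triple, corr):
    """x_new = |corr| * x_trained + (1 - |corr|) * x_0 for each of alpha, sigma, omega."""
    w = abs(corr)
    anchors = (ALPHA_0, SIGMA_0, OMEGA_0)
    return tuple(w * x + (1.0 - w) * x0 for x, x0 in zip(triple, anchors))


def peer_blend(v_prior, omega, messages):
    """Step-3 blend; falls back to the prior when no peer messages arrived."""
    if not messages:
        return v_prior
    m_bar = sum(messages) / len(messages)
    return omega * v_prior + (1.0 - omega) * m_bar


novice = experience_triple(0)
veteran = experience_triple(3)
reset = blend_after_swap(veteran, corr=0.0)    # uncorrelated assets: back to anchors
kept = blend_after_swap(veteran, corr=-1.0)    # |corr| = 1: training preserved
posterior = peer_blend(100.0, novice[2], [80.0, 90.0])
```

For a novice with prior 100 and peer messages 80 and 90, the posterior is $0.6 \cdot 100 + 0.4 \cdot 85 = 94$. Taking the absolute value of the correlation means a perfectly negatively correlated pair preserves training just as a perfectly positively correlated pair does, which is what the $\|\mathrm{corr}\|$ definition implies.
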
Tag	Citation	Role in this simulator
DLM 2005	Dufwenberg, Lindqvist & Moore, Bubbles and Experience: An Experiment, AER 95(5), 1731–1737	Market substrate — asset life, dividend shape, $\mathrm{FV}_t$, session loop
LL 2025	Lopez-Lira, AI-Agent Expected-Utility Market Makers (working paper)	Utility agent, EU scoring, risk functionals, trust EMA
SSW 1988	Smith, Suchanek & Williams, Bubbles, Crashes and Endogenous Expectations in Experimental Spot Asset Markets, Econometrica 56(5)	Canonical experimental-bubble design; the asset-life and dividend structure that DLM 2005 inherits
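
The EU scoring that the notation table attributes to the Utility agent ($U$, $w_0$, $w_1$, $p_{\text{fill}}$, $\alpha^\star_{i,t}$) can be sketched as follows. This is a hedged illustration: the function names are hypothetical, and because the table does not say how the passive bid and ask quote prices are chosen, they are passed in here as plain inputs (`bid_quote`, `ask_quote`). Wealth is assumed to stay strictly positive so the power is well defined.

```python
P_FILL_PASSIVE = 0.30   # assumed fill probability for passive quotes (tunable)


def crra_normalized(w, w0, rho):
    """Normalized CRRA form (w / w0)^(1 - rho), so that U(w0) = 1."""
    return (w / w0) ** (1.0 - rho)


def expected_utility(cash, qty, v_hat, side, price, p_fill, rho):
    """EU = p_fill * U(w1) + (1 - p_fill) * U(w0) for a one-share order."""
    w0 = cash + qty * v_hat
    sign = 1 if side == "buy" else -1
    # A buy pays cash and gains a share; a sell receives cash and gives one up.
    w1 = (cash - sign * price) + (qty + sign) * v_hat
    return p_fill * crra_normalized(w1, w0, rho) + (1.0 - p_fill) * 1.0


def best_action(cash, qty, v_hat, best_bid, best_ask, bid_quote, ask_quote, rho):
    """Arg-max of EU over {hold, buy@A, sell@B, bid, ask}."""
    scores = {
        "hold": 1.0,  # no trade: wealth stays at w0, and U(w0) = 1
        "buy@A": expected_utility(cash, qty, v_hat, "buy", best_ask, 1.0, rho),
        "sell@B": expected_utility(cash, qty, v_hat, "sell", best_bid, 1.0, rho),
        "bid": expected_utility(cash, qty, v_hat, "buy", bid_quote, P_FILL_PASSIVE, rho),
        "ask": expected_utility(cash, qty, v_hat, "sell", ask_quote, P_FILL_PASSIVE, rho),
    }
    name = max(scores, key=scores.get)
    return name, scores[name]


# Toy state (illustrative only): valuation above the ask, so buying is attractive.
action, score = best_action(
    cash=1000.0, qty=5, v_hat=100.0,
    best_bid=95.0, best_ask=98.0,
    bid_quote=96.0, ask_quote=97.0,
    rho=0.0,
)
```

In this toy state a risk-neutral agent ($\rho_i = 0$) values the share at 100 against a best ask of 98. Crossing gains 2 with certainty, while the passive bid at 96 gains 4 with probability 0.30 (1.2 in expectation), so the sketch selects buy@$A_t$ with a normalized score of $1502/1500$.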

Experiment settings

Agents Pre-run draft · editable before the simulation starts

Trade & Dividend Feed

Order Book

Replay & Trace Inspector

Decisions recorded at this tick

System Design

Fundamental Value

Prior Elicitation — v3 §2 decomposition

Experience Factors $\alpha_i, \sigma_i, \omega_i$ and the |corr| asset-swap blend

Expected-Utility Scoring

Advanced Settings — Prior Toggles

Plan I — Algorithmic Posterior

Broadcast Message — Reported Valuation

Pairwise Trust Dynamics

Plan II — LLM Posterior with Utility Form

Plan III — LLM Posterior with Risk Label

Regulator Warning — Bubble-Ratio Prompt Injection

Mispricing Measures

Volume and Efficiency Measures

System Prompt · $\pi^{\mathrm{II}}_{\text{sys}}$

User Prompt · $\pi^{\mathrm{II}}_{\text{usr}}(\text{agent}_2)$ · asset = Linear Declining (DLM)

【Asset Environment】 · per-asset booklet

How experience and the asset selector change the prompt

Bounded-Rationality Addendum · appended to $\pi^{\mathrm{II}}_{\text{sys}}$ / $\pi^{\mathrm{III}}_{\text{sys}}$

Glossary & Reference

Abbreviations & indices

Mathematical notation

Figures

Transaction Price Trajectory vs Fundamental Value

Signed Mispricing

Trade Volume per Period

Transaction Density Heatmap

Agent Action Timeline

Subjective Valuation · Per agent

Pairwise Trust Matrix

Market-Quality Statistics (Table 1)

10-Session Batch Results (Table 2)

Source papers

AI-Agent Prior Elicitation in Experimental Asset Markets

Motivation

Literature & positioning

Experimental asset markets

LLMs as economic agents

Gap

Research questions

Key idea · three-plan factorial

Plan I · Algorithm

Plan II · LLM + Form

Plan III · LLM + Label

Market substrate · DLM (2005)

Round boundary protocol

Session payoff & batch structure

Agent design · $N = 100$ Utility agents

U · Utility agent (sole agent class)

Risk composition

Strategy cube

Endogenous experience

Expected-utility framework

Risk-loving

Risk-neutral

Risk-averse

Shared prior & endogenous experience — v3 §2/§3

Plan I · algorithmic belief update

Novice · $k_i=0$

Intermediate · $k_i \in \{1,2\}$

Veteran · $k_i \geq 3$