What got built
Two new paper-trading bots for your dad to watch — one per index — bolted onto the existing stock-bot without disturbing it.
The existing system already runs a live equity paper-trader (a long-running systemd process) and a set of futures advisory crons that email your dad Mandarin picks. The job was to add two more traders, nasdaq100 and sp500, each with its own $1,000,000 paper account running at maximum aggression: long and short, up to ~2× Reg-T margin, many concurrent positions. Each account’s drawdown halt is deliberately disabled: the daily/weekly loss limits are set to 100%, so it never auto-stops. The accounts are meant to be watched blowing up and then re-funded. It is a measured experiment, on paper money only.
Everything is done and sitting in a draft pull request. Nothing is deployed — that last step is yours (and it can’t be automated; see the handoff below).
The existing ML screener
Same HistGradientBoosting model & feature schema — just scoped to each index’s own constituents.
Per-account everything
Own data namespace, own config, own Alpaca account, own advisor cron, own safety nets.
A 3-tab dashboard
equity / nasdaq100 / sp500, with a persistent English ⇄ 中文 toggle for your dad.
The bots that work
The running equity & futures bots are byte-for-byte unchanged — proven, not asserted.
Shared-nothing isolation
The thing that makes aggression safe isn’t a kill-switch — it’s that a crater on one account physically cannot reach the others.
The design constraint: three isolated instances of the same code, never one process juggling three accounts. They share no mutable state, no buying power, and no broker account — so a blow-up is contained by construction.
The whole build hinged on one seam. The original code hard-wired every data path to a single memory/ directory. We introduced memory_dir(account) — with no account, it returns the byte-identical old path, so the live bots see zero change; with an account, it nests under memory/accounts/<id>/. That single default is load-bearing: get it wrong and you silently corrupt the running trader’s state with no error. It ships with a regression test proving the no-account path matches the pre-refactor path for every writer.
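A minimal sketch of what such a seam could look like, assuming a repo-relative memory/ root. The MEMORY_ROOT constant, the root parameter, and the id validation rule are illustrative assumptions, not the project's actual code:

```python
import re
from pathlib import Path
from typing import Optional

# Assumed repo-relative root; the real project may resolve this differently.
MEMORY_ROOT = Path("memory")

# Hypothetical rule for account ids: lowercase letters, digits, "_" and "-".
_ACCOUNT_ID = re.compile(r"[a-z0-9_-]+")


def memory_dir(account: Optional[str] = None, root: Path = MEMORY_ROOT) -> Path:
    """Return the data directory for an account, or the legacy path if none is given."""
    if account is None:
        # Load-bearing default: must be identical to the pre-refactor path.
        return root
    if not _ACCOUNT_ID.fullmatch(account):
        # Reject ids such as "../equity" that could escape the namespace.
        raise ValueError(f"invalid account id: {account!r}")
    return root / "accounts" / account


legacy = memory_dir()              # memory
scoped = memory_dir("nasdaq100")   # memory/accounts/nasdaq100
```

The regression test described above would then pin memory_dir() to the old hard-wired path for each writer, so a change to the default fails loudly instead of silently redirecting the live trader.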
Seven phases, gated
Recon first, then the risky refactor solo and dual-reviewed, then parallel fan-out for the rest — every phase verified against real tool output before the next.
Recon (read-only)
Six parallel agents mapped the codebase and fact-checked the brief against the actual code. They caught a whole class of blast radius the brief missed, plus 5 stale “facts” that were wrong.
Isolation layer (+ dual review)
The memory_dir(account) seam — the crown jewel. Done alone (never parallel-edited), then reviewed by me and an independent agent. 18 new regression tests.
Universe scoping
Each bot sees only its index’s constituents (plus its own ETF) — true index purity, no bleed. Reused the existing committed constituent lists.
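As a rough illustration of index purity, the filter below keeps only symbols in an index's constituent list plus that account's own ETF. The function name and the tickers in the usage lines are assumptions for the example, not taken from the repo:

```python
from typing import Iterable, List, Set


def scoped_universe(candidates: Iterable[str], constituents: Set[str], etf: str) -> List[str]:
    """Keep only symbols that belong to the index, plus the index's own ETF."""
    allowed = set(constituents) | {etf}
    # Preserve the screener's original ordering of candidates.
    return [symbol for symbol in candidates if symbol in allowed]


# Hypothetical data: a tiny constituent set and an assumed ETF ticker.
picks = scoped_universe(["AAPL", "XOM", "QQQ", "JPM"], {"AAPL", "MSFT"}, "QQQ")
```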
Broker go-live gate
A read-only preflight that reads each account’s real equity / margin / shorting from Alpaca and refuses & alerts if it isn’t the $1M / 2× / short-enabled it should be. Tested to never place an order.
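The gate's decision logic could be sketched as a pure function over an account snapshot, which keeps it trivially unable to place an order. The AccountSnapshot type, the function name, and the exact-match comparisons are assumptions. The real preflight reads these values from Alpaca and may allow a tolerance:

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import List


@dataclass(frozen=True)
class AccountSnapshot:
    """Hypothetical read-only view of the fields the gate inspects."""
    equity: Decimal
    multiplier: int
    shorting_enabled: bool


def preflight_problems(
    snap: AccountSnapshot,
    expected_equity: Decimal = Decimal("1000000"),
    expected_multiplier: int = 2,
) -> List[str]:
    """Return reasons to refuse go-live; an empty list means the gate passes."""
    problems = []
    if snap.equity != expected_equity:
        problems.append(f"equity is {snap.equity}, expected {expected_equity}")
    if snap.multiplier != expected_multiplier:
        problems.append(f"multiplier is {snap.multiplier}, expected {expected_multiplier}")
    if not snap.shorting_enabled:
        problems.append("shorting is not enabled")
    return problems


ok = preflight_problems(AccountSnapshot(Decimal("1000000"), 2, True))
bad = preflight_problems(AccountSnapshot(Decimal("100000"), 1, False))
```

A caller would refuse to start the trader and send an alert whenever the returned list is non-empty.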
Configs, templates & safety nets (×2)
Per-account aggression configs, systemd/cron templates (not installed), the capital top-up helper, and per-account STOP-flag + fire-counter + freshness parity + a healthcheck equity-floor gate.
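Two of these safety nets, the per-account STOP flag and the equity-floor gate, might combine into a check like the one below. The flag file name, the function name, and the message wording are hypothetical:

```python
from decimal import Decimal
from pathlib import Path
from typing import List

STOP_FLAG_NAME = "STOP"  # hypothetical file name for the per-account flag


def healthcheck_findings(account_dir: Path, equity: Decimal, equity_floor: Decimal) -> List[str]:
    """Return findings for one account; an empty list means healthy."""
    findings = []
    if (account_dir / STOP_FLAG_NAME).exists():
        # The flag lives inside the account's own directory, so stopping
        # one account says nothing about the others.
        findings.append(f"STOP flag present in {account_dir}")
    if equity < equity_floor:
        findings.append(f"equity {equity} is below floor {equity_floor}")
    return findings
```

Because the flag path is derived from the account directory, setting it for sp500 cannot halt nasdaq100 or the equity bot.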
Dashboard
3-tab page + a client-side English ⇄ 中文 toggle that persists across the 30-minute re-render. Native Mandarin, ~148-key dictionary, parity checked at build.
Verify & ship
Full suite green, whole-branch Opus ship-gate review, merge-gate audit, draft PR opened. Options trading was explicitly deferred to its own future phase.
How the agents caught the bugs
Parallelism gave speed; the review gates gave the confidence. Two real defects slipped past “green” reports — and were caught anyway.
The build fanned out across waves of subagents — a 6-agent recon, then the isolation refactor alone, then two waves of parallel implementers, each committing only its own disjoint files. But three green “done” reports don’t mean the tree is green together. Running the full suite once, serialized, on the merged result is what surfaced the truth.
Caught & fixed
The e2e regression. A universe-scoping change quietly broke a sibling test in the advisor pipeline — invisible to that agent’s own targeted test run. The serialized full-suite gate found it; a one-commit fix aligned the test with the new path threading, no assertions weakened.
Caught & fixed
The poisoning risk. An account’s advisor could have written picks into the shared equity log if its AI turn ever forgot a flag — scoping enforced only by prose. Fix: export STOCKBOT_ACCOUNT so scoping is structural, not instruction-dependent. The prose became belt-and-suspenders.
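One way to make the scoping structural is for every writer to resolve its target from the environment rather than from a flag the AI turn must remember. In this sketch only the STOCKBOT_ACCOUNT name comes from the fix above. The picks.jsonl file name, the function names, and the choice to reject a blank value are assumptions:

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def current_account(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Read the account id from STOCKBOT_ACCOUNT; unset means the legacy equity bot."""
    if "STOCKBOT_ACCOUNT" not in env:
        return None
    value = env["STOCKBOT_ACCOUNT"].strip()
    if not value:
        # A set-but-blank variable is treated as a misconfigured wrapper,
        # not as permission to fall back to the shared equity log.
        raise ValueError("STOCKBOT_ACCOUNT is set but empty")
    return value


def picks_log(env: Mapping[str, str] = os.environ) -> Path:
    """Resolve the picks file for the current account (file name is hypothetical)."""
    account = current_account(env)
    base = Path("memory") if account is None else Path("memory") / "accounts" / account
    return base / "picks.jsonl"
```

With the variable exported by the cron wrapper, a forgotten flag in the prompt no longer changes where picks land.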
Why this was safe to run autonomously on a live trading system: isolation did the work, not caution. Because tests run from a separate git worktree, every test and probe physically could not reach the live memory/. Fan-out gave speed; the serialized full-suite gate plus adversarial review gave the certainty.
Where the brief was wrong
The plan was written from memory. Five of its “facts” didn’t survive contact with the actual code — each corrected with evidence and documented in the PR.
| Area | The brief said | Reality — what shipped |
|---|---|---|
| Cron times | Add advisors in UTC | The box clock is Pacific and every existing entry is PT-natural; UTC would drift an hour across DST. Shipped PT-natural, staggered 02:45 / 03:30 / 04:15. |
| Capital reset | One-command Alpaca reset API | No such API exists — paper reset is dashboard-only and invalidates keys. Rebuilt the helper as key-rotation + preflight-revalidation. |
| Broker fields | Shorting / margin already exposed | Not mapped at all. Added shorting_enabled and multiplier to the account model for the go-live gate. |
| Grading | Tag picks by strategy to isolate | The graders don’t filter by tag — so each account got its own picks & learnings files instead. Shared graders left untouched. |
| Blast radius | Just memory_dir() and callers | The trader’s live-money state runs through a separate hard-wired path class the brief never mentioned — the real #1 risk. |
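The cron-time correction in the table can be checked with a short calculation: the same Pacific wall-clock time maps to two different UTC times depending on the season, so a fixed UTC entry would fire an hour off for part of the year. The dates below are arbitrary winter and summer examples, and the helper name is illustrative:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

PACIFIC = ZoneInfo("America/Los_Angeles")


def utc_clock_for_pacific(year: int, month: int, day: int, hour: int, minute: int) -> str:
    """Return the UTC wall-clock time (HH:MM) for a given Pacific local time."""
    local = datetime(year, month, day, hour, minute, tzinfo=PACIFIC)
    return local.astimezone(timezone.utc).strftime("%H:%M")


winter = utc_clock_for_pacific(2026, 1, 15, 2, 45)  # PST is UTC-8, so 10:45
summer = utc_clock_for_pacific(2026, 7, 15, 2, 45)  # PDT is UTC-7, so 09:45
```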
Nothing was deployed
This box has passwordless root, so nothing technical stopped a deploy — the discipline was the whole point. Every claim below is a tool-verified fact, not a promise.
/etc/systemd? memory/ written?
What you need to do
The build is complete; deploy is manual because none of it can be automated (Alpaca has no API for it) and all of it is behind the hard gate.
- .env.nasdaq100 / .env.sp500.stock-bot, ALPACA_BASE_URL pointed at paper. Templates are in the repo as .env.<id>.example.
- scripts/broker_preflight.py --account <id>.
- enable --now.
- deploy/templates/*.service into /etc/systemd/system and append the new crontab.txt lines. Merge the PR when you’re satisfied.

To re-fund a blown account later: reset it in the Alpaca dashboard (this regenerates the keys), then run scripts/account_refund.py --account <id> --key … --secret …; it rewrites the env file and re-runs the preflight. An account out of buying power simply stops trading until you do; that reads as “out of capital,” not “crashed.”
Documented follow-ups
Surfaced by the ship-gate review; none affect isolation, safety, or the live bots. Listed in the PR so nothing is silently dropped.
- [med] strategy_version labeling: account picks currently inherit a generic tag instead of <id>-max-aggro-2026-07-03. Purely a label for later cohort analysis; picks are already file-isolated and grade correctly.
- [low] allowlist gate reads the equity enabled flag (benign while equity is enabled).
- [low] aggression is unconditional in the per-account config. Safe for paper-only accounts; don’t point .env.<id> at live credentials.
- [low] Two test-hardening additions (an anti-pattern guard for the isolation invariant; one symmetric coverage test).