Open source · Self-hosted · Quant-grade

StockMachina

Systematic trading.
Your hardware. Your edge.

StockMachina is the open-source end-to-end stack for swing trading US Equities and Hong Kong (HKEX) — built on institutional data quality, a local AI inference server, and a research process that makes look-ahead bias and survivorship leakage hard to introduce by accident.

★ Star on GitHub See the architecture →View roadmap

600+

Tickers (US + HK)

10+ yrs

Point-in-time history

100%

Local inference

OSS

Apache 2.0

Philosophy

The bottleneck isn't compute. It's discipline.

Most retail trading tooling ships a chart, a strategy editor and a backtester — and then quietly leaks the future into the past. StockMachina is built around the opposite premise: make the correct thing trivial, and the wrong thing hard. Point-in-time data, corporate-action adjustments, survivorship tracking and reproducibility are not features — they're the floor.

Point-in-time correctness

Every feature, every backtest, every signal sees only what was known at that instant. No look-ahead. No fundamentals leakage.

Sovereign by default

Models, data and strategies stay inside your network. Inference runs locally on hardware you own. Cloud is opt-in, per workload.

Reproducibility, bit-exact

Any backtest can be replayed exactly from a commit hash plus a data snapshot. No hidden state, no silent random seeds.

Architecture

Seven layers. Three nodes. One repository.

A composable pipeline from raw market data to executed orders — each layer with a single, contract-stable responsibility.

External Sources

IBKR Gateway, Alpaca, Polygon, EODHD, Tiingo, FRED. Pluggable providers behind a single ingestion contract.

Ingestion

Per-provider ingesters with rate-limiting, idempotency keys and dead-letter queues. Bad data is quarantined, not silently dropped.

Storage (Hot · Cold · Meta)

QuestDB for live tick / bar streams, Apache Parquet + DuckDB for the historical archive, SQLite for metadata and symbology.

Feature Engineering

Price, volume, microstructure, technicals, cross-sectional, sentiment, macro — every feature timestamped at the moment it became knowable.

Inference & Models

Forecast Service (Kronos, Chronos-2, MOIRAI-2), Reasoning Service (Qwen3 / Trading-R1) and Sentiment Service (FinGPT) — all served via a local OpenAI-compatible endpoint.

Strategy & Signals

Signal Generator → Portfolio Constructor → Volatility-adjusted sizing. Calibrated confidence and sector/region risk budgets baked in.

Execution & Risk

Order Manager guarded by a Risk Engine: order-level limits, portfolio circuit breakers, sectoral and regional exposure caps. Orders flow back to IBKR over a controlled channel with full audit.

$ stockmachina up --config ./stockmachina.yaml

▸ loading universe: us_core (510) + us_etf (40) + hk_lead (100) + adrs

▸ verifying point-in-time integrity… 10y window OK

▸ booting vLLM on dgx-spark: kronos-1.0, qwen3-32b, fingpt-v3

▸ subscribing IBKR Gateway → market data + account stream

▸ scheduling: ingest@close, signals@T-30m, reconcile@T+5m

▸ risk engine armed: pos=5% sec=25% region=30% gross=150%

✓ StockMachina online · http://stockmachina.internal

Mac mini

Orchestrator: Prefect, IBKR Gateway, QuestDB, Order Manager. The brain that schedules and supervises.

DGX Spark

Inference: vLLM + Forecast Service + research notebooks. 128 GB unified memory, ~1 PFLOP FP4.

Cloud (optional)

Reserved for hard reasoning workloads where a frontier model clearly beats local. Allow-listed. Logged.

Features

Everything a serious quant pipeline needs. Nothing it doesn't.

Point-in-time data

Corporate actions, splits, dividends, delisted tickers — handled. No survivorship bias by construction.

Local AI inference

Kronos, Chronos-2, MOIRAI-2 and a reasoning LLM served via vLLM on your hardware. OpenAI-compatible API.

Multi-market

US Equities (S&P 500 + your holdings) and Hong Kong (HKEX leaders) — native, not bolted on.

Reproducible backtests

Same code hash + same data snapshot = same result. Forever. Auditable from git log alone.

Two-engine backtesting

vectorbt for fast research sweeps, NautilusTrader for production-grade event-driven simulation.

Risk engine, multi-layer

Per-order limits, kill switches, portfolio circuit breakers, sectoral and regional exposure caps.

IBKR native execution

Direct integration with Interactive Brokers via ib_async. Cash long/short equities, ETFs optional.

Full observability

Prometheus metrics, Grafana dashboards, structlog and drift alerts on data and model performance.

Model ensemble

A specialized model for every layer of the decision.

Forecasts come from time-series foundation models. Reasoning comes from a domain-specialized LLM. Sentiment comes from a sentiment model. No single network pretends to do everything — they each do their job, well, and locally.

Kronos

Open weights

Time-Series

The headline forecaster. Pre-trained on candlesticks from 45+ exchanges — domain-specific to financial markets, with confidence intervals out of the box.

MIT

Chronos-2

Amazon

Time-Series

Generic time-series foundation model. Great for cross-series ensembling, macro indicators and any non-OHLCV signal where Kronos isn’t the right fit.

Apache 2.0

MOIRAI-2

Salesforce

Time-Series

Multivariate forecasting at scale. Use when you want a single model that consumes price, volume and macro features jointly — the ensemble's second voice.

Apache 2.0

Qwen 3 32B

Alibaba

LLM

General reasoning, summarization of news and earnings calls, multilingual coverage. Quantized to NVFP4 to fit comfortably on the DGX Spark with room for friends.

Apache 2.0

Trading-R1

Open weights

Reasoning

Specialized reasoning for trade-level decisions. Produces explicit chains of thought you can audit — and reject — before any order is placed.

MIT

FinGPT

Open weights

Sentiment

Financial sentiment scoring on news, filings and analyst notes. Fine-tuned on financial text rather than generic web — the right tool for the job.

MIT

BGE-M3

BAAI

Embeddings

Multilingual embeddings for the news + filings retrieval store. Powers the ‘why is this stock moving?’ lookup before signals fire.

MIT

Bring your own

You

LLM

Drop in any GGUF or safetensors checkpoint. Swap a model with one config line — fine-tunes, frontier API providers and experimental weights welcome.

Any

All inference exposed via a single OpenAI-compatible endpoint. Same client, different model.

The stack

100% open-source. Boring where it should be, sharp where it matters.

Battle-tested infrastructure for the boring parts (storage, scheduling, observability), modern open-weight models for the parts that move the P&L.

Data

QuestDB (hot)
Parquet + DuckDB (cold)
SQLite (metadata)
Polars
Great Expectations

Runtime

Python 3.11
vLLM
Prefect 2.x
Redis Streams
Typer (CLI)
uv
Docker

Modeling

Kronos
Chronos-2
MOIRAI-2
Qwen3
FinGPT
Trading-R1
BGE-M3

Backtest & Execute

vectorbt
NautilusTrader
ib_async (IBKR)
Pydantic Settings

Data providers

IBKR Gateway
Alpaca
EODHD
Tiingo
Polygon
Finnhub
FRED

Observability

Prometheus
Grafana
structlog
Drift alerts
Tailscale (private mesh)

Who it's for

Built for people who treat their capital like quants.

Independent quants

You already run capital through IBKR and you want a research pipeline that matches your seriousness.

Engineers entering trading

You can read code and reason about systems — and you want institutional discipline from day one.

TSFM researchers

You want a real, end-to-end testbed for Kronos, Chronos-2 and MOIRAI-2 on live financial data.

Builders of trading desks

Internal hedge desks that need a private, auditable, on-prem stack — without a $1M kdb+ license.

Not for: day traders, HFT, options multi-leg strategies, or anyone looking for a one-click bot. StockMachina is a workbench, not a vending machine.

Roadmap · V1

Six phases. Twelve to eighteen weeks.

Incremental, demoable milestones. Each phase produces a working slice of the system — no “big bang” integration at the end.

PHASE 01 · 2–3 weeks

Foundation

Storage + ingestion. 10 years of daily bars on 100 US tickers, end-to-end.

PHASE 02 · 1–2 weeks

Data quality

Corporate actions, adjustments, Great Expectations suites, cross-source validation.

PHASE 03 · 1–2 weeks

HK + complementary data

Hong Kong universe, EODHD, news (Tiingo), macro indicators (FRED).

PHASE 04 · 2–3 weeks

Local inference

DGX Spark online: vLLM + Qwen3 + FinGPT + Kronos serving via OpenAI-compatible API.

PHASE 05 · 3–4 weeks

Signals + backtests

First strategy (momentum_v1) targeting Sharpe > 0.8 and max drawdown < 20%.

PHASE 06 · 2–3 weeks

Execution & ops

Risk engine + Order Manager + 4 weeks of unattended paper trading before live capital.

Post-V1

Fine-tuning Kronos on your personal universe · Trading-R1 wired into the reasoning loop · Mean-reversion strategy · Multi-asset extensions · Mobile-friendly ops dashboard.

Open source

Inspectable. Forkable. Yours.

StockMachina is released under the Apache 2.0 license. No usage limits, no telemetry, no “enterprise tier” gating the parts that matter. The whole pipeline — from ingestion to execution — is in one repository, and any backtest you can't reproduce from that repository is, by definition, a bug.

★ Star on GitHub Read the docs

# Quick start

git clone https://github.com/draix/stockmachina

cd stockmachina

./scripts/bootstrap.sh

stockmachina up --config ./stockmachina.yaml

# Paper-trade for 4 weeks before live capital. No exceptions.

Make the correct thing trivial.

StockMachina is open source today. Bring your hardware, your strategy and your discipline — and skip the part where you reinvent point-in-time correctness from scratch.

★ Star on GitHub See the architecture

Systematic trading.Your hardware. Your edge.

The bottleneck isn't compute. It's discipline.

Point-in-time correctness

Sovereign by default

Reproducibility, bit-exact

Seven layers. Three nodes. One repository.

External Sources

Ingestion

Storage (Hot · Cold · Meta)

Feature Engineering

Inference & Models

Strategy & Signals

Execution & Risk

Mac mini

DGX Spark

Cloud (optional)

Everything a serious quant pipeline needs. Nothing it doesn't.

Point-in-time data

Local AI inference

Multi-market

Reproducible backtests

Two-engine backtesting

Risk engine, multi-layer

IBKR native execution

Full observability

A specialized model for every layer of the decision.

100% open-source. Boring where it should be, sharp where it matters.

Built for people who treat their capital like quants.

Independent quants

Engineers entering trading

TSFM researchers

Builders of trading desks

Six phases. Twelve to eighteen weeks.

Foundation

Data quality

HK + complementary data

Local inference

Signals + backtests

Execution & ops

Inspectable. Forkable. Yours.

Make the correct thing trivial.

Systematic trading.
Your hardware. Your edge.