Daily operations (runbook)

This page describes a practical, restart-safe workflow for running MesoLive automation day-to-day: startup, monitoring, reconnect recovery, and shutdown.

Persisted state (minimum viable)

To make your automation restart-safe, persist:

  • Event cursor: last processed EventSeqId (Event Hub)
  • Idempotency keys: any in-flight / recently submitted Control Hub intents (SendOrder, CancelOrder, ApplyPaperFills, etc.)
  • Async job ids: any in-flight JobId returned by prepare flows (entry/exit/adjustment)
  • Dedupe keys: processed EventId values (signals + updates) for your session window
  • Order/position mapping: your own mapping of (strategy/account/position) → live ids you care about

Startup checklist

Recommended connection order:

  1. Control Hub: connect, then verify agents/accounts/strategies.
  2. Data Hub: connect, then verify streaming data is flowing.
  3. Event Hub: connect, replay history from your persisted cursor, then subscribe for live callbacks.
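Assuming duck-typed hub clients (every method name below is a stand-in, not the real SDK surface), the startup order can be captured in one function so it is applied consistently on both cold start and reconnect:

```python
def startup(control_hub, data_hub, event_hub, cursor):
    """Connect in the recommended order: Control -> Data -> Event.

    Hub objects are illustrative duck-typed stand-ins: each needs
    connect() and verify(); event_hub also needs replay_from(cursor)
    and subscribe(). `cursor` is the persisted EventSeqId.
    """
    control_hub.connect()
    control_hub.verify()            # agents/accounts/strategies present?
    data_hub.connect()
    data_hub.verify()               # streaming data flowing?
    event_hub.connect()
    event_hub.replay_from(cursor)   # history first...
    event_hub.subscribe()           # ...then live callbacks
```

Replaying before subscribing keeps the gap between history and live events as small as possible; the dedupe step described below absorbs any overlap.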

CLI health checks (examples package)

# Control Hub: verify connected providers/agents
python -m mesolive_sdk_examples.agents_example list

# Control Hub: verify accounts and strategies
python -m mesolive_sdk_examples.accounts_example list --limit 50
python -m mesolive_sdk_examples.strategies_example list --limit 50

# Data Hub: verify streaming quotes work (pick a symbol you have access to)
python -m mesolive_sdk_examples.data_example underlying stream --symbol SPX --seconds 10

# Event Hub: verify callbacks are arriving
python -m mesolive_sdk_examples.events_example dump --seconds 10

Replay + subscribe (Event Hub)

On startup (and after reconnect), do both:

  • Replay from your persisted cursor using GetEventsSince
  • Subscribe to live callbacks (often with Strategies=None to subscribe to all strategies)

Your handlers should tolerate overlap between replay results and live callbacks by deduping on EventId.
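A minimal dedupe sketch with a bounded memory window (the class and its API are hypothetical helpers, not part of the SDK). Both the replay loop and the live callback route every event through the same instance:

```python
from collections import OrderedDict

class EventDeduper:
    """Drop events already seen by EventId, with bounded memory."""

    def __init__(self, max_size=100_000):
        self._seen = OrderedDict()   # insertion-ordered -> oldest evicted first
        self._max_size = max_size

    def is_new(self, event_id) -> bool:
        if event_id in self._seen:
            return False             # replayed and live copies overlap here
        self._seen[event_id] = None
        if len(self._seen) > self._max_size:
            self._seen.popitem(last=False)   # evict oldest
        return True
```

A handler then reads `if deduper.is_new(event_id): process(event)`. Size the window so it comfortably covers your replay overlap; events older than the window could in principle be processed twice, so downstream intents should still carry idempotency keys.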

Monitoring open positions with Data Hub

Streams are best-effort; build monitoring as a loop that can restart:

  • bootstrap open positions by listing strategies and then listing open positions (paged)
  • run one stream task per monitored position (or per account/strategy), with backoff on failures
  • periodically resync from snapshots (Get*Snapshot) to correct drift
  • start/stop monitoring based on OnPositionUpdate callbacks (Created/Closed)
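The per-position stream task above can be sketched as a loop with exponential backoff. `stream_factory` and `on_update` are illustrative stand-ins for the SDK stream call and your handler; the `sleep` parameter is injected only so the backoff is testable:

```python
import time

def monitor_position(stream_factory, on_update, *,
                     max_retries=5, base_delay=0.5, sleep=time.sleep):
    """Run one best-effort stream, restarting on failure with backoff.

    stream_factory() returns an iterable of updates and may raise;
    both are illustrative stand-ins, not the real SDK surface.
    """
    retries = 0
    while retries <= max_retries:
        try:
            for update in stream_factory():
                on_update(update)
                retries = 0               # a healthy stream resets the backoff
            return                        # stream ended cleanly
        except Exception:
            sleep(base_delay * (2 ** retries))
            retries += 1
    raise RuntimeError("stream failed repeatedly; resync from snapshot")
```

When the loop gives up, fall back to the periodic snapshot resync rather than trusting the last streamed value.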

Reconnect recovery

After any disconnect/reconnect, assume you missed events:

  • resubscribe to strategies
  • replay history from the last persisted cursor
  • restart streams and resync with snapshots
  • reconcile in-flight jobs: for any pending JobId, call GetPreparePosition*Status(JobId) and complete the workflow based on the returned status/result

This job-status reconciliation avoids “stuck” workflows where you were waiting for a completion callback that never arrives.

Shutdown checklist

To stop cleanly:

  • stop accepting new signals/work
  • wait for in-flight workflows to complete (or mark them abandoned and rely on idempotency recovery on next start)
  • persist your latest EventSeqId cursor and any dedupe/idempotency state
  • disconnect from hubs (Event Hub → Data Hub → Control Hub)
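The checklist can be sketched as one sequence. The hub objects and the `drain`/`persist` callables are illustrative, not SDK APIs; `drain()` is assumed to return True once no workflows are in flight, and `persist()` writes the cursor and dedupe/idempotency state:

```python
import time

def shutdown(event_hub, data_hub, control_hub, drain, persist, *,
             timeout_s=30.0, poll_s=0.5,
             clock=time.monotonic, sleep=time.sleep):
    """Stop cleanly; hub objects and helper callables are illustrative."""
    deadline = clock() + timeout_s
    while not drain() and clock() < deadline:
        sleep(poll_s)       # past the deadline, work is abandoned and
                            # recovered via idempotency keys on next start
    persist()               # latest EventSeqId + dedupe/idempotency state
    event_hub.disconnect()  # Event Hub first...
    data_hub.disconnect()
    control_hub.disconnect()  # ...Control Hub last
```

Persisting before the first disconnect matters: once the Event Hub drops, the cursor on disk is the only record of where replay must resume.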