Performance Optimization
This guide covers performance optimization features and techniques for Merlin.
Backtest Events Caching
Merlin includes a caching system for backtest events that stores MesoSim API responses locally. This is particularly useful when analyzing large backtests with many events: repeated operations read the events from disk instead of calling the API again, which reduces API traffic and speeds up execution.
How Caching Works
The caching feature stores backtest events retrieved from the MesoSim API locally in the ./merlin-cache directory.
When requesting backtest events that have been previously cached, Merlin will retrieve them from the local cache instead of making another API call.
The cache uses gzipped JSON files to store data efficiently, with cache keys generated using MD5 hashing based on the MesoSim instance, backtest ID, and event type.
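To make the storage scheme concrete, here is a minimal sketch of deriving an MD5 key from the three inputs and writing a gzipped JSON entry. The separator, field order, file naming, instance URL, event type, and payload are all assumptions chosen for illustration; Merlin's actual key layout may differ. The sketch also assumes the GNU md5sum tool is available and writes to a temporary directory rather than the real cache.

```shell
# Hypothetical inputs; only the backtest ID is taken from this guide.
instance="https://mesosim.example.com"
backtest_id="f957415f-07e4-4922-8253-1835237f1e94"
event_type="positions"

# Derive a 32-character MD5 hex digest from the three identifying fields.
# The "|" separator and field order are assumptions.
key=$(printf '%s|%s|%s' "$instance" "$backtest_id" "$event_type" | md5sum | cut -d' ' -f1)

# Store a placeholder JSON payload as a gzipped file named after the key.
# A temporary directory is used so the real cache is left untouched.
cache_dir=$(mktemp -d)
cache_file="$cache_dir/$key.json.gz"
printf '{"events": []}' | gzip > "$cache_file"

# Reading the entry back decompresses it to the original JSON.
gzip -dc "$cache_file"
```

Because the key depends only on its inputs, the same instance, backtest ID, and event type always map to the same file, which is what lets a repeated request be served from disk.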
Cache Configuration
You can configure the caching behavior using environment variables:
# Disable caching entirely
export MERLIN_CACHE_DISABLED=1
merlin optimize-strategy configs/strategies/boxcar-strategy.json --data-coll-backtest-id f957415f-07e4-4922-8253-1835237f1e94
On Windows, use set instead of export in Command Prompt (set MERLIN_CACHE_DISABLED=1). In PowerShell, use $env:MERLIN_CACHE_DISABLED = "1".
Cache Maintenance
Important: Periodically clean the cache directory to free up disk space. Old entries can be removed manually by deleting files from the MERLIN_CACHE_DIR directory (./merlin-cache in the examples in this guide).
# Example command to remove cache files older than 30 days
find ./merlin-cache -type f -mtime +30 -delete
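Before deleting anything, it can help to preview which files the cleanup would remove. The helper below is a small sketch: the function name list_stale_cache_files is hypothetical and not part of Merlin, and it only prints matching files instead of deleting them.

```shell
# Print cache files older than the given number of days, without deleting them.
# Hypothetical helper; not part of Merlin.
list_stale_cache_files() {
    cache_dir="$1"
    max_age_days="$2"
    find "$cache_dir" -type f -mtime +"$max_age_days" -print
}

# Example: review what the 30-day cleanup would remove, then check total size.
if [ -d ./merlin-cache ]; then
    list_stale_cache_files ./merlin-cache 30
    du -sh ./merlin-cache
fi
```

Once the list looks right, the same find expression with -delete in place of -print performs the actual cleanup.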
Parallel Execution
Merlin supports parallel execution of API calls and backtests to improve performance. You can control the level of parallelism through environment variables:
# Control parallel execution
export BACKTESTS_IN_FLIGHT=50
export MODEL_CALLS_IN_FLIGHT=50
Adjust these values based on your cluster's resources.
As a rule of thumb, both BACKTESTS_IN_FLIGHT and MODEL_CALLS_IN_FLIGHT should be set to 50 * backend_node_count.
For the minimal configuration (two backend nodes), set both values to 100.
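The rule of thumb can be applied directly in the shell. This sketch assumes you set backend_node_count yourself to match your cluster; it is a plain shell variable used here for the arithmetic, not a setting that Merlin reads.

```shell
# Number of backend nodes in your cluster (2 is the minimal configuration).
backend_node_count=2

# Apply the rule of thumb: 50 in-flight requests per backend node.
export BACKTESTS_IN_FLIGHT=$((50 * backend_node_count))
export MODEL_CALLS_IN_FLIGHT=$((50 * backend_node_count))

echo "BACKTESTS_IN_FLIGHT=$BACKTESTS_IN_FLIGHT MODEL_CALLS_IN_FLIGHT=$MODEL_CALLS_IN_FLIGHT"
```

With two nodes, both variables come out to 100, matching the minimal configuration described above.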