
SETS: Stepwise Ensemble for Trade Selection Model

Overview

The Stepwise Ensemble for Trade Selection (SETS) model is a specialized predictive modeling approach built specifically to address challenges in trading system development. SETS automatically discovers patterns in historical strategy performance and creates an optimized ensemble of regression models to predict future market behavior. These models can be used as trade filters and position sizing optimizers in quantitative trading strategies, including the ones developed using MesoSim.

SETS Model

Unlike general-purpose statistical modeling packages, SETS is purpose-built for trading strategy analysis, combining the power of multiple specialist regression models into a robust ensemble that maximizes predictive performance while reducing the risk of overfitting.

Key Concepts

Predictive Modeling in Trading

Predictive modeling for trading relies on a fundamental property of strategy performance: markets and strategies contain patterns that tend to repeat throughout history and can often be used to predict future activity.

These patterns may include:

  • Trend continuation until exhaustion
  • Retracement toward recent mean price after a sudden move
  • Seasonal patterns and cyclical behavior

A predictive model studies historical strategy performance to discover these repeating patterns. Once they are identified, the model watches for their recurrence and predicts whether the position about to be taken will win, lose, or break even.

Features and Targets

SETS uses two key components for building predictive models:

  • Features: Variables derived from historical data (prices, volumes, open interest, IVs, Greeks and traditional indicators) that look strictly backward in time
  • Targets: Forward-looking variables that reveal strategy performance, represented as StrategyNAV fields for each Symbol/Strategy

The fundamental goal is to find relationships between features and targets that can be exploited for profitable trading. Trade decisions are made by comparing the model's predictions to optimized thresholds:

  • If the prediction exceeds an upper threshold, take a long position
  • If the prediction falls below a lower threshold, take a short position (or just ignore the entry)
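The threshold rule above can be sketched in a few lines of Python. This is an illustration only: the function name `trade_decision` and the returned labels are assumptions made for this example, not part of SETS.

```python
def trade_decision(prediction: float, long_threshold: float, short_threshold: float) -> str:
    """Map a model prediction to a trade decision using two thresholds."""
    if prediction > long_threshold:
        return "long"
    if prediction < short_threshold:
        # Depending on the strategy, this can also mean "skip the entry".
        return "short"
    return "none"


print(trade_decision(0.20, long_threshold=0.15, short_threshold=-0.12))
```

Predictions that fall between the two thresholds produce no trade in this sketch.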

How SETS Works

The SETS model consists of specialist regression models, each covering a subset of features (selected manually or automatically). These models are then combined into an ensemble when EnsembleMode is set to Average or Optimized. If EnsembleMode is set to Disabled, only one regression model is built.

SETS supports multiple regression types, covering various degrees of nonlinearity, including:

Linear Regression

Linear regression models the relationship between features and target using a linear equation:

y = β₀ + β₁x₁ + β₂x₂ + ... + βₙxₙ + ε

SETS uses multivariable linear regression, which many experts consider the best all-around modeling approach for trading systems because it is:

  1. Fast to train
  2. Powerful when given well-designed features
  3. Less likely to overfit than nonlinear models, such as random forests or neural networks

Logistic Regression

Logistic regression models the probability of a binary outcome by passing a linear combination of the features through the logistic function:

P(y=1) = 1 / (1 + e^(-(β₀ + β₁x₁ + ... + βₙxₙ)))

Logistic regression is similar to linear regression but designed for classification tasks. When the goal is to discriminate between two classes, logistic regression uses a binary target (zero or one) and is much more robust against outliers in the predictors compared to linear regression.

In logistic regression, SETS classifies position profits into Win or Loss buckets, then uses the classifier to predict trade outcomes (win/loss) based on the provided features.
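As a rough illustration of the two steps described here, the sketch below labels profits as Win (1) or Loss (0) and evaluates the logistic function for a set of already fitted coefficients. The function names are hypothetical, and treating a profit of exactly zero as a Loss is an assumption; the text does not say how SETS buckets break-even trades.

```python
import math
from typing import Dict


def label_outcome(profit: float) -> int:
    """Bucket a position profit into Win (1) or Loss (0). Zero counts as Loss here (assumption)."""
    return 1 if profit > 0 else 0


def win_probability(features: Dict[str, float], weights: Dict[str, float], intercept: float) -> float:
    """Evaluate P(y=1) = 1 / (1 + e^(-z)) with z = intercept + sum(weight * feature)."""
    z = intercept + sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


print(label_outcome(12.5), label_outcome(-3.0))
print(win_probability({"x1": 2.0}, {"x1": 0.5}, intercept=-1.0))
```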

Quadratic Regression

Quadratic regression extends the linear model by automatically generating squared terms and interaction features:

y = β₀ + β₁x₁ + β₂x₂ + β₃x₁² + β₄x₂² + β₅x₁x₂ + ε

In modeling tasks, there is an inherent tradeoff regarding the degree of nonlinearity. Linear models handle only straight-line relationships between the indicators and the target variable, reducing overfitting risk but ignoring curved effects.
At the other extreme, highly flexible methods (neural nets, decision trees) may capture rich structure yet are slower to train and prone to overfitting.

Quadratic (second-order polynomial) regression sits in the middle of this spectrum: it allows a single curvature (parabolic shape) in each predictor’s effect while keeping model complexity manageable. If a linear model under-performs, a quadratic one is typically the next step.

A quadratic model is essentially a linear model whose input set has been expanded with all squared terms and pairwise cross-products:

  • With one feature FEAT_1: FEAT_1, FEAT_1²
  • With two features FEAT_1, FEAT_2: FEAT_1, FEAT_2, FEAT_1², FEAT_1×FEAT_2, FEAT_2²
  • With three features: all original features plus their squares and all pairwise cross products

The number of terms grows rapidly as features increase, which could result in overfitting. However, quadratic regression remains much faster to train than most other nonlinear models while providing the flexibility to handle the majority of nonlinear relationships encountered in financial modeling.
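The expansion can be sketched as follows. The helper name `expand_row` and the `^2` / `*` naming of the generated terms are choices made for this example; SETS may name its internal terms differently.

```python
from itertools import combinations_with_replacement
from typing import Dict


def expand_row(row: Dict[str, float]) -> Dict[str, float]:
    """Return the original features plus all squares and pairwise cross-products."""
    expanded = dict(row)
    for a, b in combinations_with_replacement(list(row), 2):
        key = f"{a}^2" if a == b else f"{a}*{b}"
        expanded[key] = row[a] * row[b]
    return expanded


print(list(expand_row({"FEAT_1": 2.0, "FEAT_2": 3.0})))
# n features yield n + n*(n+1)/2 terms: 2 -> 5, 3 -> 9, 10 -> 65
print(len(expand_row({f"F{i}": 1.0 for i in range(10)})))
```

With ten input features the expanded model already has 65 terms, which is why limiting the feature count matters more for quadratic models than for linear ones.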

Model Parameters

Basic Configuration

Parameter | Description
MinTimeInMarketPct | The minimum fraction (0-1) of training cases that must result in a trade being taken on each side. Reasonable values range from 0.05 to 0.20.
TargetMetric | The objective function to optimize for in models and ensemble. Options include RSquare, ROCArea, UlcerIndex, UlcerPerformanceIndex, ProfitFactor, LongProfitFactor, ShortProfitFactor.
CompressTargetOutliers | When enabled, performs monotonic compression on the target's extreme values to mitigate the effect of outliers.
ModelRegressionType | The regression type of the model. Either Linear, Quadratic or Logistic.

Feature Selection

Parameter | Description
ModelMaxFeatures | Maximum number of features to include in one model. When greater than zero, stepwise selection is enabled. When set to zero, all features are used.
ModelStepwiseMemory | The number of candidate feature sets to retain at each step of the stepwise selection algorithm. Higher values check more combinations but increase computation time.
RegressionModelConfigs | A dictionary mapping model names to individual model configurations that can override global settings. Each model config can specify: Features (list of features to use), MaxFeatures, StepwiseMemory, and RegressionType. When not set, all features are used with global settings. Model names follow the scheme MODEL_{number}.

RegressionModelConfigs Details

The RegressionModelConfigs parameter allows fine-grained control over individual regression models within the ensemble. It's particularly useful when you want to create specialized models that focus on different feature sets.

Field | Type | Required | Description
Features | List<string> | No | An explicit list of features to use in this model. When specified, only these features will be considered for inclusion in the model, ignoring all other features in the dataset. If not provided, all available features will be used (filtered by the MaxFeatures parameter if set).
MaxFeatures | int | No | Maximum number of features to include in this specific model. When greater than zero, stepwise selection is enabled for this model. When set to zero, all specified features (or all available features if none specified) are used. If null, the global ModelMaxFeatures setting will be used.
StepwiseMemory | int | No | The number of candidate feature sets to retain at each step of the stepwise selection algorithm for this specific model. Higher values check more combinations but increase computation time. If null, the global ModelStepwiseMemory setting will be used.
RegressionType | string | No | The regression type of the model. Either Linear, Quadratic or Logistic.

Example Usage:

"RegressionModelConfigs": {
  "MODEL_0": {
    "Features": ["entry_pos_vega", "entry_pos_theta", "entry_leg_longs_iv"],
    "MaxFeatures": 2,
    "StepwiseMemory": 5,
    "RegressionType": "Linear"
  },
  "MODEL_1": {
    "Features": ["OEX_RSI_14", "SGX_RSI_7", "SPX_RSI_14"],
    "MaxFeatures": 3,
    "StepwiseMemory": 8,
    "RegressionType": "Quadratic"
  }
}

In this example:

  • MODEL_0 will only consider the 3 specified features and select at most 2 of them using stepwise selection with a memory of 5
  • MODEL_1 will only consider its 3 specified features and select at most 3 of them (all of them) using stepwise selection with a memory of 8

Using RegressionModelConfigs this way allows you to create specialized models focusing on different aspects of your trading system (e.g., one model for momentum indicators, another for volatility indicators).

Ensemble Configuration

Parameter | Description
EnsembleMode | Controls how multiple models are combined. Options: Disabled (one model), Average (equal weights), or Optimized (weights based on performance).
EnsembleModelCount | The number of models to create when ensemble mode is enabled.
EnsembleMaxModels | Maximum number of models to select for the ensemble using the stepwise algorithm.
EnsembleStepwiseMemory | Number of models to remember during stepwise selection for the ensemble.

Data Input

Parameter | Description
CsvContent | CSV data containing Features and Target.

Stepwise Selection Algorithm

By default, SETS uses forward stepwise selection for feature selection:

  1. Each feature candidate is tested individually for performance
  2. The best performer is selected first
  3. Each remaining candidate is tested in combination with already selected features
  4. The process continues until ModelMaxFeatures is reached or performance stops improving

To mitigate the primacy problem (where excellent feature combinations might be missed), ModelStepwiseMemory allows the algorithm to retain multiple promising feature sets instead of just the single best performer. This significantly improves the chances of finding optimal feature combinations.
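The effect of the memory setting can be illustrated with a small beam-style search. This is a sketch of the general idea, not the SETS implementation: the `score` callback stands in for fitting a model on a feature set and evaluating the chosen TargetMetric, and the toy scores below are invented for the example.

```python
from typing import Callable, FrozenSet, List, Sequence, Tuple

Score = Callable[[FrozenSet[str]], float]  # hypothetical: higher is better


def stepwise_select(candidates: Sequence[str], score: Score,
                    max_features: int, memory: int) -> Tuple[List[str], float]:
    """Forward stepwise selection that keeps the `memory` best feature sets per step."""
    if max_features <= 0:  # zero means "use all features"
        everything = frozenset(candidates)
        return sorted(everything), score(everything)
    memory = max(1, memory)
    beam = [(float("-inf"), frozenset())]
    best = beam[0]
    for _ in range(max_features):
        extended = {}
        for _, selected in beam:
            for feature in candidates:
                trial = selected | {feature}
                if feature not in selected and trial not in extended:
                    extended[trial] = score(trial)
        if not extended:
            break
        ranked = sorted(extended.items(), key=lambda item: item[1], reverse=True)
        beam = [(value, subset) for subset, value in ranked[:memory]]
        if beam[0][0] <= best[0]:
            break  # performance stopped improving
        best = beam[0]
    return sorted(best[1]), best[0]


# Invented scores: A is the best single feature, but B and C work best together.
TOY_SCORES = {
    frozenset("A"): 0.50, frozenset("B"): 0.40, frozenset("C"): 0.30,
    frozenset("AB"): 0.60, frozenset("AC"): 0.55, frozenset("BC"): 0.90,
}

print(stepwise_select(["A", "B", "C"], TOY_SCORES.__getitem__, max_features=2, memory=1))
print(stepwise_select(["A", "B", "C"], TOY_SCORES.__getitem__, max_features=2, memory=3))
```

With a memory of 1 the search commits to A and ends with A and B; with a memory of 3 it keeps B and C alive after the first step and finds the stronger pair.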

Ensemble Methods

SETS supports three ensemble modes:

Disabled

Only a single model is built and used. Simple but may miss opportunities for improved performance through model combination.

Average

All models receive equal weight in the ensemble. This approach is least likely to overfit but treats all models as equally important regardless of their individual performance.

Optimized

Models are weighted based on their relative importance or quality, with two constraints:

  1. No weight can be negative (avoiding prediction inversion)
  2. Weights must sum to one (preventing correlated models from producing extreme weights)

This provides an excellent compromise between equal weighting and unconstrained optimization, allowing better models to have more influence while preventing overfitting.
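The two constraints can be illustrated with a short sketch. How SETS actually optimizes the weights is not described here; the code only shows one simple way to turn arbitrary raw scores into weights that are non-negative and sum to one, and how such weights combine per-model predictions. All function names are hypothetical.

```python
from typing import Dict


def constrain_weights(raw: Dict[str, float]) -> Dict[str, float]:
    """Clip negative weights to zero, then rescale so the weights sum to one."""
    clipped = {name: max(0.0, weight) for name, weight in raw.items()}
    total = sum(clipped.values())
    if total == 0.0:
        # Nothing positive left: fall back to equal weights (the Average mode).
        return {name: 1.0 / len(raw) for name in raw}
    return {name: weight / total for name, weight in clipped.items()}


def combine(predictions: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of the individual model predictions."""
    return sum(weights[name] * predictions[name] for name in weights)


weights = constrain_weights({"MODEL_0": 3.0, "MODEL_1": 1.0, "MODEL_2": -2.0})
print(weights)
print(combine({"MODEL_0": 0.2, "MODEL_1": -0.4, "MODEL_2": 5.0}, weights))
```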

Mitigating Outliers

Most models are sensitive to outliers, which can force the model to expend excessive effort learning to predict extreme cases at the expense of typical scenarios. Enabling CompressTargetOutliers applies a monotonic compressing function to the tails of the target distribution, reducing the impact of outliers while preserving order relationships.
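The exact compression function used by SETS is not specified here. As an assumption-laden illustration, the sketch below leaves values between two cutoffs unchanged and applies logarithmic compression beyond them, which shrinks the tails while keeping the ordering of the cases intact. The cutoffs `lower` and `upper` are hypothetical parameters chosen by the caller.

```python
import math
from typing import List, Sequence


def compress_tails(values: Sequence[float], lower: float, upper: float) -> List[float]:
    """Monotonic tail compression: identity inside [lower, upper], log-compressed outside."""
    def squash(value: float) -> float:
        if value > upper:
            return upper + math.log1p(value - upper)
        if value < lower:
            return lower - math.log1p(lower - value)
        return value

    return [squash(value) for value in values]


targets = [-100.0, -1.0, 0.0, 1.0, 100.0]
print(compress_tails(targets, lower=-2.0, upper=2.0))
```

Because the function is continuous at the cutoffs and strictly increasing, a case with a larger raw target always keeps a larger compressed target.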

Target Metrics

SETS supports multiple optimization metrics:

Metric | Description
RSquare | The fraction of the target variable's variance explained by the model. If R-square is optimized, the threshold for computing performance criteria is set at zero.
ROCArea | The area under the profit/loss ROC curve. A random model will have a value around 0.5, a perfect model 1.0.
ProfitFactor | Combines LongProfitFactor and ShortProfitFactor with simultaneously optimized thresholds.
LongProfitFactor | Optimizes for long positions by dividing the sum of positive target values by the sum of negative values for cases exceeding the threshold.
ShortProfitFactor | Similar to LongProfitFactor but optimized for short positions.
UlcerIndex | The square root of the mean squared drawdown. Available in long and short versions.
UlcerPerformanceIndex | Net change in equity divided by the Ulcer Index. Available in long and short versions.
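Two of these metrics can be sketched directly from their descriptions. The function names are hypothetical, and measuring drawdown as a percentage below the running equity peak is an assumption (it is the common convention for the Ulcer Index); the text does not state whether SETS uses percentage or absolute drawdown.

```python
import math
from typing import Sequence


def long_profit_factor(predictions: Sequence[float], targets: Sequence[float], threshold: float) -> float:
    """Sum of gains divided by sum of losses over the cases whose prediction exceeds the threshold."""
    taken = [target for prediction, target in zip(predictions, targets) if prediction > threshold]
    gains = sum(target for target in taken if target > 0)
    losses = -sum(target for target in taken if target < 0)
    if losses == 0:
        return math.inf if gains > 0 else 0.0
    return gains / losses


def ulcer_index(equity: Sequence[float]) -> float:
    """Square root of the mean squared percentage drawdown from the running peak."""
    peak = equity[0]
    squared = []
    for value in equity:
        peak = max(peak, value)
        squared.append((100.0 * (value - peak) / peak) ** 2)
    return math.sqrt(sum(squared) / len(squared))


print(long_profit_factor([0.3, 0.2, 0.05, 0.4], [10.0, -5.0, -20.0, 5.0], threshold=0.1))
print(ulcer_index([100.0, 110.0, 99.0, 110.0]))
```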

Usage Example

{
  "MinTimeInMarketPct": 0.15,
  "TargetMetric": "ProfitFactor",
  "CompressTargetOutliers": true,
  "ModelMaxFeatures": 2,
  "ModelStepwiseMemory": 10,
  "RegressionModelConfigs": {
    "MODEL_0": {
      "Features": ["Feature1", "Feature2", "Feature3"],
      "MaxFeatures": 2,
      "StepwiseMemory": 8
    },
    "MODEL_1": {
      "Features": ["Feature4", "Feature5", "Feature6"],
      "MaxFeatures": 3,
      "StepwiseMemory": 10
    }
  },
  "EnsembleMode": "Optimized",
  "EnsembleModelCount": 5,
  "EnsembleMaxModels": 3,
  "EnsembleStepwiseMemory": 10,
  "CsvContent": "DateTime,Symbol,StrategyNAV,Feature1,Feature2,Feature3,Feature4,Feature5,Feature6\n2023-01-01,STRAT_0,1000,0.5,0.7,0.3,0.1,0.9,0.2\n2023-01-02,STRAT_0,1005,0.6,0.5,0.4,0.2,0.8,0.3\n..."
}

API Endpoint

POST /models/v1/sets

Request Format

The request requires a JSON body with the configuration parameters described above.
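A request could be assembled along these lines with the Python standard library. The host name is a placeholder, the helper names are made up for this example, and authentication is omitted because it is not covered in this section; only the path and the parameter names come from the documentation above. The sketch builds the request without sending it.

```python
import json
import urllib.request
from typing import List

BASE_URL = "https://api.example.invalid"  # placeholder host, replace with the real one


def rows_to_csv(header: List[str], rows: List[list]) -> str:
    """Join a header and data rows into the newline-separated CSV string expected in CsvContent."""
    lines = [",".join(header)] + [",".join(str(cell) for cell in row) for row in rows]
    return "\n".join(lines)


def build_sets_request(config: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Create a POST request for /models/v1/sets with a JSON body."""
    body = json.dumps(config).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/models/v1/sets",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


config = {
    "MinTimeInMarketPct": 0.15,
    "TargetMetric": "ProfitFactor",
    "EnsembleMode": "Disabled",
    "CsvContent": rows_to_csv(
        ["DateTime", "Symbol", "StrategyNAV", "Feature1", "Feature2"],
        [["2023-01-01", "STRAT_0", 1000, 0.5, 0.7], ["2023-01-02", "STRAT_0", 1005, 0.6, 0.5]],
    ),
}
request = build_sets_request(config)
print(request.get_method(), request.full_url)
# Sending would look like this once a real host and credentials are in place:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```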

Response Format

The response contains the model results including weights for each feature in each model and the optimal thresholds:

{
  "Models": {
    "MODEL_0": {
      "Weights": {
        "Feature1": 0.45,
        "Feature2": 0.35,
        "CONSTANT": 0.11
      },
      "LongThreshold": 0.15,
      "ShortThreshold": -0.12
    },
    "MODEL_1": {
      "Weights": {
        "Feature4": 0.25,
        "Feature5": 0.50,
        "CONSTANT": 0.25
      },
      "LongThreshold": 0.18,
      "ShortThreshold": -0.14
    },
    "ENSEMBLE": {
      "Weights": {
        "MODEL_0": 0.60,
        "MODEL_1": 0.40
      },
      "LongThreshold": 0.16,
      "ShortThreshold": -0.13
    }
  }
}

The trade direction is determined by comparing the model's prediction to the thresholds. Each weight is looked up in the response by its feature name (for example, MODEL_0.Weights.Feature1).

Long position (or N times the normal allocation):

    ENSEMBLE.Weights.MODEL_0 * (Feature1 * MODEL_0.Weights.Feature1 + Feature2 * MODEL_0.Weights.Feature2 + MODEL_0.Weights.CONSTANT) +
    ENSEMBLE.Weights.MODEL_1 * (Feature4 * MODEL_1.Weights.Feature4 + Feature5 * MODEL_1.Weights.Feature5 + MODEL_1.Weights.CONSTANT) > ENSEMBLE.LongThreshold

Short position (or ignored entry):

    ENSEMBLE.Weights.MODEL_0 * (Feature1 * MODEL_0.Weights.Feature1 + Feature2 * MODEL_0.Weights.Feature2 + MODEL_0.Weights.CONSTANT) +
    ENSEMBLE.Weights.MODEL_1 * (Feature4 * MODEL_1.Weights.Feature4 + Feature5 * MODEL_1.Weights.Feature5 + MODEL_1.Weights.CONSTANT) < ENSEMBLE.ShortThreshold

When a single model is used (EnsembleMode is Disabled), the trade direction is determined by that model's prediction and thresholds:

    MODEL_0.Weights.Feature1 * Feature1 + MODEL_0.Weights.Feature2 * Feature2 + MODEL_0.Weights.CONSTANT > MODEL_0.LongThreshold
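The comparisons above can be evaluated programmatically from the response JSON. The sketch below is an illustration with hypothetical function names; it assumes linear models whose weights are keyed by feature name plus CONSTANT, as in the sample response, and does not cover how quadratic or logistic models are reported.

```python
from typing import Dict


def model_prediction(weights: Dict[str, float], features: Dict[str, float]) -> float:
    """CONSTANT plus the weighted sum of the features listed in the model's Weights."""
    return weights["CONSTANT"] + sum(
        weight * features[name] for name, weight in weights.items() if name != "CONSTANT"
    )


def ensemble_signal(response: dict, features: Dict[str, float]) -> str:
    """Combine the member predictions with the ENSEMBLE weights and compare to its thresholds."""
    models = response["Models"]
    ensemble = models["ENSEMBLE"]
    prediction = sum(
        weight * model_prediction(models[name]["Weights"], features)
        for name, weight in ensemble["Weights"].items()
    )
    if prediction > ensemble["LongThreshold"]:
        return "long"
    if prediction < ensemble["ShortThreshold"]:
        return "short"
    return "none"


response = {"Models": {
    "MODEL_0": {"Weights": {"Feature1": 0.45, "Feature2": 0.35, "CONSTANT": 0.11}},
    "MODEL_1": {"Weights": {"Feature4": 0.25, "Feature5": 0.50, "CONSTANT": 0.25}},
    "ENSEMBLE": {"Weights": {"MODEL_0": 0.60, "MODEL_1": 0.40},
                 "LongThreshold": 0.16, "ShortThreshold": -0.13},
}}
row = {"Feature1": 0.5, "Feature2": 0.7, "Feature4": 0.1, "Feature5": 0.9}
print(ensemble_signal(response, row))
```

For this row the member predictions are about 0.58 and 0.725, so the ensemble prediction is about 0.638, which is above the long threshold of 0.16.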

Notes and Limitations

  • The API currently supports up to 3069 features and 128 models
  • The model training process may take several minutes for large datasets
  • The HTTP request body size is limited to 30 MB