MorphBoost (Adaptive Split Criterion)
MorphBoost is an opt-in training mode in AlloyGBM that augments the standard gradient-gain split criterion with a normalized information-theoretic term, plus several round-aware leaf adjustments. Implementation follows the formulation in Kriuk (2025), MorphBoost.
When To Use It
MorphBoost tends to help most on:
Tabular problems with low signal-to-noise ratio (financial residuals, Numerai-style returns, etc.) where the standard gain criterion can overfit to spurious local-best splits.
Workloads where you want the model to find structure that a pure-gradient-gain learner misses early in training.
It is not a strict upgrade — treat MorphBoost as a configuration to A/B
against training_mode="auto" rather than a default replacement.
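One way to run the A/B comparison suggested above is a small helper that fits one model per training mode and scores each on a held-out set. This is a minimal sketch, not part of AlloyGBM: `compare_training_modes` and `make_model` are hypothetical names, and the helper assumes the estimator exposes scikit-learn-style `fit` and `predict` methods.

```python
def mean_squared_error(y_true, y_pred):
    """Plain-Python MSE so the sketch has no third-party dependencies."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def compare_training_modes(make_model, X_train, y_train, X_val, y_val,
                           modes=("auto", "morph")):
    """Fit one model per training mode and return the validation MSE per mode.

    `make_model` is a hypothetical factory: it receives `training_mode` as a
    keyword argument and returns an unfitted estimator.
    """
    scores = {}
    for mode in modes:
        model = make_model(training_mode=mode)
        model.fit(X_train, y_train)
        scores[mode] = mean_squared_error(y_val, model.predict(X_val))
    return scores
```

With AlloyGBM, `make_model` could be something like `lambda training_mode: GBMRegressor(n_estimators=1200, training_mode=training_mode)`. Keeping every other hyperparameter and the seed identical across modes makes the comparison easier to interpret.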
How It Works
For every candidate split, the gain is:
gradient_score = standard XGBoost-style gradient gain
info_score     = normalized information-gain term over the partition
morph_weight   = tanh(iteration / 20)   # ramps in over training
gain = (1 - info_score_weight) * gradient_score
     + info_score_weight * info_score * morph_weight
     + optional balance penalty
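The blend above can be sketched as a plain Python function. This is an illustration of the formula only, not AlloyGBM's internal code: `morph_gain`, `warmup_rounds`, and `balance_term` are hypothetical names, the balance term is treated as a signed value that is simply added, and the warmup branch assumes the blend falls back to the gradient score alone.

```python
import math


def morph_gain(gradient_score, info_score, iteration, info_score_weight,
               warmup_rounds=0, balance_term=0.0):
    """Blend a gradient gain with an information score for one candidate split."""
    # Assumption: during warmup the blend collapses to the pure gradient gain.
    if iteration < warmup_rounds:
        return gradient_score
    # tanh ramp: 0 at iteration 0, about 0.76 at iteration 20, near 1 later on.
    morph_weight = math.tanh(iteration / 20)
    return ((1 - info_score_weight) * gradient_score
            + info_score_weight * info_score * morph_weight
            + balance_term)  # pass a negative value to penalize imbalance
```

Because `morph_weight` starts at zero, the information term contributes nothing in the first round and is phased in smoothly, while the gradient term is down-weighted by `1 - info_score_weight` from the start.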
In addition:
A per-class EMA over gradient statistics tracks recent training dynamics and shapes split selection during split evaluation.
Leaf values are scaled by a depth-based penalty (depth_penalty_base ** (depth / 3)) and a per-iteration shrinkage (1 - morph_rate * progress).
An optional balance penalty discourages highly imbalanced splits.
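The two leaf-scaling factors described above can be combined as in the sketch below. It rests on two assumptions that the text does not confirm: that `progress` is the fraction of boosting rounds completed (`iteration / n_estimators`), and that both factors multiply the raw leaf value. The function name `scale_leaf_value` is hypothetical.

```python
def scale_leaf_value(raw_value, depth, iteration, n_estimators,
                     depth_penalty_base, morph_rate):
    """Apply the depth penalty and per-iteration shrinkage to one leaf value."""
    # Assumption: progress runs from 0 toward 1 over the boosting rounds.
    progress = iteration / n_estimators
    # Depth penalty: equals 1 at the root; with a base below 1, deeper leaves shrink more.
    depth_factor = depth_penalty_base ** (depth / 3)
    # Per-iteration shrinkage: later rounds contribute slightly smaller leaves.
    shrink_factor = 1 - morph_rate * progress
    return raw_value * depth_factor * shrink_factor
```

For example, with `depth_penalty_base=0.9`, `morph_rate=0.2`, a leaf at depth 3, and training halfway done, both factors equal 0.9, so the leaf value is scaled by about 0.81.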
Enabling It
Pass training_mode="morph" to any AlloyGBM estimator. The same
parameter exists on GBMRegressor,
GBMClassifier, and GBMRanker.
from alloygbm import GBMRegressor

model = GBMRegressor(
    n_estimators=1200,
    max_depth=6,
    learning_rate=0.05,
    training_mode="morph",
    seed=7,
)
model.fit(X_train, y_train)
training_mode accepts "auto" (default), "manual", or
"morph".
Parameters
All MorphBoost-related parameters are exposed as top-level keyword arguments on the estimator; the table below notes any mode-specific behavior.
| Parameter | Default | Description |
|---|---|---|
| | | Per-iteration leaf shrinkage rate. Range |
| | | Strength of EMA-driven gain shaping. Range |
| | | Initial rounds for which the morph blend collapses to the pure gradient gain. |
| | | Mixing weight for the information-theoretic term post-warmup. Range |
| | | Base of the leaf depth penalty. Range |
| | | Whether to penalize highly imbalanced splits. |
| | | Per-iteration LR schedule. |
| | | Fraction of |
Learning-Rate Schedules
lr_schedule is independent of training_mode. Two schedules are
supported:
"constant" (default): a single fixed learning_rate for all rounds.
"warmup_cosine": linear warmup from a small fraction of learning_rate up to learning_rate over the first lr_warmup_frac * n_estimators rounds, then half-cosine decay to a floor of 0.01 * learning_rate over the remainder.
The warmup-cosine schedule is most useful at very low learning_rate
and high n_estimators (e.g. n_estimators=5000,
learning_rate=0.01).
model = GBMRegressor(
    n_estimators=5000,
    learning_rate=0.01,
    training_mode="morph",
    lr_schedule="warmup_cosine",
    lr_warmup_frac=0.1,
)
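To make the shape of the warmup-cosine schedule concrete, the sketch below computes a per-round learning rate from the description above. It is an approximation under stated assumptions, not AlloyGBM's implementation: the function name `warmup_cosine_lr` is hypothetical, the starting fraction `start_frac` is a guess (the text only says "a small fraction"), and the exact rounding of the warmup length may differ.

```python
import math


def warmup_cosine_lr(iteration, n_estimators, learning_rate, lr_warmup_frac,
                     start_frac=0.1, floor_frac=0.01):
    """Return the learning rate for a zero-based boosting round."""
    warmup_rounds = int(lr_warmup_frac * n_estimators)
    start_lr = start_frac * learning_rate   # hypothetical starting point
    floor_lr = floor_frac * learning_rate   # floor of 0.01 * learning_rate
    if iteration < warmup_rounds:
        # Linear ramp from start_lr toward learning_rate.
        t = iteration / warmup_rounds
        return start_lr + t * (learning_rate - start_lr)
    # Half-cosine decay from learning_rate down to floor_lr over the remainder.
    decay_rounds = max(n_estimators - warmup_rounds - 1, 1)
    t = (iteration - warmup_rounds) / decay_rounds
    return floor_lr + 0.5 * (learning_rate - floor_lr) * (1 + math.cos(math.pi * t))
```

Plotting `warmup_cosine_lr(i, 5000, 0.01, 0.1)` for `i` in `range(5000)` gives a quick visual check: a short linear rise over the first 500 rounds, a peak at `learning_rate`, then a smooth decline to the floor.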
When a non-constant LR schedule is active, AlloyGBM’s auto-stopping logic
becomes schedule-aware: the auto-tuned min_loss_improvement threshold
is scaled by current_lr / max_lr, and empty / slightly-negative
rounds during the explicit warmup phase do not terminate training.
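The schedule-aware rule described above can be illustrated with a deliberately simplified stopping check. This sketch covers only the two behaviors the text names (threshold scaling and the warmup exemption); the real auto-stopping logic may also involve patience counters or other state. The function `should_stop` and its arguments are hypothetical.

```python
def should_stop(loss_improvement, base_threshold, current_lr, max_lr, in_warmup):
    """Decide whether a round's loss improvement is too small to continue."""
    # Rounds inside the explicit warmup phase never terminate training,
    # even when the improvement is zero or slightly negative.
    if in_warmup:
        return False
    # Scale the threshold by current_lr / max_lr so that the smaller steps
    # late in a decaying schedule are judged against a smaller bar.
    scaled_threshold = base_threshold * (current_lr / max_lr)
    return loss_improvement < scaled_threshold
```

Without the scaling, a cosine-decayed learning rate near its floor would produce improvements far below a threshold tuned for the peak rate, and training would stop prematurely.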
Combining with piecewise-linear leaves
training_mode="morph" composes with leaf_model="linear". The
MorphBoost gain criterion drives split selection, and each resulting leaf
still fits a closed-form linear model via the ridge solve. Pair with
lambda_l2 >= 0.01 for weight stability. See GBMRegressor for the
full leaf_model reference.
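A configuration along these lines might look as follows. The three keys are the names used in the text above; treating `lambda_l2` as an estimator keyword argument is an assumption, and the value shown is just the suggested lower bound.

```python
# Settings for morph-mode training with piecewise-linear leaves.
morph_linear_params = {
    "training_mode": "morph",
    "leaf_model": "linear",
    "lambda_l2": 0.01,  # lower bound suggested above for weight stability
}

# Assumed usage, if these are accepted as estimator keyword arguments:
# model = GBMRegressor(**morph_linear_params)
```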
Combining with DRO leaves
training_mode="morph" also composes with leaf_solver="dro" when
leaf_model="constant". The robust gradient gain and scalar leaf value are
computed first; MorphBoost then blends the robust gradient gain with its
information score and applies the usual depth/iteration leaf scaling.
Persistence
Models trained with training_mode="morph" save and load identically
to auto-mode models — pickle, save_model / load_model, and
raw artifact export all work without extra steps. The morph configuration
is embedded as an optional artifact section so loaded models predict
consistently.