MorphBoost (Adaptive Split Criterion)

MorphBoost is an opt-in training mode in AlloyGBM that augments the standard gradient-gain split criterion with a normalized information-theoretic term and adds several round-aware leaf adjustments. The implementation follows the formulation in Kriuk (2025), "MorphBoost".

When To Use It

MorphBoost tends to help most on:

Tabular problems with low signal-to-noise ratio (financial residuals, Numerai-style returns, etc.) where the standard gain criterion can overfit to spurious local-best splits.
Workloads where you want the model to find structure that a pure-gradient-gain learner misses early in training.

It is not a strict upgrade: treat MorphBoost as a configuration to A/B test against training_mode="auto" rather than as a default replacement.

How It Works

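For every candidate split, the gain blends a gradient score with an information score (during the first morph_warmup_iters rounds the blend collapses to the pure gradient gain):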

gradient_score = standard XGBoost-style gradient gain
info_score     = normalized information-gain term over the partition
morph_weight   = tanh(iteration / 20)            # ramps in over training

gain = (1 - info_score_weight) * gradient_score
     +  info_score_weight * info_score * morph_weight
     +  optional balance penalty
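The blend above can be sketched in a few lines of Python. This is an illustration only, not AlloyGBM's internal code: the function name `morph_gain` is hypothetical, `iteration` is assumed to be 0-based, and `balance_term` is a placeholder for the optional balance penalty, whose exact form is not given here (it is assumed to be an additive, non-positive value).

```python
import math

def morph_gain(gradient_score, info_score, iteration,
               info_score_weight=0.3, morph_warmup_iters=5,
               balance_term=0.0):
    """Blend the gradient gain with the information score for one candidate split."""
    # During warmup the blend collapses to the pure gradient gain.
    if iteration < morph_warmup_iters:
        return gradient_score
    # tanh(iteration / 20) ramps the information term in over training.
    morph_weight = math.tanh(iteration / 20)
    return ((1 - info_score_weight) * gradient_score
            + info_score_weight * info_score * morph_weight
            + balance_term)  # assumed additive placeholder for the penalty

# Example: the same candidate split scored early and late in training.
early = morph_gain(2.0, 1.0, iteration=0)
late = morph_gain(2.0, 1.0, iteration=200)
```

With `info_score_weight=0.0` the function returns the gradient score unchanged, which matches the documented way to disable the information-theoretic term.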

In addition:

A per-class EMA over gradient statistics tracks recent training dynamics and shapes the gain when candidate splits are evaluated; evolution_pressure sets the strength of this shaping.
Leaf values are scaled by a depth-based penalty (depth_penalty_base ** (depth / 3)) and a per-iteration shrinkage (1 - morph_rate * progress).
An optional balance penalty, toggled by balance_penalty, discourages highly imbalanced splits.
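The two leaf-scaling factors from the list above can be written out as a short sketch. The helper name `scale_leaf_value` is hypothetical, and `progress` is assumed here to be the fraction of boosting rounds completed (`iteration / n_estimators`); the text does not define it, so treat that as a guess.

```python
def scale_leaf_value(raw_value, depth, iteration, n_estimators,
                     depth_penalty_base=0.9, morph_rate=0.1):
    """Apply the depth penalty and per-iteration shrinkage to a raw leaf value."""
    # Assumption: progress is the fraction of rounds completed so far.
    progress = iteration / n_estimators
    # Deeper leaves are damped more strongly; a base of 1.0 disables this.
    depth_penalty = depth_penalty_base ** (depth / 3)
    # Later rounds are shrunk more; morph_rate=0.0 disables this.
    shrinkage = 1 - morph_rate * progress
    return raw_value * depth_penalty * shrinkage

# A depth-3 leaf halfway through training keeps 0.9 * 0.95 of its raw value.
scaled = scale_leaf_value(1.0, depth=3, iteration=50, n_estimators=100)
```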

Enabling It

Pass training_mode="morph" to any AlloyGBM estimator. The same parameter exists on GBMRegressor, GBMClassifier, and GBMRanker.

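from alloygbm import GBMRegressor

# X_train and y_train are your own training features and targets.
model = GBMRegressor(
    n_estimators=1200,
    max_depth=6,
    learning_rate=0.05,
    training_mode="morph",  # opt in to the MorphBoost split criterion
    seed=7,
)
model.fit(X_train, y_train)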

training_mode accepts "auto" (default), "manual", or "morph".
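Because MorphBoost is meant to be A/B tested against training_mode="auto", a small comparison harness may help. The sketch below is library-agnostic: `compare_training_modes` is a hypothetical helper, and it assumes the estimator exposes a scikit-learn-style `predict` method that returns one value per row, which this page does not spell out.

```python
def compare_training_modes(make_model, modes, X_train, y_train, X_val, y_val):
    """Fit one model per training_mode and return validation MSE keyed by mode."""
    results = {}
    for mode in modes:
        # make_model is any callable that accepts training_mode=... and
        # returns an unfitted estimator with fit/predict methods.
        model = make_model(training_mode=mode)
        model.fit(X_train, y_train)
        preds = model.predict(X_val)
        squared_errors = [(p - t) ** 2 for p, t in zip(preds, y_val)]
        results[mode] = sum(squared_errors) / len(squared_errors)
    return results

# Possible usage with AlloyGBM (not executed here):
# from functools import partial
# from alloygbm import GBMRegressor
# make_model = partial(GBMRegressor, n_estimators=1200, max_depth=6,
#                      learning_rate=0.05, seed=7)
# scores = compare_training_modes(make_model, ["auto", "morph"],
#                                 X_train, y_train, X_val, y_val)
```

Keeping every other hyperparameter and the seed fixed across modes makes the difference attributable to the split criterion rather than to other settings.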

Parameters

All MorphBoost-related parameters are exposed as top-level keyword arguments on the estimator; the table below notes any mode-specific behavior.

Parameter	Default	Description
`morph_rate`	`0.1`	Per-iteration leaf shrinkage rate. Range `[0.0, 1.0]`.
`evolution_pressure`	`0.2`	Strength of EMA-driven gain shaping. Range `[0.0, 1.0]`.
`morph_warmup_iters`	`5`	Number of initial rounds during which the morph blend collapses to the pure gradient gain.
`info_score_weight`	`0.3`	Mixing weight for the information-theoretic term post-warmup. Range `[0.0, 1.0]`. `0.0` disables the info-theoretic term.
`depth_penalty_base`	`0.9`	Base of the leaf depth penalty. Range `(0.0, 1.0]`. `1.0` disables the penalty.
`balance_penalty`	`True`	Whether to penalize highly imbalanced splits.
`lr_schedule`	`"constant"`	Per-iteration LR schedule. `"constant"` or `"warmup_cosine"`. Independent of `training_mode` — usable on its own.
`lr_warmup_frac`	`0.1`	Fraction of `n_estimators` spent in the linear-warmup phase when `lr_schedule="warmup_cosine"`. Range `[0.0, 1.0]`. Must be left at the default `0.1` when `lr_schedule="constant"`; non-default values with a constant schedule raise `ValueError`.
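The ranges in the table can be restated as a pre-flight check. This is a sketch of the documented constraints, not the library's own validation: `check_morph_params` is a hypothetical helper, and only the `lr_warmup_frac` rule is documented as raising `ValueError`; raising the same error for the other ranges is an assumption.

```python
def check_morph_params(morph_rate=0.1, evolution_pressure=0.2,
                       info_score_weight=0.3, depth_penalty_base=0.9,
                       lr_schedule="constant", lr_warmup_frac=0.1):
    """Raise ValueError if a value falls outside the ranges listed in the table."""
    unit_interval = {
        "morph_rate": morph_rate,
        "evolution_pressure": evolution_pressure,
        "info_score_weight": info_score_weight,
        "lr_warmup_frac": lr_warmup_frac,
    }
    for name, value in unit_interval.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0.0, 1.0], got {value}")
    # depth_penalty_base uses a half-open range: 0.0 itself is excluded.
    if not 0.0 < depth_penalty_base <= 1.0:
        raise ValueError(f"depth_penalty_base must be in (0.0, 1.0], got {depth_penalty_base}")
    if lr_schedule not in ("constant", "warmup_cosine"):
        raise ValueError(f"unknown lr_schedule: {lr_schedule!r}")
    # A non-default warmup fraction only makes sense with warmup_cosine.
    if lr_schedule == "constant" and lr_warmup_frac != 0.1:
        raise ValueError("lr_warmup_frac must stay at 0.1 when lr_schedule='constant'")

check_morph_params()  # the defaults pass
check_morph_params(lr_schedule="warmup_cosine", lr_warmup_frac=0.2)  # also valid
```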

Learning-Rate Schedules

lr_schedule is independent of training_mode. Two schedules are supported:

"constant" (default) — single fixed learning_rate for all rounds.
"warmup_cosine" — linear warmup from a small fraction of learning_rate up to learning_rate over the first lr_warmup_frac * n_estimators rounds, then half-cosine decay to a floor of 0.01 * learning_rate over the remainder.

The warmup-cosine schedule is most useful at very low learning_rate and high n_estimators (e.g. n_estimators=5000, learning_rate=0.01).

from alloygbm import GBMRegressor

model = GBMRegressor(
    n_estimators=5000,
    learning_rate=0.01,
    training_mode="morph",
    lr_schedule="warmup_cosine",
    lr_warmup_frac=0.1,  # first 10% of rounds are linear warmup
)
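To make the shape of the "warmup_cosine" schedule concrete, here is a minimal sketch of a per-round learning rate. It is not AlloyGBM's implementation: the function name is hypothetical, the warmup starting point `start_frac=0.1` is an assumed value (the text only says "a small fraction"), and the exact rounding of the warmup length is also a guess. The floor of 0.01 * learning_rate comes from the description above.

```python
import math

def warmup_cosine_lr(iteration, n_estimators, learning_rate,
                     lr_warmup_frac=0.1, start_frac=0.1, floor_frac=0.01):
    """Learning rate for a 0-based round under linear warmup plus half-cosine decay."""
    warmup_rounds = int(lr_warmup_frac * n_estimators)
    if iteration < warmup_rounds:
        # Linear ramp from start_frac * learning_rate toward learning_rate.
        t = iteration / warmup_rounds
        return learning_rate * (start_frac + (1 - start_frac) * t)
    # Half-cosine decay from learning_rate down to the floor on the last round.
    decay_rounds = max(1, n_estimators - warmup_rounds - 1)
    t = (iteration - warmup_rounds) / decay_rounds
    floor = floor_frac * learning_rate
    return floor + 0.5 * (learning_rate - floor) * (1 + math.cos(math.pi * t))

# Rates at the start, the end of warmup, and the final round of a 5000-round run.
schedule = [warmup_cosine_lr(i, 5000, 0.01) for i in (0, 500, 4999)]
```

Under these assumptions the rate peaks at `learning_rate` on the first post-warmup round and reaches the floor on the final round.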

When a non-constant LR schedule is active, AlloyGBM’s auto-stopping logic becomes schedule-aware. The auto-tuned min_loss_improvement threshold is scaled by current_lr / max_lr, and empty or slightly negative rounds during the explicit warmup phase do not terminate training.
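The scaling rule can be illustrated with a single-round decision function. This is a simplified sketch under stated assumptions: `should_stop` is a hypothetical name, and the real auto-stopping logic may track more state than one round's improvement (patience, for example), which is not described here.

```python
def should_stop(loss_improvement, base_threshold, current_lr, max_lr, in_warmup):
    """Decide whether one round's improvement is too small to continue."""
    # Rounds inside the explicit warmup phase never terminate training.
    if in_warmup:
        return False
    # A lower current learning rate lowers the bar a round has to clear.
    scaled_threshold = base_threshold * (current_lr / max_lr)
    return loss_improvement < scaled_threshold

# At one tenth of the peak rate, a 5e-5 improvement clears a 1e-4 base threshold.
late_round = should_stop(5e-5, 1e-4, current_lr=0.001, max_lr=0.01, in_warmup=False)
```

Scaling the threshold this way keeps small-step rounds late in a cosine decay from being mistaken for a stalled model.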

Combining With Piecewise-Linear Leaves

training_mode="morph" composes with leaf_model="linear". The MorphBoost gain criterion drives split selection, and each resulting leaf still fits a closed-form linear model via the ridge solve. Pair with lambda_l2 >= 0.01 for weight stability. See GBMRegressor for the full leaf_model reference.
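A configuration combining the two might look like the following. The keyword names come from the text above; the `n_estimators` and `learning_rate` values simply mirror the earlier example and are not a recommendation. The estimator call is left commented out so the snippet does not depend on data being present.

```python
# Keyword arguments for MorphBoost split selection with linear leaves.
morph_linear_params = dict(
    n_estimators=1200,
    learning_rate=0.05,
    training_mode="morph",   # MorphBoost gain drives split selection
    leaf_model="linear",     # each leaf fits a linear model via the ridge solve
    lambda_l2=0.01,          # at the suggested minimum for weight stability
)

# from alloygbm import GBMRegressor
# model = GBMRegressor(**morph_linear_params)
# model.fit(X_train, y_train)
```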

Combining With DRO Leaves

training_mode="morph" also composes with leaf_solver="dro" when leaf_model="constant". The robust gradient gain and scalar leaf value are computed first; MorphBoost then blends the robust gradient gain with its information score and applies the usual depth/iteration leaf scaling.

Persistence

Models trained with training_mode="morph" save and load identically to auto-mode models — pickle, save_model / load_model, and raw artifact export all work without extra steps. The morph configuration is embedded as an optional artifact section so loaded models predict consistently.
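A quick way to confirm the round trip on your own model is to compare predictions before and after pickling. The helper below is a generic sketch using only the standard `pickle` module; `pickle_roundtrip_matches` is a hypothetical name, and the default assumes the model exposes a `predict` method returning one value per row.

```python
import pickle

def pickle_roundtrip_matches(model, X, predict=None):
    """Return True if a pickled-and-restored model predicts exactly like the original."""
    if predict is None:
        # Assumption: the estimator has a scikit-learn-style predict method.
        predict = lambda m, data: m.predict(data)
    restored = pickle.loads(pickle.dumps(model))
    return list(predict(restored, X)) == list(predict(model, X))

# Possible usage with a fitted AlloyGBM model (not executed here):
# assert pickle_roundtrip_matches(model, X_val)
```

The comparison is exact rather than tolerance-based, since a restored model is expected to reproduce the original predictions.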