MorphBoost (Adaptive Split Criterion)

MorphBoost is an opt-in training mode in AlloyGBM that augments the standard gradient-gain split criterion with a normalized information-theoretic term and several round-aware leaf adjustments. The implementation follows the formulation in Kriuk (2025), MorphBoost.

When To Use It

MorphBoost tends to help most on:

  • Tabular problems with low signal-to-noise ratio (financial residuals, Numerai-style returns, etc.) where the standard gain criterion can overfit to spurious local-best splits.

  • Workloads where you want the model to find structure that a pure-gradient-gain learner misses early in training.

It is not a strict upgrade — treat MorphBoost as a configuration to A/B against training_mode="auto" rather than a default replacement.
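One way to run that A/B comparison is a small harness that fits one model per training_mode on the same split and reports validation RMSE. This is a sketch only: compare_training_modes, make_model, and rmse are hypothetical names, and the harness assumes the estimator offers a scikit-learn-style predict(X) method, which this page does not spell out.

```python
import math
from typing import Callable, Dict, Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root-mean-squared error over two equal-length sequences."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def compare_training_modes(
    make_model: Callable[[str], object],
    X_train,
    y_train,
    X_val,
    y_val,
    modes: Sequence[str] = ("auto", "morph"),
) -> Dict[str, float]:
    """Fit one model per training mode and return validation RMSE per mode.

    make_model(mode) must return an unfitted estimator with fit(X, y) and
    predict(X). The predict method is an assumption, not documented here.
    """
    scores = {}
    for mode in modes:
        model = make_model(mode)
        model.fit(X_train, y_train)
        scores[mode] = rmse(y_val, model.predict(X_val))
    return scores
```

With AlloyGBM, make_model could be something like lambda mode: GBMRegressor(training_mode=mode, seed=7). Keeping the seed and every other hyperparameter fixed means any difference in the scores comes from the training mode alone.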

How It Works

For every candidate split, the gain is computed as:

gradient_score = standard XGBoost-style gradient gain
info_score     = normalized information-gain term over the partition
morph_weight   = tanh(iteration / 20)            # ramps in over training

gain = (1 - info_score_weight) * gradient_score
     +  info_score_weight * info_score * morph_weight
     +  optional balance penalty

# For the first morph_warmup_iters rounds, gain collapses to gradient_score.
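The blend above can be sketched in plain Python to see how the information term phases in. This is an illustration, not AlloyGBM's internal code: blended_gain is a hypothetical helper, the warmup rule (rounds below morph_warmup_iters use the pure gradient gain, counting from zero) is an assumed reading of the parameter description, and the balance penalty is left out because its exact form is not given here.

```python
import math


def morph_weight(iteration: int) -> float:
    """Ramp factor tanh(iteration / 20): 0 at round 0, approaching 1 later."""
    return math.tanh(iteration / 20.0)


def blended_gain(
    gradient_score: float,
    info_score: float,
    iteration: int,
    info_score_weight: float = 0.3,
    morph_warmup_iters: int = 5,
) -> float:
    """Blend gradient gain with the info-theoretic score (no balance penalty)."""
    # Assumption: during warmup the blend collapses to the pure gradient gain.
    if iteration < morph_warmup_iters:
        return gradient_score
    w = info_score_weight
    return (1.0 - w) * gradient_score + w * info_score * morph_weight(iteration)


if __name__ == "__main__":
    for it in (0, 5, 20, 100):
        print(it, round(blended_gain(2.0, 0.5, it), 4))
```

Because morph_weight multiplies only the information term, the gradient term is already down-weighted by (1 - info_score_weight) as soon as warmup ends, while the information term grows in gradually.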

In addition:

  • A per-class EMA over gradient statistics tracks recent training dynamics and shapes the gain when candidate splits are evaluated; evolution_pressure sets the strength of this shaping.

  • Leaf values are scaled by a depth-based penalty (depth_penalty_base ** (depth / 3)) and a per-iteration shrinkage (1 - morph_rate * progress).

  • An optional balance penalty discourages highly imbalanced splits.
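As a rough illustration of the leaf scaling described above, the two factors can be multiplied together as follows. leaf_scale is a hypothetical helper, and defining progress as iteration / n_estimators is an assumption, since this page does not define it.

```python
def leaf_scale(
    depth: int,
    iteration: int,
    n_estimators: int,
    depth_penalty_base: float = 0.9,
    morph_rate: float = 0.1,
) -> float:
    """Combined multiplier applied to a raw leaf value."""
    # Assumption: progress is the fraction of boosting rounds completed.
    progress = iteration / n_estimators
    depth_factor = depth_penalty_base ** (depth / 3)
    iteration_factor = 1.0 - morph_rate * progress
    return depth_factor * iteration_factor


if __name__ == "__main__":
    # Deeper leaves and later rounds both shrink the leaf value.
    for depth in (0, 3, 6):
        print(depth, round(leaf_scale(depth, iteration=600, n_estimators=1200), 4))
```

With the defaults, a leaf at depth 3 is scaled by 0.9 at the first round, and the per-iteration factor falls from 1.0 toward 1 - morph_rate as training proceeds.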

Enabling It

Pass training_mode="morph" to any AlloyGBM estimator. The same parameter exists on GBMRegressor, GBMClassifier, and GBMRanker.

from alloygbm import GBMRegressor

model = GBMRegressor(
    n_estimators=1200,
    max_depth=6,
    learning_rate=0.05,
    training_mode="morph",
    seed=7,
)
model.fit(X_train, y_train)  # X_train, y_train: your training features and targets

training_mode accepts "auto" (default), "manual", or "morph".

Parameters

All MorphBoost-related parameters are exposed as top-level keyword arguments on the estimator; the table below notes any mode-specific behavior.

| Parameter | Default | Description |
|---|---|---|
| morph_rate | 0.1 | Per-iteration leaf shrinkage rate. Range [0.0, 1.0]. |
| evolution_pressure | 0.2 | Strength of EMA-driven gain shaping. Range [0.0, 1.0]. |
| morph_warmup_iters | 5 | Initial rounds for which the morph blend collapses to the pure gradient gain. |
| info_score_weight | 0.3 | Mixing weight for the information-theoretic term post-warmup. Range [0.0, 1.0]. 0.0 disables the info-theoretic term. |
| depth_penalty_base | 0.9 | Base of the leaf depth penalty. Range (0.0, 1.0]. 1.0 disables the penalty. |
| balance_penalty | True | Whether to penalize highly imbalanced splits. |
| lr_schedule | "constant" | Per-iteration LR schedule. "constant" or "warmup_cosine". Independent of training_mode, so it can be used on its own. |
| lr_warmup_frac | 0.1 | Fraction of n_estimators spent in the linear-warmup phase when lr_schedule="warmup_cosine". Range [0.0, 1.0]. Must be left at the default 0.1 when lr_schedule="constant"; non-default values with a constant schedule raise ValueError. |
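When sweeping these parameters in a hyperparameter search, it can help to reject invalid combinations before constructing an estimator. The sketch below encodes only the ranges and the lr_warmup_frac rule documented in the table. validate_morph_params is a hypothetical helper that raises its own ValueError; it is not part of AlloyGBM.

```python
def validate_morph_params(
    morph_rate: float = 0.1,
    evolution_pressure: float = 0.2,
    info_score_weight: float = 0.3,
    depth_penalty_base: float = 0.9,
    lr_schedule: str = "constant",
    lr_warmup_frac: float = 0.1,
) -> None:
    """Raise ValueError if a value falls outside its documented range."""
    closed_unit = {
        "morph_rate": morph_rate,
        "evolution_pressure": evolution_pressure,
        "info_score_weight": info_score_weight,
        "lr_warmup_frac": lr_warmup_frac,
    }
    for name, value in closed_unit.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name}={value} is outside [0.0, 1.0]")
    # depth_penalty_base excludes 0.0: its documented range is (0.0, 1.0].
    if not 0.0 < depth_penalty_base <= 1.0:
        raise ValueError(f"depth_penalty_base={depth_penalty_base} is outside (0.0, 1.0]")
    if lr_schedule not in ("constant", "warmup_cosine"):
        raise ValueError(f"unknown lr_schedule {lr_schedule!r}")
    # A constant schedule requires lr_warmup_frac to stay at its default.
    if lr_schedule == "constant" and lr_warmup_frac != 0.1:
        raise ValueError("lr_warmup_frac must stay at 0.1 when lr_schedule='constant'")
```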

Learning-Rate Schedules

lr_schedule is independent of training_mode. Two schedules are supported:

  • "constant" (default) — single fixed learning_rate for all rounds.

  • "warmup_cosine" — linear warmup from a small fraction of learning_rate up to learning_rate over the first lr_warmup_frac * n_estimators rounds, then half-cosine decay to a floor of 0.01 * learning_rate over the remainder.
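The shape of the "warmup_cosine" schedule can be sketched as below. This is an approximation under stated assumptions, not AlloyGBM's implementation: warmup_cosine_lr is a hypothetical function, and start_frac (the "small fraction" the warmup starts from) is a made-up parameter because its actual value is not given here. The 0.01 floor comes from the description above.

```python
import math


def warmup_cosine_lr(
    round_idx: int,
    n_estimators: int,
    learning_rate: float,
    lr_warmup_frac: float = 0.1,
    start_frac: float = 0.01,  # assumption: the real starting fraction is unspecified
    floor_frac: float = 0.01,  # documented floor of 0.01 * learning_rate
) -> float:
    """Learning rate for a zero-based boosting round."""
    warmup_rounds = int(lr_warmup_frac * n_estimators)
    if round_idx < warmup_rounds:
        # Linear ramp from start_frac * learning_rate up toward learning_rate.
        t = round_idx / warmup_rounds
        return learning_rate * (start_frac + (1.0 - start_frac) * t)
    # Half-cosine decay from learning_rate down to the floor at the last round.
    decay_rounds = max(1, n_estimators - warmup_rounds - 1)
    t = (round_idx - warmup_rounds) / decay_rounds
    floor = floor_frac * learning_rate
    return floor + 0.5 * (learning_rate - floor) * (1.0 + math.cos(math.pi * t))


if __name__ == "__main__":
    for r in (0, 250, 500, 2750, 4999):
        print(r, warmup_cosine_lr(r, n_estimators=5000, learning_rate=0.01))
```

The peak learning_rate is reached exactly at the first post-warmup round, and the last round lands on the floor.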

The warmup-cosine schedule is most useful at very low learning_rate and high n_estimators (e.g. n_estimators=5000, learning_rate=0.01).

model = GBMRegressor(
    n_estimators=5000,
    learning_rate=0.01,
    training_mode="morph",
    lr_schedule="warmup_cosine",
    lr_warmup_frac=0.1,
)

When a non-constant LR schedule is active, AlloyGBM’s auto-stopping logic becomes schedule-aware: the auto-tuned min_loss_improvement threshold is scaled by current_lr / max_lr, and empty or slightly negative rounds during the explicit warmup phase do not terminate training.
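The two rules in that paragraph can be illustrated with a toy per-round check. The function names are hypothetical, and the real auto-stopping logic may involve more state than a single comparison (for example, how the threshold is auto-tuned); only the lr scaling and the warmup exemption are taken from the text.

```python
def scaled_stop_threshold(base_threshold: float, current_lr: float, max_lr: float) -> float:
    """Scale the min_loss_improvement threshold by current_lr / max_lr."""
    return base_threshold * (current_lr / max_lr)


def round_triggers_stop(
    loss_improvement: float,
    base_threshold: float,
    current_lr: float,
    max_lr: float,
    in_warmup: bool,
) -> bool:
    """Return True if this round's improvement would count toward stopping."""
    # Rounds inside the explicit warmup phase never terminate training.
    if in_warmup:
        return False
    return loss_improvement < scaled_stop_threshold(base_threshold, current_lr, max_lr)
```

Scaling the threshold keeps late, low-learning-rate rounds from being judged against an improvement target calibrated for the peak learning rate.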

Combining with piecewise-linear leaves

training_mode="morph" composes with leaf_model="linear". The MorphBoost gain criterion drives split selection, and each resulting leaf still fits a closed-form linear model via the ridge solve. Pair with lambda_l2 >= 0.01 for weight stability. See GBMRegressor for the full leaf_model reference.
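A configuration along these lines might look as follows. The keyword names are the ones mentioned on this page; the specific values are illustrative, and the dictionary is kept separate from the estimator so the snippet runs without AlloyGBM installed.

```python
# Illustrative keyword arguments for MorphBoost with piecewise-linear leaves.
morph_linear_params = dict(
    n_estimators=1200,
    learning_rate=0.05,
    training_mode="morph",  # MorphBoost gain drives split selection
    leaf_model="linear",    # each leaf fits a linear model via a ridge solve
    lambda_l2=0.01,         # at least 0.01, as recommended for weight stability
)

# With AlloyGBM available, the dictionary can be unpacked into an estimator:
# from alloygbm import GBMRegressor
# model = GBMRegressor(**morph_linear_params)
```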

Combining with DRO leaves

training_mode="morph" also composes with leaf_solver="dro" when leaf_model="constant". The robust gradient gain and scalar leaf value are computed first; MorphBoost then blends the robust gradient gain with its information score and applies the usual depth/iteration leaf scaling.
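For reference, the combination described above corresponds to keyword arguments like these. Only names that appear on this page are used; whether leaf_model="constant" needs to be passed explicitly or is already the default is not stated here, so it is spelled out.

```python
# Illustrative keyword arguments for MorphBoost with DRO leaves.
morph_dro_params = dict(
    training_mode="morph",
    leaf_solver="dro",      # robust gradient gain and scalar leaf value
    leaf_model="constant",  # DRO composes with morph only for constant leaves
)

# With AlloyGBM available:
# from alloygbm import GBMRegressor
# model = GBMRegressor(**morph_dro_params)
```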

Persistence

Models trained with training_mode="morph" save and load identically to auto-mode models — pickle, save_model / load_model, and raw artifact export all work without extra steps. The morph configuration is embedded as an optional artifact section so loaded models predict consistently.
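A quick way to confirm this for a fitted model is a pickle round trip that compares predictions before and after. The helper below is a hypothetical sketch using only the standard library. It assumes the model has a predict(X) method returning a one-dimensional sequence, and it checks for exact equality, which may be stricter than needed in some setups.

```python
import pickle


def predictions_survive_pickle(model, X) -> bool:
    """Return True if a pickled-and-restored model predicts identically on X."""
    before = list(model.predict(X))
    restored = pickle.loads(pickle.dumps(model))
    after = list(restored.predict(X))
    return before == after
```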