GBMRanker
GBMRanker is the learning-to-rank estimator in AlloyGBM.
Overview
GBMRanker extends GBMRegressor with ranking-specific objectives. All
ranking objectives require query group identifiers, passed to fit() through
the group argument. Rows are sorted by group internally, so they do not need
to be pre-sorted.
Quick example
from alloygbm import GBMRanker, ndcg

model = GBMRanker(
    ranking_objective="rank:ndcg",
    learning_rate=0.05,
    max_depth=6,
    n_estimators=300,
    deterministic=True,
    seed=7,
)
model.fit(X_train, y_train, group=query_ids_train)
scores = model.predict(X_test)
print("NDCG@10:", ndcg(y_test, scores, group=query_ids_test, k=10))
Ranking objectives
"rank:pairwise" – Pairwise logistic loss (RankNet)
"rank:ndcg" – LambdaMART with NDCG weighting (default)
"rank:xendcg" – Cross-entropy approximation to NDCG
"queryrmse" – Query-grouped RMSE
"yetirank" – YetiRank (stochastic NDCG-weighted pairwise)
As of v0.12.8, GBMRanker also accepts the regression objectives inherited
from GBMRegressor via ranking_objective=: "poisson", "gamma",
"tweedie" (log-link GLM, predict() returns exp(raw)), and
"quantile" (pinball loss, quantile_alpha ∈ (0.0, 1.0)).
Parameters
ranking_objective: str = "rank:ndcg" – the ranking loss function
All other parameters are inherited from GBMRegressor. These include
leaf_solver="dro" for robust scalar leaves, leaf_model="linear" for
piecewise-linear leaves (see GBMRegressor), and training_mode="morph"
together with the MorphBoost and LR-schedule parameters
(morph_rate, evolution_pressure, morph_warmup_iters,
info_score_weight, depth_penalty_base, balance_penalty,
lr_schedule, lr_warmup_frac); see MorphBoost (Adaptive Split Criterion).
leaf_model="linear" and training_mode="morph" can be combined.
Both boosting_mode="goss" (with goss_top_rate / goss_other_rate)
and boosting_mode="dart" (with dart_drop_rate /
dart_max_drop / dart_normalize_type / dart_sample_type) are
supported with the ranking objectives (see the GBMRegressor
“Boosting mode” section for the full semantics).
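For readers unfamiliar with GOSS, the sketch below shows the sampling scheme as it is usually described (gradient-based one-side sampling, as popularized by LightGBM). Mapping top_rate and other_rate onto goss_top_rate and goss_other_rate is an assumption, and AlloyGBM's actual behaviour is documented under GBMRegressor, not here. The function name is hypothetical.

```python
import random


def goss_sample(gradients, top_rate, other_rate, seed=0):
    """Return {row_index: weight} for one GOSS round (illustrative only)."""
    if not (0.0 < other_rate and top_rate + other_rate <= 1.0):
        raise ValueError("rates must satisfy 0 < other_rate and top + other <= 1")
    n = len(gradients)
    # Rank rows by absolute gradient, largest first.
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    n_top = int(n * top_rate)
    n_other = int(n * other_rate)
    top, rest = order[:n_top], order[n_top:]
    # Keep every large-gradient row; sample uniformly from the remainder.
    rng = random.Random(seed)
    sampled = rng.sample(rest, min(n_other, len(rest)))
    # Up-weight sampled small-gradient rows to compensate for the subsampling.
    amplify = (1.0 - top_rate) / other_rate
    weights = {i: 1.0 for i in top}
    weights.update({i: amplify for i in sampled})
    return weights


grads = [0.1, -5.0, 0.2, 4.0, -0.3, 0.05, 0.4, -0.2, 0.15, 0.25]
weights = goss_sample(grads, top_rate=0.2, other_rate=0.5, seed=1)
```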
Methods
fit(X, y, *, group, eval_set=None, eval_group=None, ...) – trains the ranker. group is required and provides per-row query identifiers.
predict(X) – returns raw relevance scores (higher = more relevant)
Evaluation
from alloygbm import ndcg
score = ndcg(y_test, predictions, group=query_ids_test)
score_at_10 = ndcg(y_test, predictions, group=query_ids_test, k=10)
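As a reference for what a grouped NDCG@k measures, the following is a minimal pure-Python sketch. It assumes the common exponential gain 2^rel - 1, a log2 position discount, and that queries with no relevant rows are skipped when averaging. Whether alloygbm.ndcg makes the same choices is not stated on this page, so treat the function (and its name, mean_ndcg) as illustrative rather than as the library's definition.

```python
import math
from collections import defaultdict


def dcg(relevances, k=None):
    """Discounted cumulative gain of a ranked list of relevance labels."""
    rels = relevances if k is None else relevances[:k]
    return sum((2 ** r - 1) / math.log2(pos + 2) for pos, r in enumerate(rels))


def mean_ndcg(y_true, y_score, group, k=None):
    """Average NDCG@k over queries identified by per-row group IDs."""
    by_query = defaultdict(list)
    for label, score, qid in zip(y_true, y_score, group):
        by_query[qid].append((score, label))
    values = []
    for rows in by_query.values():
        # Labels in predicted order (highest score first) and in ideal order.
        ranked = [label for _, label in sorted(rows, key=lambda t: t[0], reverse=True)]
        ideal = sorted((label for _, label in rows), reverse=True)
        ideal_dcg = dcg(ideal, k)
        if ideal_dcg > 0:  # assumption: skip queries without any relevant row
            values.append(dcg(ranked, k) / ideal_dcg)
    return sum(values) / len(values) if values else 0.0


perfect = mean_ndcg([3, 2, 0, 1, 0], [0.9, 0.5, 0.1, 0.8, 0.2], [0, 0, 0, 1, 1])
```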
Group format
The group parameter accepts per-row group identifiers (e.g. query IDs).
AlloyGBM sorts by group internally, so rows do not need to be pre-sorted.
# Per-row group IDs (AlloyGBM format)
group = [0, 0, 0, 1, 1, 2, 2, 2, 2]
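If the same data has to be handed to a tool that expects contiguous queries or per-query row counts instead of per-row IDs, the conversion can be sketched as below. The helper names are hypothetical and independent of AlloyGBM; the validation mirrors the requirement that group identifiers be unsigned integers.

```python
from collections import Counter


def validate_group_ids(group):
    """Raise if any group ID is not a non-negative integer."""
    for g in group:
        if isinstance(g, bool) or not isinstance(g, int) or g < 0:
            raise ValueError(f"group id must be a non-negative integer, got {g!r}")


def sort_rows_by_group(group):
    """Row order that makes each query contiguous (stable within a query)."""
    return sorted(range(len(group)), key=lambda i: group[i])


def group_sizes(group):
    """Per-query row counts, in ascending group-ID order."""
    counts = Counter(group)
    return [counts[g] for g in sorted(counts)]


unsorted_group = [2, 0, 1, 0, 2, 2, 1, 0, 2]
validate_group_ids(unsorted_group)
order = sort_rows_by_group(unsorted_group)
sizes = group_sizes(unsorted_group)
```

Applying order to X, y, and the group array together keeps features, labels, and query IDs aligned.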
Early stopping
model = GBMRanker(
    ranking_objective="rank:ndcg",
    n_estimators=2000,
    early_stopping_rounds=50,
)
model.fit(
    X_train, y_train,
    group=query_ids_train,
    eval_set=(X_valid, y_valid),
    eval_group=query_ids_valid,
)
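The patience rule behind a setting like early_stopping_rounds can be sketched generically as follows. This assumes a higher-is-better validation metric and that only strict improvements reset the counter. The exact metric and tie handling AlloyGBM uses are not described here, so the function (a hypothetical best_iteration) is only an illustration of the general idea.

```python
def best_iteration(valid_scores, patience):
    """Return (best_index, stop_index) for a higher-is-better metric.

    Training stops at the first iteration that is `patience` rounds past the
    best score seen so far, or at the last iteration if that never happens.
    """
    best_idx, best = -1, float("-inf")
    for i, score in enumerate(valid_scores):
        if score > best:
            best_idx, best = i, score
        elif i - best_idx >= patience:
            return best_idx, i
    return best_idx, len(valid_scores) - 1


history = [0.50, 0.60, 0.70, 0.69, 0.68, 0.66, 0.71]
best, stopped = best_iteration(history, patience=3)
```

With patience=3 the run stops before the later improvement at index 6 is ever seen, which is the usual trade-off of a small patience value.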
Current scope
5 ranking objectives implemented natively in Rust, plus the 4 inherited regression objectives (poisson, gamma, tweedie, quantile) as of v0.12.8
Single-label per GBMRanker. For multi-output ranking, see MultiLabelGBMRanker (also covered in GBMRegressor). Joint shared-tree multi-label boosting is deferred to v0.10.0 (paired with the K-output shared-histogram primitive).
Group identifiers must be unsigned integers