scientific-skills/scikit-survival/references/cox-models.md
Cox proportional hazards models are semi-parametric models that relate covariates to the time of an event. The hazard function for individual i is expressed as:
h_i(t) = h_0(t) × exp(β^T x_i)
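where:
- h_0(t) is the baseline hazard function, shared by all individuals and left unspecified
- β is the vector of coefficients (log hazard ratios)
- x_i is the covariate vector of individual i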
The key assumption is that the hazard ratio between two individuals is constant over time (proportional hazards).
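The ratio is constant because the baseline hazard cancels: h_a(t) / h_b(t) = exp(β^T (x_a − x_b)), which does not depend on t. The sketch below checks this numerically. The coefficients, covariates, and baseline hazard are made-up values chosen only for illustration.

```python
import numpy as np

# Hypothetical values, for illustration only
beta = np.array([0.5, -0.2])   # coefficients (log hazard ratios)
x_a = np.array([1.0, 3.0])     # covariates of individual a
x_b = np.array([0.0, 1.0])     # covariates of individual b


def baseline_hazard(t):
    # Arbitrary positive baseline hazard h_0(t); any choice cancels in the ratio
    return 0.1 + 0.05 * t


def hazard(t, x):
    # h(t | x) = h_0(t) * exp(beta^T x)
    return baseline_hazard(t) * np.exp(beta @ x)


times = np.array([1.0, 5.0, 20.0])
ratios = hazard(times, x_a) / hazard(times, x_b)

# The ratio is the same at every time point and depends only on x_a - x_b
expected_ratio = np.exp(beta @ (x_a - x_b))
print(ratios, expected_ratio)
```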
`CoxPHSurvivalAnalysis` is the basic Cox proportional hazards model for survival analysis.
- `alpha`: Regularization parameter (default: 0, no regularization)
- `ties`: Method for handling tied event times (`'breslow'` or `'efron'`)
- `n_iter`: Maximum number of iterations for optimization

```python
from sksurv.datasets import load_gbsg2
from sksurv.linear_model import CoxPHSurvivalAnalysis
from sksurv.preprocessing import OneHotEncoder

# Load data
X, y = load_gbsg2()
# One-hot encode the categorical columns; the estimator needs numeric input
X = OneHotEncoder().fit_transform(X)

# Fit Cox model
estimator = CoxPHSurvivalAnalysis()
estimator.fit(X, y)

# Get coefficients (log hazard ratios)
coefficients = estimator.coef_

# Predict risk scores (higher score = higher risk)
risk_scores = estimator.predict(X)
```
`CoxnetSurvivalAnalysis` is a Cox model with an elastic net penalty for feature selection and regularization.
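- Ridge (L2): the limit `l1_ratio` → 0. `CoxnetSurvivalAnalysis` requires `0 < l1_ratio <= 1`, so use a small `l1_ratio`, or `CoxPHSurvivalAnalysis(alpha=...)` for a pure ridge penalty
- Lasso (L1): `l1_ratio=1.0`
- Elastic Net: `0 < l1_ratio < 1`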
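How `l1_ratio` mixes the two penalties can be sketched in a few lines. This assumes the glmnet-style parameterization, alpha · (l1_ratio · ‖β‖₁ + (1 − l1_ratio)/2 · ‖β‖₂²). The function name and coefficient values are hypothetical and not part of scikit-survival.

```python
import numpy as np


def elastic_net_penalty(beta, alpha, l1_ratio):
    """Illustrative helper: alpha * (l1_ratio * |beta|_1 + (1 - l1_ratio) / 2 * |beta|_2^2)."""
    beta = np.asarray(beta, dtype=float)
    l1_term = np.abs(beta).sum()
    l2_term = np.square(beta).sum()
    return alpha * (l1_ratio * l1_term + 0.5 * (1.0 - l1_ratio) * l2_term)


beta = np.array([0.5, -1.0, 0.0])  # hypothetical coefficients

lasso_penalty = elastic_net_penalty(beta, alpha=0.1, l1_ratio=1.0)  # pure L1
mixed_penalty = elastic_net_penalty(beta, alpha=0.1, l1_ratio=0.5)  # equal mix
print(lasso_penalty, mixed_penalty)
```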
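- `l1_ratio`: Balance between L1 and L2 penalty (values near 0 approach Ridge, 1 = Lasso)
- `alpha_min_ratio`: Ratio of smallest to largest penalty in regularization path
- `n_alphas`: Number of alphas along regularization path
- `fit_baseline_model`: Whether to also estimate the baseline survival and cumulative hazard functions, which `predict_survival_function` requires

```python
from sksurv.linear_model import CoxnetSurvivalAnalysis

# X must be numeric (categorical columns one-hot encoded)
# Fit with elastic net penalty
estimator = CoxnetSurvivalAnalysis(l1_ratio=0.5, alpha_min_ratio=0.01)
estimator.fit(X, y)

# Access regularization path
alphas = estimator.alphas_
# coef_ has shape (n_features, n_alphas): one column of coefficients per alpha
coefficients_path = estimator.coef_

# Predict with specific alpha (coefficients are interpolated if alpha is not on the path)
risk_scores = estimator.predict(X, alpha=0.1)
```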
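Hyperparameters can be tuned with a cross-validated grid search:

```python
from sklearn.model_selection import GridSearchCV
from sksurv.linear_model import CoxnetSurvivalAnalysis

# Define parameter grid
param_grid = {
    'l1_ratio': [0.1, 0.5, 0.9],
    'alpha_min_ratio': [0.01, 0.001],
}

# Grid search with C-index: 'concordance_index_ipcw' is not a valid scoring string.
# Without a `scoring` argument, GridSearchCV uses the estimator's own score method,
# which returns Harrell's concordance index.
cv = GridSearchCV(CoxnetSurvivalAnalysis(), param_grid, cv=5)
cv.fit(X, y)

# Best parameters
best_params = cv.best_params_
```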
`IPCRidge` is inverse probability of censoring weighted ridge regression for accelerated failure time (AFT) models.
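AFT models assume features multiply survival time by a constant factor, rather than multiplying the hazard rate. The model is fit to log survival time; in current scikit-survival releases, `predict` returns times on the original scale rather than the log scale.

```python
from sksurv.linear_model import IPCRidge

# Fit IPCRidge model (X must be numeric, event times must be positive)
estimator = IPCRidge(alpha=1.0)
estimator.fit(X, y)

# Predict survival time (original time scale in current releases)
predicted_time = estimator.predict(X)
```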
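Use CoxPHSurvivalAnalysis when a standard Cox model without feature selection is sufficient and the coefficients are to be read as log hazard ratios.
Use CoxnetSurvivalAnalysis when feature selection or regularization is needed, for example when there are many features.
Use IPCRidge when an accelerated failure time model is preferred, that is, when features are assumed to scale survival time rather than the hazard rate.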
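The proportional hazards assumption should be verified, for example with Schoenfeld residuals or log-minus-log survival plots.
If violated, consider a stratified Cox model, time-varying coefficients, or a model that does not rely on the assumption, such as the AFT model fit by IPCRidge.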