# Supervised Learning
Supervised learning algorithms learn from labeled training data to make predictions on new data. Scikit-learn provides comprehensive implementations for both classification and regression tasks.
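Every estimator below follows the same fit/predict pattern; as a quick orientation, here is a minimal end-to-end sketch using the built-in iris dataset (the dataset and split settings are illustrative):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

# Split labeled data into training and held-out test sets
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

# Fit on training data, then evaluate on unseen data
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
print(f"Test accuracy: {model.score(X_test, y_test):.3f}")
```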
## Linear Regression (sklearn.linear_model.LinearRegression)

```python
from sklearn.linear_model import LinearRegression

model = LinearRegression()
model.fit(X_train, y_train)
predictions = model.predict(X_test)
```
## Ridge Regression (sklearn.linear_model.Ridge)

**Key parameters:** `alpha` (regularization strength, default=1.0)

```python
from sklearn.linear_model import Ridge

model = Ridge(alpha=1.0)
model.fit(X_train, y_train)
```
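If the right `alpha` is unknown, `RidgeCV` can pick it by cross-validation; a minimal sketch (the alpha grid below is an assumption):

```python
from sklearn.linear_model import RidgeCV

# Try several regularization strengths; keep the best by cross-validation
model = RidgeCV(alphas=[0.01, 0.1, 1.0, 10.0, 100.0])
model.fit(X_train, y_train)
print(f"Selected alpha: {model.alpha_}")
```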
## Lasso (sklearn.linear_model.Lasso)

**Key parameters:** `alpha` (regularization strength)

```python
from sklearn.linear_model import Lasso

model = Lasso(alpha=0.1)
model.fit(X_train, y_train)

# Check which features were selected
print(f"Non-zero coefficients: {sum(model.coef_ != 0)}")
```
## ElasticNet (sklearn.linear_model.ElasticNet)

**Key parameters:** `alpha`, `l1_ratio` (0 = Ridge, 1 = Lasso)

```python
from sklearn.linear_model import ElasticNet

model = ElasticNet(alpha=0.1, l1_ratio=0.5)
model.fit(X_train, y_train)
```
## Logistic Regression (sklearn.linear_model.LogisticRegression)

**Key parameters:** `C` (inverse regularization strength), `penalty` (`'l1'`, `'l2'`, `'elasticnet'`)

```python
from sklearn.linear_model import LogisticRegression

model = LogisticRegression(C=1.0, max_iter=1000)
model.fit(X_train, y_train)
probas = model.predict_proba(X_test)
```
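`predict_proba` returns per-class probabilities, while `predict` applies the default 0.5 cutoff in binary problems. If a different operating point is needed, threshold manually; a small sketch continuing the block above (the 0.3 cutoff and the binary setup are assumptions):

```python
# Column 1 holds P(class == 1) in a binary problem
custom_preds = (probas[:, 1] >= 0.3).astype(int)
```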
## Stochastic Gradient Descent (SGD)

**Estimators:** `SGDClassifier`, `SGDRegressor`

**Key parameters:** `loss`, `penalty`, `alpha`, `learning_rate`

```python
from sklearn.linear_model import SGDClassifier

model = SGDClassifier(loss='log_loss', max_iter=1000, tol=1e-3)
model.fit(X_train, y_train)
```
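Because SGD updates weights incrementally, `SGDClassifier` also supports out-of-core learning via `partial_fit`; a minimal sketch, assuming data arrives in chunks (`batches` is a placeholder iterable):

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

model = SGDClassifier(loss='log_loss')
all_classes = np.array([0, 1])  # partial_fit must know the full label set up front

for X_batch, y_batch in batches:  # placeholder: any iterable of (X, y) chunks
    model.partial_fit(X_batch, y_batch, classes=all_classes)
```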
## SVC (sklearn.svm.SVC)

**Key parameters:** `C`, `kernel` (`'linear'`, `'rbf'`, `'poly'`), `gamma`

```python
from sklearn.svm import SVC

# Linear kernel for linearly separable data
model_linear = SVC(kernel='linear', C=1.0)

# RBF kernel for non-linear data
model_rbf = SVC(kernel='rbf', C=1.0, gamma='scale')
model_rbf.fit(X_train, y_train)
```
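SVMs are sensitive to feature scales, so in practice the estimator is usually wrapped in a pipeline with a scaler; a minimal sketch:

```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# The scaler is fit on training data only, then applied consistently at predict time
model = make_pipeline(StandardScaler(), SVC(kernel='rbf', C=1.0, gamma='scale'))
model.fit(X_train, y_train)
```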
## SVR (sklearn.svm.SVR)

**Key parameters:** `epsilon` (tube width)

```python
from sklearn.svm import SVR

model = SVR(kernel='rbf', C=1.0, epsilon=0.1)
model.fit(X_train, y_train)
```
## DecisionTreeClassifier / DecisionTreeRegressor

**Key parameters:**
- `max_depth`: Maximum tree depth (prevents overfitting)
- `min_samples_split`: Minimum samples required to split a node
- `min_samples_leaf`: Minimum samples required in a leaf
- `criterion`: `'gini'` or `'entropy'` for classification; `'squared_error'` or `'absolute_error'` for regression

```python
import matplotlib.pyplot as plt
from sklearn.tree import DecisionTreeClassifier, plot_tree

model = DecisionTreeClassifier(
    max_depth=5,
    min_samples_split=20,
    min_samples_leaf=10,
    criterion='gini'
)
model.fit(X_train, y_train)

# Visualize the tree (feature_names and class_names come from your dataset)
plot_tree(model, feature_names=feature_names, class_names=class_names)
plt.show()
```
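For a quick look without matplotlib, `sklearn.tree.export_text` renders the fitted tree as indented text:

```python
from sklearn.tree import export_text

# Text rendering of the fitted tree; feature_names comes from your dataset
print(export_text(model, feature_names=feature_names))
```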
## RandomForestClassifier / RandomForestRegressor

**Key parameters:**
- `n_estimators`: Number of trees (default=100)
- `max_depth`: Maximum tree depth
- `max_features`: Features to consider at each split (`'sqrt'`, `'log2'`, or an int)
- `min_samples_split`, `min_samples_leaf`: Control tree growth

```python
from sklearn.ensemble import RandomForestClassifier

model = RandomForestClassifier(
    n_estimators=100,
    max_depth=10,
    max_features='sqrt',
    n_jobs=-1  # Use all CPU cores
)
model.fit(X_train, y_train)

# Feature importance
importances = model.feature_importances_
```
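To make the importances readable, pair them with feature names and sort; a small sketch (assumes `feature_names` matches the columns of `X_train`):

```python
import numpy as np

# Rank features from most to least important
order = np.argsort(importances)[::-1]
for i in order:
    print(f"{feature_names[i]}: {importances[i]:.3f}")
```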
## GradientBoostingClassifier / GradientBoostingRegressor

**Key parameters:**
- `n_estimators`: Number of boosting stages
- `learning_rate`: Shrinks the contribution of each tree
- `max_depth`: Depth of individual trees (typically 3-5)
- `subsample`: Fraction of samples used to fit each tree

```python
from sklearn.ensemble import GradientBoostingClassifier

model = GradientBoostingClassifier(
    n_estimators=100,
    learning_rate=0.1,
    max_depth=3,
    subsample=0.8
)
model.fit(X_train, y_train)
```
## HistGradientBoostingClassifier / HistGradientBoostingRegressor

```python
from sklearn.ensemble import HistGradientBoostingClassifier

model = HistGradientBoostingClassifier(
    max_iter=100,
    learning_rate=0.1,
    max_depth=None,                    # No depth limit by default
    categorical_features='from_dtype'  # Auto-detect categorical columns from pandas dtypes
)
model.fit(X_train, y_train)
```
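Unlike most estimators in this reference, the histogram-based boosters handle missing values natively, so no imputation step is needed; a self-contained sketch on synthetic data:

```python
import numpy as np
from sklearn.ensemble import HistGradientBoostingClassifier

rng = np.random.RandomState(0)
X = rng.randn(200, 3)
X[rng.rand(200, 3) < 0.1] = np.nan  # Inject ~10% missing values
y = (rng.rand(200) > 0.5).astype(int)

# No imputation needed: NaNs are routed down a learned default branch at each split
model = HistGradientBoostingClassifier(max_iter=50)
model.fit(X, y)
```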
## AdaBoost

**Key parameters:** `n_estimators`, `learning_rate`, `estimator` (base estimator)

```python
from sklearn.ensemble import AdaBoostClassifier

model = AdaBoostClassifier(n_estimators=50, learning_rate=1.0)
model.fit(X_train, y_train)
```
## Voting Classifier / Regressor

```python
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC

model = VotingClassifier(
    estimators=[
        ('lr', LogisticRegression()),
        ('dt', DecisionTreeClassifier()),
        ('svc', SVC(probability=True))  # probability=True is required for soft voting
    ],
    voting='soft'  # Average predicted probabilities instead of taking a majority vote
)
model.fit(X_train, y_train)
```
## Stacking Classifier / Regressor

**Key parameters:** `final_estimator` (meta-learner)

```python
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC

model = StackingClassifier(
    estimators=[
        ('dt', DecisionTreeClassifier()),
        ('svc', SVC())
    ],
    final_estimator=LogisticRegression()  # Meta-learner combines the base predictions
)
model.fit(X_train, y_train)
```
## KNeighborsClassifier / KNeighborsRegressor

**Key parameters:**
- `n_neighbors`: Number of neighbors (default=5)
- `weights`: `'uniform'` or `'distance'`
- `metric`: Distance metric (`'euclidean'`, `'manhattan'`, etc.)

KNN is distance-based, so scale features before fitting.

```python
from sklearn.neighbors import KNeighborsClassifier

model = KNeighborsClassifier(n_neighbors=5, weights='distance')
model.fit(X_train, y_train)
```
## GaussianNB, MultinomialNB, BernoulliNB

```python
from sklearn.naive_bayes import GaussianNB, MultinomialNB

# For continuous features
model_gaussian = GaussianNB()

# For text/count data
model_multinomial = MultinomialNB(alpha=1.0)  # alpha is the smoothing parameter
model_multinomial.fit(X_train, y_train)
```
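Since `MultinomialNB` expects count-like features, it is typically paired with a text vectorizer; a minimal sketch (the documents and labels are illustrative):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

docs = ["free money now", "meeting at noon", "win a free prize", "project update"]
labels = [1, 0, 1, 0]  # 1 = spam, 0 = ham (illustrative)

# CountVectorizer turns text into token counts, which MultinomialNB models directly
model = make_pipeline(CountVectorizer(), MultinomialNB(alpha=1.0))
model.fit(docs, labels)
print(model.predict(["free prize meeting"]))
```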
## MLPClassifier / MLPRegressor

**Key parameters:**
- `hidden_layer_sizes`: Tuple of hidden layer sizes, e.g., `(100, 50)`
- `activation`: `'relu'`, `'tanh'`, `'logistic'`
- `solver`: `'adam'`, `'sgd'`, `'lbfgs'`
- `alpha`: L2 regularization parameter
- `learning_rate`: `'constant'`, `'adaptive'`

```python
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

# Neural networks need scaled features; fit the scaler on training data only
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)

model = MLPClassifier(
    hidden_layer_sizes=(100, 50),
    activation='relu',
    solver='adam',
    alpha=0.0001,
    max_iter=1000
)
model.fit(X_train_scaled, y_train)

# Apply the same fitted scaler before predicting
predictions = model.predict(scaler.transform(X_test))
```
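The scaler and network can also be combined into one pipeline so the transform is applied consistently at fit and predict time; a minimal sketch:

```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPClassifier

model = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(100, 50), max_iter=1000)
)
model.fit(X_train, y_train)         # Scaling happens inside the pipeline
predictions = model.predict(X_test)
```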
## Choosing an Algorithm

- **Dataset size:** Linear models, naive Bayes, and SGD scale to large datasets; kernel SVMs become slow beyond tens of thousands of samples; HistGradientBoosting is built for large tabular data.
- **Interpretability:** Linear models and shallow decision trees are the easiest to explain; ensembles and neural networks trade interpretability for accuracy.
- **Accuracy vs speed:** Gradient-boosted trees and random forests are strong defaults on tabular data; linear models train and predict fastest.
- **Feature types:** Tree-based models handle unscaled and mixed feature types; distance-based models (KNN, SVM) and neural networks need scaled numeric features.
- **Common starting points:** LogisticRegression or LinearRegression as a baseline, then RandomForestClassifier or HistGradientBoostingClassifier.
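A practical way to apply these guidelines is to cross-validate a few candidates on your own data; a minimal sketch (the candidate set is illustrative):

```python
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier, HistGradientBoostingClassifier

candidates = {
    'logreg': LogisticRegression(max_iter=1000),
    'forest': RandomForestClassifier(n_estimators=100),
    'hist_gb': HistGradientBoostingClassifier(),
}

# Compare mean 5-fold cross-validation scores before committing to a model
for name, model in candidates.items():
    scores = cross_val_score(model, X_train, y_train, cv=5)
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```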