docs/algo/lr_on_angel_en.md
Logistic Regression is a regression model in which the dependent variable is categorical, so it is also a classification model. It is simple but effective, and is widely used in applications such as traditional advertising recommender systems.
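Logistic regression is a simple classification method. It assumes that the probability mass of class label y ∈ {+1, −1} conditional on data point x, P(y|x), takes the logistic form:

$$P(y=1 \mid x) = \frac{1}{1+\exp(-w^{T}x)}, \qquad P(y=-1 \mid x) = \frac{1}{1+\exp(w^{T}x)}$$

Combining the two expressions above, we get:

$$P(y \mid x) = \frac{1}{1+\exp(-y\,w^{T}x)}$$

The objective function of logistic regression is a weighted sum of log loss and L2 penalty:

$$\min_{w}\; \sum_{i}\log\left(1+\exp(-y_i\,w^{T}x_i)\right) + \lambda\,\lVert w\rVert_2^2$$

where $\lVert w\rVert_2^2$ is the regularization term using the L2 norm and λ is its weight.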
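As a rough illustration of the logistic form and the objective, the following Python sketch evaluates P(y|x) and the regularized loss. It assumes labels encoded as +1/-1, dense feature lists, and an unaveraged sum of log losses; the names `dot`, `prob`, `objective`, and the weight `lam` are hypothetical and are not part of Angel.

```python
import math


def dot(w, x):
    """Inner product of two equal-length dense vectors."""
    return sum(wi * xi for wi, xi in zip(w, x))


def prob(y, w, x):
    """P(y | x) under the logistic form, for a label y in {+1, -1} (assumed encoding)."""
    return 1.0 / (1.0 + math.exp(-y * dot(w, x)))


def objective(w, data, lam):
    """Sum of log losses over (x, y) pairs plus lam times the squared L2 norm of w."""
    log_loss = sum(math.log1p(math.exp(-y * dot(w, x))) for x, y in data)
    penalty = lam * sum(wi * wi for wi in w)
    return log_loss + penalty
```

With w set to all zeros, every sample contributes log 2 to the loss, which is a convenient sanity check for an implementation.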
The LR algorithm can be abstracted as a 1×N PSModel, denoted by w, as shown in the following figure:
Angel MLLib provides an LR algorithm trained with the mini-batch gradient descent method.
Worker:
In each iteration, each worker pulls the up-to-date w from the PS, computes the model update △w using the mini-batch gradient descent optimization method, and pushes △w back to the PS.
PS:
In each iteration, the PS receives △w from all workers and adds their average to w, obtaining a new model.
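The worker and PS roles described above can be sketched in a single process. This is only a simulation under stated assumptions, not Angel's implementation: each data shard stands in for one worker, labels are assumed to be +1/-1, features are dense lists, and the function names (`minibatch_delta`, `ps_apply`, `train`) are hypothetical.

```python
import math
import random


def minibatch_delta(w, batch, lr):
    """Worker side: one mini-batch gradient step on the log loss, returned as a delta."""
    grad = [0.0] * len(w)
    for x, y in batch:
        margin = y * sum(wi * xi for wi, xi in zip(w, x))
        # Derivative of log(1 + exp(-y * z)) with respect to z = w.x
        coeff = -y / (1.0 + math.exp(margin))
        for j, xj in enumerate(x):
            grad[j] += coeff * xj
    return [-lr * g / len(batch) for g in grad]


def ps_apply(w, deltas):
    """PS side: add the average of the workers' deltas to w."""
    k = len(deltas)
    return [wi + sum(d[j] for d in deltas) / k for j, wi in enumerate(w)]


def train(shards, dim, iterations=50, lr=0.5, batch_size=4, seed=0):
    rng = random.Random(seed)
    w = [0.0] * dim  # the 1xN model held by the PS
    for _ in range(iterations):
        deltas = []
        for shard in shards:  # each shard plays the role of one worker
            pulled = list(w)  # pull the up-to-date w
            batch = rng.sample(shard, min(batch_size, len(shard)))
            deltas.append(minibatch_delta(pulled, batch, lr))  # push the delta
        w = ps_apply(w, deltas)  # PS averages the deltas into the model
    return w
```

Averaging the deltas keeps the size of the update independent of the number of workers, which is the behavior the PS step describes.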
Decaying learning rate
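The learning rate decays as training proceeds. The decay is governed by the initial learning rate (ml.learn.rate), the decay rate (ml.learn.decay), and the current iteration/epoch.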
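A minimal sketch of a decaying learning rate, assuming an inverse-square-root schedule, `initial_lr / sqrt(1 + decay * epoch)`. Both the form of the schedule and the function name `decayed_learning_rate` are assumptions made for illustration; the Angel source should be checked for the schedule it actually applies.

```python
import math


def decayed_learning_rate(initial_lr, decay, epoch):
    """Assumed schedule: initial_lr / sqrt(1 + decay * epoch)."""
    return initial_lr / math.sqrt(1.0 + decay * epoch)


# Learning rates for the first few epochs under the assumed schedule.
schedule = [decayed_learning_rate(1.0, 0.1, t) for t in range(5)]
```

With a learning rate of 1 and a decay of 0.1, this assumed schedule starts at 1 and falls to about 0.707 by epoch 10.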
Model Type
The LR algorithm supports three types of models: DoubleDense, DoubleSparse, and DoubleSparseLongKey. Set ml.lr.model.type to choose among them.
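Algorithm Parameters
ml.epoch.num: number of epochs
ml.num.update.per.epoch: number of model updates per epoch
ml.data.validate.ratio: proportion of the data used for validation
ml.learn.rate: initial learning rate
ml.learn.decay: decay rate of the learning rate
ml.lr.reg.l2: coefficient of the L2 penalty
I/O Parameters
angel.train.data.path: input path of the data
angel.save.model.path: path where the trained model is saved
angel.load.model.path: path from which the model is loaded for prediction
angel.predict.out.path: output path of the prediction results
angel.log.path: path of the log
ml.feature.index.range: range of the feature index
ml.data.type: format of the input data, for example dummy
Resource Parameters
angel.workergroup.number: number of worker groups
angel.worker.memory.mb: memory requested for each worker, in MB
angel.worker.task.number: number of tasks on each worker
angel.ps.number: number of PS instances
angel.ps.memory.mb: memory requested for each PS, in MB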
Training job:
./bin/angel-submit \
--action.type train \
--angel.app.submit.class com.tencent.angel.ml.classification.lr.LRRunner \
--angel.train.data.path $input_path \
--angel.save.model.path $model_path \
--angel.log.path $logpath \
--ml.epoch.num 10 \
--ml.num.update.per.epoch 10 \
--ml.feature.index.range 10000 \
--ml.data.validate.ratio 0.1 \
--ml.data.type dummy \
--ml.learn.rate 1 \
--ml.learn.decay 0.1 \
--ml.lr.reg.l2 0 \
--angel.workergroup.number 3 \
--angel.worker.task.number 3 \
--angel.ps.number 1 \
--angel.ps.memory.mb 5000 \
--angel.job.name=angel_lr_smalldata
Prediction job:
./bin/angel-submit \
--action.type predict \
--angel.app.submit.class com.tencent.angel.ml.classification.lr.LRRunner \
--angel.load.model.path $model_path \
--angel.predict.out.path $predict_path \
--angel.train.data.path $input_path \
--angel.workergroup.number 3 \
--ml.data.type dummy \
--angel.worker.memory.mb 8000 \
--angel.worker.task.number 3 \
--angel.ps.number 1 \
--angel.ps.memory.mb 5000 \
--angel.job.name angel_lr_predict