docs/apis/Task_en.md
Task is Angel's metacomputing class. Every machine-learning algorithm on Angel must inherit from Task to implement its
train or predict process. Tasks run inside a worker and can share certain resources in that worker.
Understanding Task makes it easier to understand Angel's programming principles.
A Task's execution process is shown in the following chart:
Task's basic process consists of two steps:
Reading the training data
Raw data reside on the distributed file system (DFS) and are not directly usable by machine-learning algorithms in their raw format. Angel therefore abstracts out the data-preparation process: Task pulls the data from DFS to the local node, then parses and transforms them into the desired structure, stored as a DataBlock. This step includes preProcess and parse.
Computing (train or predict)
Also called the run step. For model training, this step generally runs the iterative training procedure (so the data are used many times) and outputs the trained model; for prediction, it applies the trained model to generate predictions (so the data are used just once) and outputs the prediction results.
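The difference between the two modes of the run step can be illustrated with a minimal sketch. It does not use Angel's API: `TrainPredictSketch`, its methods, and the toy model `y = w * x` are all hypothetical, chosen only to show that training makes many passes over the same cached data while prediction makes one.

```java
import java.util.List;

// Hypothetical illustration only; none of these names come from Angel.
class TrainPredictSketch {
    // Training: iterate over the same cached data for several epochs
    // (batch gradient descent on the one-parameter model y = w * x).
    static double train(List<double[]> data, int epochs, double learningRate) {
        double w = 0.0;
        for (int epoch = 0; epoch < epochs; epoch++) {
            double grad = 0.0;
            for (double[] xy : data) {
                grad += (w * xy[0] - xy[1]) * xy[0];
            }
            w -= learningRate * grad / data.size();
        }
        return w;
    }

    // Prediction: a single pass over the data using the trained weight.
    static double[] predict(List<double[]> data, double w) {
        double[] out = new double[data.size()];
        for (int i = 0; i < out.length; i++) {
            out[i] = w * data.get(i)[0];
        }
        return out;
    }
}
```

Each `double[]` holds one `(x, y)` pair; in a real Task the records would come from the DataBlock built during the data-reading step.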
To let applications customize the computing procedure, Angel abstracts out BaseTaskInterface and provides base classes such as BaseTask, TrainTask, and PredictTask, which can be extended to meet an application's specific requirements.
During computing, a Task needs to access system configuration and control iteration progress; both are provided by TaskContext.
BaseTaskInterface defines the interface to an algorithm's computing procedure. KEYIN and VALUEIN indicate the raw data types; VALUEOUT indicates the type of the pre-processed data (the input data for training).
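To make the roles of the three type parameters concrete, here is a simplified, self-contained sketch of an interface with the same three methods (parse, preProcess, run). It is not Angel's actual code: `SketchContext`, `SketchTaskInterface`, and `CsvSumTask` are hypothetical stand-ins, and the in-memory list of strings only imitates data that would normally be read from DFS.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for TaskContext, reduced to an iteration counter.
class SketchContext {
    private int iteration = 0;
    int getIteration() { return iteration; }
    void incIteration() { iteration++; }
}

// Simplified mirror of the three-method interface described above.
interface SketchTaskInterface<KEYIN, VALUEIN, VALUEOUT> {
    VALUEOUT parse(KEYIN key, VALUEIN value);
    void preProcess(SketchContext ctx);
    void run(SketchContext ctx) throws Exception;
}

// Toy task: raw records are (line number, CSV text); parsed records are double[].
class CsvSumTask implements SketchTaskInterface<Long, String, double[]> {
    private final List<String> rawLines;                 // imitates raw data on DFS
    final List<double[]> dataBlock = new ArrayList<>();  // imitates a DataBlock
    double total = 0.0;

    CsvSumTask(List<String> rawLines) { this.rawLines = rawLines; }

    // Convert one raw record (VALUEIN) into the pre-processed type (VALUEOUT).
    @Override
    public double[] parse(Long key, String value) {
        String[] parts = value.trim().split(",");
        double[] out = new double[parts.length];
        for (int i = 0; i < parts.length; i++) {
            out[i] = Double.parseDouble(parts[i].trim());
        }
        return out;
    }

    // Read every raw record and store its parsed form locally.
    @Override
    public void preProcess(SketchContext ctx) {
        long lineNumber = 0;
        for (String line : rawLines) {
            dataBlock.add(parse(lineNumber, line));
            lineNumber++;
        }
    }

    // The "computing" step: here it just sums all parsed values once.
    @Override
    public void run(SketchContext ctx) {
        for (double[] row : dataBlock) {
            for (double v : row) {
                total += v;
            }
        }
        ctx.incIteration();
    }
}
```

In this sketch the caller invokes `preProcess` first and `run` second, mirroring the two-step process described earlier.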
parse
VALUEOUT parse(KEYIN key, VALUEIN value)
preProcess
void preProcess(TaskContext taskContext)
run
void run(TaskContext taskContext) throws AngelException
In order to further simplify the application's programming interface, Angel defines two subclasses of BaseTask, TrainTask and PredictTask, whose VALUEOUT are both LabeledData, used under the train and predict modes, respectively. TrainTask and PredictTask can both be extended to address specific requirements of the application.
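One way such mode-specific subclasses could be layered on a common base is sketched here. This is an assumption-laden illustration, not Angel's implementation: every class name is hypothetical, `SketchLabeledData` merely imitates LabeledData, and the idea that `run` simply delegates to `train` or `predict` is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for LabeledData: a feature vector plus a label.
class SketchLabeledData {
    final double[] x;
    final double y;
    SketchLabeledData(double[] x, double y) { this.x = x; this.y = y; }
}

// Hypothetical base: holds parsed records and leaves run() to subclasses.
abstract class SketchBaseTask {
    protected final List<SketchLabeledData> data = new ArrayList<>();
    abstract void run();
}

// Train mode (assumption): run() delegates to train().
abstract class SketchTrainTask extends SketchBaseTask {
    @Override final void run() { train(); }
    abstract void train();
}

// Predict mode (assumption): run() delegates to predict().
abstract class SketchPredictTask extends SketchBaseTask {
    @Override final void run() { predict(); }
    abstract void predict();
}

// Example application task: "trains" by averaging the labels.
class MeanLabelTrainTask extends SketchTrainTask {
    double meanLabel = Double.NaN;

    @Override
    void train() {
        double sum = 0.0;
        for (SketchLabeledData d : data) {
            sum += d.y;
        }
        meanLabel = data.isEmpty() ? Double.NaN : sum / data.size();
    }
}
```

With this layering, an application only overrides `train` or `predict`, and the framework-facing `run` entry point stays the same in both modes.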
train
void train(TaskContext taskContext)
predict
def predict(taskContext: TaskContext)
The application can get task config and task execution information through TaskContext. In addition, intermediate metrics can be saved in TaskContext and are then visible in the application UI.
getReader
<K, V> Reader<K, V> getReader()
getConf
Configuration getConf()
getTotalTaskNum
int getTotalTaskNum()
getIteration
int getIteration()
incIteration
void incIteration()
getMatrixClock
int getMatrixClock(int matrixId)
globalSync
void globalSync()
setCounter
void setCounter(String counterName, int updateValue)
updateCounter
void updateCounter(String counterName, int updateValue)
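As a rough sketch of how a training loop might use the iteration and counter methods, the following example drives an epoch loop from a small in-memory stand-in. `MiniTaskContext` and `EpochLoopSketch` are hypothetical, `getCounter` is a helper that exists only in this sketch, and the semantics shown for setCounter (overwrite) and updateCounter (add to the current value) are assumptions rather than documented behavior.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in exposing a few of the methods listed above.
class MiniTaskContext {
    private int iteration = 0;
    private final Map<String, Integer> counters = new HashMap<>();

    int getIteration() { return iteration; }

    void incIteration() { iteration++; }

    // Assumption: setCounter overwrites the stored value.
    void setCounter(String counterName, int updateValue) {
        counters.put(counterName, updateValue);
    }

    // Assumption: updateCounter adds updateValue to the stored value.
    void updateCounter(String counterName, int updateValue) {
        counters.merge(counterName, updateValue, Integer::sum);
    }

    // Helper for this sketch only; not part of the listed interface.
    int getCounter(String counterName) {
        return counters.getOrDefault(counterName, 0);
    }
}

class EpochLoopSketch {
    // Run epochs until the context's iteration counter reaches maxEpochs.
    static void train(MiniTaskContext ctx, int maxEpochs, int samplesPerEpoch) {
        ctx.setCounter("samples.seen", 0);
        while (ctx.getIteration() < maxEpochs) {
            // One pass over the local training data would go here.
            ctx.updateCounter("samples.seen", samplesPerEpoch);
            ctx.incIteration();
        }
    }
}
```

Keeping the loop condition on `getIteration()` means the context, not a local variable, is the single record of training progress.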