docs/assistant/ModelMergeAndConverter_en.md
Both ModelMergeAndConverter and ModelConverter convert model files from binary format to text format. The difference is that ModelConverter produces multiple model files, whereas ModelMergeAndConverter first merges the model partitions and then writes the result to a single file. Because ModelMergeAndConverter has to assemble the whole model in memory, and a single file cannot be written concurrently, it generally takes longer in both conversion and I/O. We therefore recommend using ModelMergeAndConverter only when the model is not extremely large.
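The merge-then-write behaviour described above can be pictured with a toy sketch. This is not Angel's implementation: the `Partition` shape, `merge_partitions`, and `to_text` are hypothetical names chosen for illustration, and the real binary partition layout is not modelled. It only shows why the whole model has to sit in memory before the single output file is written in one sequential pass.

```python
from typing import Dict, List, Tuple

# Hypothetical partition shape (an assumption, not Angel's format):
# (row index, {column index: value}) for one slice of a row.
Partition = Tuple[int, Dict[int, float]]


def merge_partitions(partitions: List[Partition]) -> Dict[int, Dict[int, float]]:
    """Collect every partition into one in-memory {row: {col: value}} mapping."""
    merged: Dict[int, Dict[int, float]] = {}
    for row, cols in partitions:
        merged.setdefault(row, {}).update(cols)
    return merged


def to_text(merged: Dict[int, Dict[int, float]]) -> str:
    """Serialize the merged model row by row, as a single sequential write would."""
    out = []
    for row in sorted(merged):
        out.append(f"rowIndex={row}")
        for col in sorted(merged[row]):
            out.append(f"{col}:{merged[row][col]!r}")
    return "\n".join(out) + "\n"


# Three partitions, two of which belong to the same row.
parts = [(0, {3: -0.5, 6: 0.25}), (0, {0: 1.0}), (1, {2: 2.0})]
text = to_text(merge_partitions(parts))
```

A per-partition converter could write each element of `parts` to its own file in parallel; the merged variant cannot start writing until every partition has been collected.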
The default design of Angel's model files is as follows:
Angel currently supports two modes to start up a conversion tool: the client mode and the Angel task mode.
Client mode runs the conversion directly on the machine that executes the script (usually the client machine that submits tasks). Angel task mode, on the other hand, launches an Angel Yarn job and runs the conversion program inside Angel's Worker, which serves as the job's container. Client mode is generally recommended because it is simpler to use. However, the conversion consumes considerable CPU and network I/O (especially when the model is extremely large), so Angel task mode is the better choice when client resources are limited.
The commands to run ModelMergeAndConverter in each mode are as follows.
Client mode:
./bin/angel-model-mergeconvert \
--angel.load.model.path ${anywhere} \
--angel.save.model.path ${anywhere} \
--angel.modelconverts.model.names ${models} \
--angel.modelconverts.serde.class ${SerdeClass}
Angel task mode:
./bin/angel-submit \
--angel.app.submit.class com.tencent.angel.ml.toolkits.modelconverter.ModelMergeAndConverterRunner \
--angel.load.model.path ${anywhere} \
--angel.save.model.path ${anywhere} \
--angel.modelconverts.model.names ${models} \
--angel.modelconverts.serde.class ${SerdeClass}
In Angel task mode, angel.app.submit.class is set to com.tencent.angel.ml.toolkits.modelconverter.ModelMergeAndConverterRunner.
Converted models are stored in rows.
As the example below shows, the head of the file holds the model's basic information (name, dimension, etc.). Each row begins with a "rowIndex=xxx" line that identifies the row, followed by a series of key:value pairs, where the key is the column index within that row and the value is the corresponding element.
rowIndex=0
0:-0.004235138405748639
1:-0.003367253227582031
3:-0.003988846053264014
6:0.001803243020660425
8:1.9413353447408782E-4
...
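The row layout above is simple enough to read back with a few lines of code. The following is a minimal sketch in Python, not a tool shipped with Angel: `parse_converted_model` is a hypothetical helper name, and because the exact header layout is not shown here, the sketch assumes that everything before the first `rowIndex=` line can be skipped and that any token that is not a valid `column:value` pair can be ignored.

```python
from typing import Dict, Iterable, Optional


def parse_converted_model(lines: Iterable[str]) -> Dict[int, Dict[int, float]]:
    """Parse 'rowIndex=N' blocks of 'column:value' pairs into {row: {column: value}}."""
    rows: Dict[int, Dict[int, float]] = {}
    current: Optional[int] = None  # no row seen yet, so header lines are skipped
    for raw in lines:
        line = raw.strip()
        if line.startswith("rowIndex="):
            current = int(line.split("=", 1)[1])
            rows.setdefault(current, {})
            continue
        if current is None:
            continue  # still in the header (assumed to precede the first row)
        # Accept one pair per line or several whitespace-separated pairs.
        for token in line.split():
            key, sep, value = token.partition(":")
            if not sep:
                continue  # not a column:value pair
            try:
                rows[current][int(key)] = float(value)
            except ValueError:
                continue  # unparseable token, ignored in this sketch
    return rows


sample = """rowIndex=0
0:-0.004235138405748639
1:-0.003367253227582031
3:-0.003988846053264014
6:0.001803243020660425
8:1.9413353447408782E-4
"""
model = parse_converted_model(sample.splitlines())
```

Each row is kept as a sparse dictionary because the example lists only some column indices (0, 1, 3, 6, 8); columns that do not appear are simply absent from the mapping. For a real file, the same function can be fed an open file handle, since it only iterates over lines.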