Swing is a similarity measure for "user-item" bipartite graphs. Taking a purchase graph as an example: the fewer items two users have purchased in common, the more similar the items they did both purchase are considered to be. In the Swing formula, "Ui" denotes the set of users who purchased item i, "Iu" denotes the set of items that user u purchased, and gamma, which takes values in [-1, 0), is the penalty applied to large item sets.
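To make the idea concrete, here is a minimal single-machine sketch of a Swing-style score in Python. It is not the Angel implementation: the exact score form, the smoothing constant `alpha`, the default `gamma=-0.5`, and the function name `swing_scores` are all assumptions chosen for illustration. The sketch assumes that each pair of users who both purchased items i and j contributes `|Iu|^gamma * |Iv|^gamma / (alpha + |Iu ∩ Iv|)` to the similarity of i and j.

```python
from collections import defaultdict
from itertools import combinations


def swing_scores(edges, alpha=1.0, gamma=-0.5):
    """Toy Swing-style similarity over (user, item) edges.

    Assumed score form (an illustration, not taken from the Angel source):
      sim(i, j) = sum over user pairs {u, v} in Ui ∩ Uj of
                  |Iu|^gamma * |Iv|^gamma / (alpha + |Iu ∩ Iv|)
    """
    items_of = defaultdict(set)  # Iu: items purchased by user u
    users_of = defaultdict(set)  # Ui: users who purchased item i
    for user, item in edges:
        items_of[user].add(item)
        users_of[item].add(user)

    scores = {}
    for i, j in combinations(sorted(users_of), 2):
        common_users = sorted(users_of[i] & users_of[j])
        total = 0.0
        for u, v in combinations(common_users, 2):
            # The more items u and v share overall, the smaller the contribution.
            overlap = len(items_of[u] & items_of[v])
            # A negative gamma down-weights users with large item sets.
            weight = (len(items_of[u]) ** gamma) * (len(items_of[v]) ** gamma)
            total += weight / (alpha + overlap)
        if total > 0:
            scores[(i, j)] = total
    return scores


# Hypothetical toy purchase graph: (userId, itemId) pairs.
edges = [
    ("u1", "a"), ("u1", "b"),
    ("u2", "a"), ("u2", "b"),
    ("u3", "a"), ("u3", "b"), ("u3", "c"), ("u3", "d"),
    ("u4", "c"), ("u4", "d"),
]
for (i, j), s in sorted(swing_scores(edges).items()):
    print(i, j, round(s, 4))
```

Each printed line has the shape `itemId itemId score`. Item pairs with fewer than two common purchasers get no score in this sketch, because the sum runs over pairs of distinct users.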
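- input: input path; each line is one edge of the form `userId | itemId`
- output: output path; each line has the form `itemId itemId score`
- useBalancePartition: `true / false`; `true` is suggested when the distribution of graph vertices is unbalanced
- storageLevel: `DISK_ONLY/MEMORY_ONLY/MEMORY_AND_DISK`

input=hdfs://my-hdfs/data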
output=hdfs://my-hdfs/output
source ./spark-on-angel-env.sh
$SPARK_HOME/bin/spark-submit \
--master yarn-cluster \
--conf spark.ps.instances=1 \
--conf spark.ps.cores=1 \
--conf spark.ps.jars=$SONA_ANGEL_JARS \
--conf spark.ps.memory=10g \
--name "swing angel" \
--jars $SONA_SPARK_JARS \
--driver-memory 5g \
--num-executors 1 \
--executor-cores 4 \
--executor-memory 10g \
--class org.apache.spark.angel.examples.graph.SwingExample \
../lib/spark-on-angel-examples-3.3.0.jar \
input:$input output:$output sep:tab storageLevel:MEMORY_ONLY useBalancePartition:true \
partitionNum:4 psPartitionNum:1