docs/algo/sona/commonfriends_sona_en.md
The common friends algorithm counts the number of friends shared by two users. As a common graph feature, this count is often used to describe the relationship between users, and is widely used in friend recommendation, community detection, acquaintance/stranger analysis, and other scenarios.
The algorithm can be used in three scenarios:
1. Input the full set of relationship chains (edges) and calculate the number of common friends for each existing link. The result can be used to describe how close a relationship is.
2. Input the full set of relationship chains together with an edge table of the pairs to be evaluated, and calculate the number of common friends for those specified connections. The result can be used for link prediction or inference.
3. Incremental computation: suitable when the edges of the graph grow over time and all edges need to be recalculated at set intervals.
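The first two scenarios can be illustrated with a small single-machine sketch. It is not the Angel implementation: the function names and the toy edge list are hypothetical, and the graph is assumed to be undirected, which the text does not state.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build {vertex: set(neighbors)} from (src, dst) pairs.

    Assumption: edges are undirected, so each edge is stored in both directions.
    """
    adj = defaultdict(set)
    for src, dst in edges:
        adj[src].add(dst)
        adj[dst].add(src)
    return adj


def common_friend_counts(adj, pairs):
    """Return {(u, v): number of shared neighbors} for each requested pair."""
    return {
        (u, v): len(adj.get(u, set()) & adj.get(v, set()))
        for u, v in pairs
    }


# Hypothetical toy graph used only for illustration.
edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]
adj = build_adjacency(edges)

# Scenario 1: score every existing link.
existing = common_friend_counts(adj, edges)

# Scenario 2: score a specified pair that is not yet connected.
candidates = common_friend_counts(adj, [(1, 4)])
```

In this toy graph, users 1 and 4 are not connected but share friends 2 and 3, which is the kind of signal scenario 2 would feed into link prediction.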
In the implementation, the adjacency table of the vertices is stored across multiple Parameter Servers, while the common friends calculation runs on the workers. For each pair of vertices, a worker pulls both adjacency lists from the Parameter Servers and computes their intersection; the size of the intersection is the number of common friends.
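The split between storage and computation described above can be sketched in plain Python. This is only an in-process simulation under stated assumptions: `AdjacencyServer`, `server_for`, and the modulo partitioning rule are hypothetical, and pulling in batches is an assumption suggested by the `pullBatchSize` parameter of the submit command, not a description of Angel's actual API.

```python
class AdjacencyServer:
    """Hypothetical stand-in for one Parameter Server holding part of the adjacency table."""

    def __init__(self):
        self.table = {}

    def push(self, vertex, neighbors):
        self.table[vertex] = frozenset(neighbors)

    def pull(self, vertices):
        # Unknown vertices are returned with an empty neighbor set.
        return {v: self.table.get(v, frozenset()) for v in vertices}


def server_for(vertex, num_servers):
    # Assumed partitioning rule: vertex id modulo the number of servers.
    return vertex % num_servers


def push_adjacency(servers, adj):
    """Distribute each vertex's neighbor list to the server that owns it."""
    for vertex, neighbors in adj.items():
        servers[server_for(vertex, len(servers))].push(vertex, neighbors)


def pull_adjacency(servers, vertices):
    """Group vertices by owning server and issue one pull per server."""
    by_server = {}
    for v in vertices:
        by_server.setdefault(server_for(v, len(servers)), set()).add(v)
    pulled = {}
    for idx, owned in by_server.items():
        pulled.update(servers[idx].pull(owned))
    return pulled


def worker_count(servers, pairs, pull_batch_size=2):
    """Worker-side logic: pull adjacency lists in batches, then intersect them."""
    results = {}
    for start in range(0, len(pairs), pull_batch_size):
        batch = pairs[start:start + pull_batch_size]
        needed = {v for pair in batch for v in pair}
        lists = pull_adjacency(servers, needed)
        for u, v in batch:
            results[(u, v)] = len(lists[u] & lists[v])
    return results


# Hypothetical adjacency table spread over two simulated servers.
adj = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2, 4}, 4: {2, 3}}
servers = [AdjacencyServer() for _ in range(2)]
push_adjacency(servers, adj)
counts = worker_count(servers, [(1, 2), (2, 3), (1, 4)])
```

Grouping the pairs into batches means each vertex's adjacency list is fetched once per batch rather than once per pair, which is the usual motivation for a batched pull.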
Input format: srcId | dstId
storageLevel options: DISK_ONLY/MEMORY_ONLY/MEMORY_AND_DISK
input=hdfs://my-hdfs/data
extraInput=hdfs://my-hdfs/data
output=hdfs://my-hdfs/output
source ./spark-on-angel-env.sh
$SPARK_HOME/bin/spark-submit \
--master yarn-cluster \
--conf spark.ps.instances=1 \
--conf spark.ps.cores=1 \
--conf spark.ps.jars=$SONA_ANGEL_JARS \
--conf spark.ps.memory=10g \
--name "commonfriends angel" \
--jars $SONA_SPARK_JARS \
--driver-memory 5g \
--num-executors 1 \
--executor-cores 4 \
--executor-memory 10g \
--class org.apache.spark.angel.examples.cluster.CommonFriendsExample \
../lib/spark-on-angel-examples-3.3.0.jar \
input:$input extraInput:$extraInput output:$output sep:tab storageLevel:MEMORY_ONLY useBalancePartition:true \
partitionNum:4 psPartitionNum:1 batchSize:3000 pullBatchSize:1000 src:1 dst:2 mode:yarn-cluster