GraphFrames PySpark Example : Learn Data Science
This post walks through a GraphFrames example in PySpark, ending with a shortest-path problem. GraphFrames is a Spark package that provides DataFrame-based graphs in Spark. All examples assume Spark 1.6.2. To include the package when starting the PySpark shell:
pyspark --packages graphframes:graphframes:0.1.0-spark1.6
Code:
from pyspark import SparkContext
Output:
id | inDegree |
---|---|
b | 2 |
c | 1 |
Example: counting the “follow” relationships in the graph
g.edges.filter("relationship = 'follow'").count()
Output:
2
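To make the two results above easier to check by hand, here is a minimal plain-Python sketch of the same computations. It does not use Spark or GraphFrames. The edge list `EDGES` is an assumption: the graph definition is not shown in this post, so the edges below were simply chosen to reproduce the in-degree table and the “follow” count shown above.

```python
from collections import Counter

# Assumed edge list as (src, dst, relationship) tuples.
# Hypothetical data: picked only so the results match the outputs in this post.
EDGES = [
    ("a", "b", "friend"),
    ("b", "c", "follow"),
    ("c", "b", "follow"),
]

# In-degree of a vertex = number of edges whose destination is that vertex.
# Vertices with no incoming edges (here "a") do not appear in the result.
in_degrees = Counter(dst for _src, dst, _rel in EDGES)

# Count the edges whose relationship is "follow".
follow_count = sum(1 for _src, _dst, rel in EDGES if rel == "follow")

print(dict(in_degrees))
print(follow_count)
```

A vertex without incoming edges is missing from the counter, which is consistent with the in-degree table above listing only b and c.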
Example: getting the shortest paths from each vertex to “a”
results = g.shortestPaths(landmarks=["a"])
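For intuition about what a landmark-based shortest-path query computes, here is a rough plain-Python sketch: a breadth-first search that walks edges backwards from the landmark and records the hop count for every vertex that can reach it. This is not the GraphFrames implementation. The helper name `hops_to_landmark` and the edge list `EDGES` are hypothetical, and the edges are an assumed graph since the post does not show the original one.

```python
from collections import deque

# Assumed edge list as (src, dst, relationship) tuples (hypothetical data).
EDGES = [
    ("a", "b", "friend"),
    ("b", "c", "follow"),
    ("c", "b", "follow"),
]


def hops_to_landmark(edges, landmark):
    """Return {vertex: hop count} for every vertex that can reach `landmark`
    by following edges in their src -> dst direction."""
    # Index edges by destination so the search can walk them backwards.
    incoming = {}
    for src, dst, _rel in edges:
        incoming.setdefault(dst, []).append(src)

    distances = {landmark: 0}
    queue = deque([landmark])
    while queue:
        current = queue.popleft()
        for src in incoming.get(current, []):
            if src not in distances:  # first visit is the shortest in BFS
                distances[src] = distances[current] + 1
                queue.append(src)
    return distances


to_a = hops_to_landmark(EDGES, "a")
to_b = hops_to_landmark(EDGES, "b")
print(to_a)
print(to_b)
```

With this assumed graph, no edge points to “a”, so only “a” itself has a distance to the landmark “a” (zero hops). Using “b” as the landmark gives a more interesting result, since both “a” and “c” reach it in one hop.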
Feel free to ask your questions in the comments section!