Posted 2017-11-20 | Updated 2023-08-22 | Big Data / Data Mining / Machine Learning / Apache Spark | a minute read (About 134 words)

Logistic Regression with Spark: Learn Data Science

Logistic regression in Spark is available through MLlib. A binary logistic regression model returns a class label of either “0” or “1”. In this example, we consider a data set with a single feature, “study hours”, and a class label that records whether the student passed (1) or did not pass (0).

from pyspark import SparkContext
import numpy as np
from numpy import array
from pyspark.mllib.regression import LabeledPoint
from pyspark.mllib.classification import LogisticRegressionWithLBFGS

sc = SparkContext()

def createLabeledPoints(label, points):
    return LabeledPoint(label, points)

studyHours = [
 [ 0, [0.5]],
 [ 0, [0.75]],
 [ 0, [1.0]],
 [ 0, [1.25]],
 [ 0, [1.5]],
 [ 0, [1.75]],
 [ 1, [1.75]],
 [ 0, [2.0]],
 [ 1, [2.25]],
 [ 0, [2.5]],
 [ 1, [2.75]],
 [ 0, [3.0]],
 [ 1, [3.25]],
 [ 0, [3.5]],
 [ 1, [4.0]],
 [ 1, [4.25]],
 [ 1, [4.5]],
 [ 1, [4.75]],
 [ 1, [5.0]],
 [ 1, [5.5]]
]

data = []

# Convert each [label, features] pair into a LabeledPoint
for label, features in studyHours:
    data.append(createLabeledPoints(label, features))

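# Train on the labeled points (no intercept is fitted unless intercept=True is passed)
model = LogisticRegressionWithLBFGS.train(sc.parallelize(data))

# Show the learned weights and intercept
print(model)

# Predict the class label for a student who studied 1 hour
print(model.predict([1]))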

Output:

spark-submit regression-mllib.py
(weights=[0.215546777333], intercept=0.0)
1
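
As a sanity check on the output above, the prediction can be reproduced by hand. The sketch below assumes the standard logistic (sigmoid) function and MLlib's default decision threshold of 0.5 for binary models. The names `sigmoid` and `predict_label` are illustrative helpers, not part of the Spark API, and the weight and intercept are copied from the printed model.

```python
import math

# Values copied from the printed model above
WEIGHT = 0.215546777333
INTERCEPT = 0.0
THRESHOLD = 0.5  # assumption: MLlib's default threshold for binary models


def sigmoid(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_label(hours):
    """Hypothetical helper: return (probability of passing, thresholded label)."""
    probability = sigmoid(WEIGHT * hours + INTERCEPT)
    label = 1 if probability > THRESHOLD else 0
    return probability, label


# Same input as model.predict([1]) in the script
probability, label = predict_label(1.0)
print(probability, label)
```

The probability for one hour of study comes out slightly above 0.5, which is why the script prints 1. Because the intercept is 0.0 and the weight is positive, any positive number of study hours lands above the threshold, so this model labels every student as passing. `LogisticRegressionWithLBFGS.train` accepts an `intercept=True` argument that fits a bias term, which may give a more useful decision boundary for this data.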
