43. Introducing MLlib

Movie recommendations using Spark's machine learning library

MLlib Capabilities

  • Feature extraction

    • Term Frequency / Inverse Document Frequency (TF-IDF), useful for search

  • Basic statistics

    • Chi-squared test, Pearson or Spearman correlation, min, max, mean, variance

  • Linear regression, logistic regression

  • Support Vector Machines

  • Naive Bayes classifier

  • Decision trees

  • K-Means clustering

  • Principal component analysis, singular value decomposition

  • Recommendations using Alternating Least Squares
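To make the first item on this list a bit more concrete: TF-IDF weights a term by how often it appears in a document, discounted by how many documents contain it. The sketch below is plain Python, not MLlib's API. The toy documents and the function name `tf_idf` are made up for illustration, and the smoothed formula log((N + 1) / (df + 1)) is the one I understand MLlib's IDF to use, so treat that as an assumption.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per document (plain-Python sketch)."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        term_freq = Counter(doc)  # raw count of each term in this document
        weights.append({
            # Assumed smoothing: log((N + 1) / (df + 1)).
            term: count * math.log((n_docs + 1) / (doc_freq[term] + 1))
            for term, count in term_freq.items()
        })
    return weights

# Hypothetical toy corpus, already split into words.
docs = [
    ["spark", "is", "fast"],
    ["spark", "runs", "mllib"],
    ["mllib", "is", "a", "library"],
]
scores = tf_idf(docs)
```

In the first document, "fast" gets a higher weight than "spark" because "spark" also shows up in another document. That is why the measure is handy for search: rare terms say more about what a document is about.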

Special MLlib Data Types

  • Vector (dense or sparse)

  • LabeledPoint

  • Rating
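In PySpark these types live in `pyspark.mllib.linalg` (`Vectors.dense`, `Vectors.sparse`), `pyspark.mllib.regression` (`LabeledPoint`), and `pyspark.mllib.recommendation` (`Rating`). The sketch below does not import Spark at all. It uses plain-Python stand-ins to show the shape of the data each type carries, and the helper names `to_sparse` and `to_dense` are hypothetical.

```python
from collections import namedtuple

# Stand-ins that mirror the shape of MLlib's types; these are NOT the real classes.
LabeledPoint = namedtuple("LabeledPoint", ["label", "features"])
Rating = namedtuple("Rating", ["user", "product", "rating"])

def to_sparse(dense):
    """Sparse form: (size, indices of the non-zero entries, their values)."""
    indices = [i for i, v in enumerate(dense) if v != 0.0]
    values = [dense[i] for i in indices]
    return len(dense), indices, values

def to_dense(size, indices, values):
    """Rebuild the full list, filling the missing positions with zeros."""
    dense = [0.0] * size
    for i, v in zip(indices, values):
        dense[i] = v
    return dense

dense = [1.0, 0.0, 0.0, 3.0]
sparse = to_sparse(dense)  # (4, [0, 3], [1.0, 3.0])

# A labeled point pairs a label (the answer) with a feature vector.
point = LabeledPoint(label=1.0, features=dense)

# A rating: user 1 gave movie 50 a score of 4.0 (hypothetical values).
rating = Rating(user=1, product=50, rating=4.0)
```

The sparse form only pays off when most entries are zero, which is common for things like word counts or user-by-movie data.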

For More Depth

  • I really like "Advanced Analytics with Spark" from O'Reilly

Let's Make Some Movie Recommendations

  • IT'S JUST THAT EASY ... Let's run it.
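Before running it, it may help to see the idea behind Alternating Least Squares in miniature. The sketch below is plain Python, not MLlib's implementation: it uses a single factor per user and per movie (rank 1), whereas MLlib's ALS works with a configurable rank and runs on a cluster. The function names, the toy ratings, and the parameter values are all assumptions made for illustration.

```python
def als_rank1(ratings, n_iters=20, reg=0.1):
    """Rank-1 ALS: approximate each rating as user_factor * movie_factor."""
    users = {u for u, _, _ in ratings}
    movies = {m for _, m, _ in ratings}
    user_f = {u: 1.0 for u in users}
    movie_f = {m: 1.0 for m in movies}
    for _ in range(n_iters):
        # Hold movie factors fixed; solve regularized least squares per user.
        for u in users:
            num = sum(r * movie_f[m] for uu, m, r in ratings if uu == u)
            den = reg + sum(movie_f[m] ** 2 for uu, m, _ in ratings if uu == u)
            user_f[u] = num / den
        # Hold user factors fixed; solve regularized least squares per movie.
        for m in movies:
            num = sum(r * user_f[u] for u, mm, r in ratings if mm == m)
            den = reg + sum(user_f[u] ** 2 for u, mm, _ in ratings if mm == m)
            movie_f[m] = num / den
    return user_f, movie_f

def predict(user_f, movie_f, user, movie):
    """Predicted rating is the product of the two learned factors."""
    return user_f[user] * movie_f[movie]

# Hypothetical (user, movie, rating) triples.
ratings = [
    (1, 10, 4.0), (1, 20, 2.0),
    (2, 10, 2.0), (2, 20, 1.0), (2, 30, 1.5),
    (3, 10, 4.0), (3, 30, 3.0),
]
user_f, movie_f = als_rank1(ratings)

# User 1 never rated movie 30; the toy data was built so this lands near 3.
guess = predict(user_f, movie_f, 1, 30)
```

The "alternating" part is the two inner loops: with one side frozen, each update on the other side is an ordinary least-squares problem with a closed-form answer.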
