Today I gave a tutorial on MLlib in PySpark. I'm posting the notebook here for anyone who might be interested =)
MLlib is a package of Spark (available also in PySpark).
MLlib is just a package of Spark, so there is no need for any extra installation (once you have Spark up and running). MLlib contains several (sub-)packages that are useful for machine learning on big data.
In this lab we will look at a few things from Statistics, Regression, Classification, and Clustering. The documentation often comes with examples, so I encourage you to take a look: MLlib on PySpark
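If you want to check that the four sub-packages used in this lab really are importable in your setup, here is a minimal sketch. The module paths are those of the RDD-based `pyspark.mllib` API; the names `MLLIB_SUBPACKAGES` and `check_mllib` are made up for this illustration and are not part of PySpark.

```python
import importlib

# Hypothetical lookup table (not part of PySpark): lab topic -> pyspark.mllib module.
MLLIB_SUBPACKAGES = {
    "Statistics": "pyspark.mllib.stat",
    "Regression": "pyspark.mllib.regression",
    "Classification": "pyspark.mllib.classification",
    "Clustering": "pyspark.mllib.clustering",
}


def check_mllib(subpackages=MLLIB_SUBPACKAGES):
    """Return {topic: True/False} depending on whether each module imports."""
    available = {}
    for topic, module_name in subpackages.items():
        try:
            importlib.import_module(module_name)
            available[topic] = True
        except ImportError:
            # PySpark itself, or a dependency such as NumPy, is missing.
            available[topic] = False
    return available


# Print one line per topic so a missing piece is easy to spot.
for topic, ok in check_mllib().items():
    print(f"{topic:<15} {'ok' if ok else 'not importable'}")
```

If every line says `ok`, there should be nothing else to install; this only tests the imports, it does not start Spark.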