Comprehensive Data Mining and Machine Learning course with Python and Spark

June 24, 2017, 1:38 p.m. By: Vishakha Jha

Data Mining and Machine Learning book

The book introduces you to data science by teaching the tools and techniques of data analysis. It shows how to build efficient machine learning models in Python using supervised and unsupervised learning methods, and it covers data mining and large-scale machine learning with Apache Spark. The book is written by Frank Kane, founder of Sundog Software, which focuses on virtual reality environment technology and on teaching others about big data analysis. It is aimed at data analysts, software developers, and programmers who want to delve into data science concepts.

Data science combines data inference, algorithm development, and technology to solve complex, critical problems analytically, whereas machine learning focuses on building systems that learn from data, experience, and interaction.

The book covers machine learning and data mining techniques including regression analysis, K-Means clustering, decision trees, and random forests. It also covers principal component analysis, testing and cross-validation, and multivariate regression, along with reinforcement learning, ensemble learning, experimental design, and other important concepts. An entire section is devoted to machine learning with Apache Spark, which lets you apply these techniques to big data analysis on a computing cluster.

The book teaches you to clean data and prepare it for analysis, and to implement clustering and regression methods in Python. It also shows how to build efficient machine learning models using decision trees and random forests, and how to assess the results of your analysis with Python's Matplotlib library. It explains how to use Apache Spark's MLlib package to perform machine learning on large datasets. The book requires basic programming knowledge of Python and teaches the basic techniques used by data scientists in industry. It is also available in video form.
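
To give a rough idea of what "clean the data, fit a regression, assess the result" can look like, here is a minimal sketch using only the Python standard library. It is not taken from the book, which works with Python's data science libraries; the function names (`clean`, `fit_line`, `r_squared`) and the toy data are made up for illustration.

```python
import math

def clean(rows):
    """Keep only (x, y) pairs where both values are present and finite."""
    cleaned = []
    for x, y in rows:
        if x is None or y is None:
            continue  # drop rows with missing values
        x, y = float(x), float(y)
        if math.isfinite(x) and math.isfinite(y):
            cleaned.append((x, y))  # drop NaN and infinite values
    return cleaned

def fit_line(points):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def r_squared(points, slope, intercept):
    """Share of the variance in y explained by the fitted line."""
    mean_y = sum(y for _, y in points) / len(points)
    ss_tot = sum((y - mean_y) ** 2 for _, y in points)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in points)
    return 1 - ss_res / ss_tot

# Made-up toy data containing a missing value and a NaN.
raw = [(1, 3.1), (2, 4.9), (None, 7.0), (3, 7.2), (4, float("nan")), (5, 10.8)]

points = clean(raw)
slope, intercept = fit_line(points)
score = r_squared(points, slope, intercept)
print(f"kept {len(points)} of {len(raw)} rows")
print(f"slope={slope:.3f} intercept={intercept:.3f} r2={score:.3f}")
```

An R-squared value close to 1 suggests the line fits the cleaned data well. In practice you would also plot the points and the fitted line, for example with Matplotlib, to check the fit visually.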

More information: Packtpub & Udemy