Cluster Analysis and Unsupervised Machine Learning in Python

Data science techniques for pattern recognition, data mining, k-means clustering, hierarchical clustering, and kernel density estimation (KDE).

What you’ll learn

Understand the regular K-Means algorithm
Understand and enumerate the disadvantages of K-Means Clustering
Understand the soft or fuzzy K-Means Clustering algorithm
Implement Soft K-Means Clustering in code
Understand Hierarchical Clustering
Explain algorithmically how Hierarchical Agglomerative Clustering works
Apply SciPy’s Hierarchical Clustering library to data
Understand how to read a dendrogram
Understand the different distance metrics used in clustering
Understand the difference between single linkage, complete linkage, Ward linkage, and UPGMA
Understand the Gaussian mixture model and how to use it for density estimation
Write a GMM in Python code
Explain when GMM is equivalent to K-Means Clustering
Explain the expectation-maximization algorithm
Understand how GMM overcomes some disadvantages of K-Means
Understand the Singular Covariance problem and how to fix it
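
To give a flavor of the soft K-Means item above, here is a minimal sketch in NumPy. It is an illustration under stated assumptions, not the course's own implementation: the function name soft_k_means, the stiffness parameter beta, the iteration count, the seed, and the toy data are all hypothetical choices. Each point gets a responsibility for every cluster (a softmax over negative squared distances), and each center becomes the responsibility-weighted mean of the data.

```python
import numpy as np


def soft_k_means(X, k, beta=1.0, n_iter=50, seed=0):
    """Illustrative soft (fuzzy) K-Means; names and defaults are assumptions."""
    rng = np.random.default_rng(seed)
    # Initialize the centers at k distinct data points.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every center: shape (N, k).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        # Responsibilities: softmax of -beta * d2 along the cluster axis.
        # Subtracting the row maximum keeps the exponentials numerically stable.
        logits = -beta * d2
        logits -= logits.max(axis=1, keepdims=True)
        R = np.exp(logits)
        R /= R.sum(axis=1, keepdims=True)
        # Each center becomes the responsibility-weighted mean of all points.
        centers = (R.T @ X) / R.sum(axis=0)[:, None]
    return centers, R


# Toy data (an assumption for illustration): two well-separated 2-D blobs.
rng = np.random.default_rng(42)
X = np.vstack([
    rng.normal(0.0, 0.5, size=(50, 2)),
    rng.normal(5.0, 0.5, size=(50, 2)),
])
centers, R = soft_k_means(X, k=2)
hard_labels = R.argmax(axis=1)  # collapse soft assignments to hard labels
```

As beta grows, the responsibilities approach 0 or 1 and the procedure behaves like regular K-Means; smaller values of beta give softer, more overlapping assignments.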

Who this course is for

Students and professionals interested in machine learning and data science
People who want an introduction to unsupervised machine learning and cluster analysis
People who want to know how to write their own clustering code
Professionals interested in data mining big data sets to look for patterns automatically