K-means Clustering

1.Introduction to K-means Clustering
K-means clustering is an unsupervised learning method, used when your data has no predefined categories or groups. The algorithm iteratively assigns each data point to exactly one of K groups. It tries to make the data points within a cluster (intra-cluster) as similar as possible while keeping the clusters themselves as far apart as possible.
2.Steps Involved in K-means Clustering
(1). Choose the number of clusters k
(2). Select k random points from the data as centroids
Given a set of observations (x1, x2, …, xn), where each observation is a d-dimensional real vector, k-means clustering aims to partition the n observations into k (≤ n) sets S = {S1, S2, …, Sk} so as to minimize the within-cluster variance, i.e. the sum of squared distances from each point to the mean of its cluster. Formally, the objective is to find:
$$\underset{S}{\arg\min} \sum_{i=1}^{k} \sum_{x \in S_i} \lVert x - \mu_i \rVert^2$$
where μi is the mean of points in Si.
(3). Assign each point to its closest cluster centroid
(4). Recompute the centroids of the newly formed clusters
(5). Repeat steps (3) and (4) until a stopping criterion is met
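To make the objective from step (2) concrete, here is a minimal sketch in plain Python that computes the within-cluster sum of squares for a given assignment of points to clusters. The function name `wcss` and the four sample points are assumptions made up for illustration; they do not come from the original post.

```python
def wcss(points, labels, k):
    """Within-cluster sum of squares for a given assignment (hypothetical helper)."""
    dim = len(points[0])
    total = 0.0
    for i in range(k):
        # S_i: the points currently assigned to cluster i
        members = [p for p, lab in zip(points, labels) if lab == i]
        if not members:
            continue  # an empty cluster contributes nothing
        # mu_i: coordinate-wise mean of the points in S_i
        mu = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        # add the squared Euclidean distance of every member to mu_i
        total += sum(sum((p[d] - mu[d]) ** 2 for d in range(dim)) for p in members)
    return total


# Illustrative data: two tight pairs of points, far apart from each other.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
good = wcss(points, [0, 0, 1, 1], k=2)  # each pair in its own cluster
bad = wcss(points, [0, 1, 0, 1], k=2)   # pairs split across clusters
print(good, bad)  # prints "4.0 100.0"
```

The grouping that keeps nearby points together has a much smaller objective value, and this is the quantity that steps (3) and (4) try to drive down.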
There are essentially three stopping criteria for the K-means algorithm:
a. The centroids of the newly formed clusters do not change
b. The points remain in the same clusters
c. The maximum number of iterations is reached
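The steps and stopping criteria above can be sketched in plain Python as follows. This is only an illustrative sketch, not a reference implementation: the names `kmeans` and `squared_distance`, the `max_iter` and `seed` parameters, the choice to keep the old centroid when a cluster becomes empty, and the sample data are all assumptions introduced here.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100, seed=0):
    """Return (centroids, labels, iterations_used) for a list of tuples."""
    rng = random.Random(seed)
    # Step (2): pick k random data points as the initial centroids.
    centroids = [tuple(p) for p in rng.sample(points, k)]
    labels = None
    for iteration in range(1, max_iter + 1):
        # Step (3): assign each point to its closest centroid.
        new_labels = [
            min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            for p in points
        ]
        # Step (4): recompute each centroid as the mean of its members.
        new_centroids = []
        for i in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == i]
            if members:
                dim = len(members[0])
                new_centroids.append(
                    tuple(sum(p[d] for p in members) / len(members) for d in range(dim))
                )
            else:
                # Assumption: an empty cluster keeps its previous centroid.
                new_centroids.append(centroids[i])
        # Criterion (a): centroids unchanged; criterion (b): assignments unchanged.
        converged = new_centroids == centroids or new_labels == labels
        centroids, labels = new_centroids, new_labels
        if converged:
            return centroids, labels, iteration
    # Criterion (c): the maximum number of iterations was reached.
    return centroids, labels, max_iter


# Illustrative data: two well-separated groups of three points each.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, labels, n_iter = kmeans(points, k=2)
print(centroids, labels, n_iter)
```

The sketch compares centroids for exact equality, which works here because unchanged assignments reproduce identical means. Practical implementations often stop instead when the centroids move by less than a small tolerance.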

Reprinted from blog.csdn.net/hahadelaochao/article/details/105537305