K-medoids Clustering

2014-09-07

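k-medoids clustering, which can also be called K-center-point clustering (K中心点聚类), is a partitioning algorithm. Wikipedia gives a detailed explanation with examples; see k-medoids. The method looks a lot like K-means, but it is not a variant of K-means. Clustering methods such as k-medians, on the other hand, can be regarded as variants of K-means.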

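Compared with K-means, the advantage of k-medoids is that its clustering results are less sensitive to outliers and anomalous values, because each cluster center must be an actual data point rather than a mean. The drawback is a somewhat higher computational cost, since the swap search has to evaluate every medoid/non-medoid pair.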
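To make the outlier point concrete, here is a minimal sketch on made-up one-dimensional data. The helper names `mean` and `medoid` are only illustrative, and absolute difference is assumed as the distance.

```python
def mean(points):
    # Arithmetic mean: the kind of center K-means uses.
    return sum(points) / len(points)

def medoid(points):
    # Medoid: the actual data point with the smallest total distance to all points.
    return min(points, key=lambda c: sum(abs(c - p) for p in points))

points = [1, 2, 3, 4, 100]  # 100 is an outlier

print(mean(points))    # 22.0, dragged toward the outlier
print(medoid(points))  # 3, still inside the bulk of the data
```

The mean ends up somewhere no data point lives, while the medoid stays at a representative point.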

Google Translate renders "medoids" as 中心点 ("center points").

The algorithm given on Wikipedia is as follows:

  1. Initialize: randomly select (without replacement) k of the n data points as the medoids
  2. Associate each data point to the closest medoid. (“closest” here is defined using any valid distance metric, most commonly Euclidean distance, Manhattan distance or Minkowski distance)
  3. For each medoid m {
     For each non-medoid data point o {
         Swap m and o and compute the total cost of the configuration
     }
    }
    
  4. Select the configuration with the lowest cost.
  5. Repeat steps 2 to 4 until there is no change in the medoid.
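The steps above can be sketched roughly in Python as follows. This is only one possible reading, not a reference implementation: it assumes Manhattan distance, folds step 2 into the cost function, and in each pass keeps the single best swap (a steepest-descent interpretation of steps 3 and 4). The names `manhattan`, `total_cost` and `k_medoids`, the `seed` parameter, and the toy data are all made up for illustration.

```python
import random

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def total_cost(points, medoids, dist):
    # Step 2 folded into the cost: every point pays the distance to its closest medoid.
    return sum(min(dist(p, points[m]) for m in medoids) for p in points)

def k_medoids(points, k, dist=manhattan, seed=0):
    rng = random.Random(seed)
    medoids = rng.sample(range(len(points)), k)  # step 1: k distinct point indices
    best = total_cost(points, medoids, dist)
    while True:
        best_swap = None
        # Step 3: try swapping every medoid with every non-medoid point.
        for i in range(k):
            for o in range(len(points)):
                if o in medoids:
                    continue
                candidate = medoids[:i] + [o] + medoids[i + 1:]
                cost = total_cost(points, candidate, dist)
                if cost < best:
                    best, best_swap = cost, candidate
        if best_swap is None:  # step 5: no swap lowers the cost, so stop
            break
        medoids = best_swap    # step 4: keep the cheapest configuration found
    # Final assignment of each point to its closest medoid.
    labels = [min(range(k), key=lambda j: dist(p, points[medoids[j]])) for p in points]
    return medoids, labels, best

points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
medoids, labels, cost = k_medoids(points, k=2)
print(sorted(points[m] for m in medoids), labels, cost)
```

On this toy data the search settles on (1, 1) and (8, 8) as medoids with a total cost of 4. Each pass recomputes the full cost for every candidate swap, which is simple but slow; it is fine for small examples only.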

One more reference: Partitioning Around Medoids (PAM)

(End)