**kmeans(x, k)**

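x - the numeric data to cluster: a matrix, or an object that can be coerced to one (such as a numeric vector or a data frame with all numeric columns),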

k - the number of clusters (in R this is passed as the `centers` argument)

Please refer to the documentation for the other options of the kmeans() function.
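As a minimal sketch of a few of those other options, the call below names the `centers`, `nstart`, and `iter.max` arguments of base R's `kmeans()`. The seed, the data frame name `toy`, and the values 25 and 50 are assumptions chosen for illustration, not recommendations from this post.

```r
# Illustrative only: 'toy', the seed, and the argument values are assumptions.
set.seed(42)                       # make the random starts reproducible
toy <- data.frame(x = sample(1:800, 100), y = sample(1:500, 100))

fit <- kmeans(toy,
              centers  = 3,        # number of clusters (the 'k' above)
              nstart   = 25,       # try 25 random starts and keep the best one
              iter.max = 50)       # upper limit on iterations per start

fit$size                           # number of points in each cluster
```

Because k-means starts from randomly chosen centers, setting a seed and using several starts makes the result easier to reproduce and less dependent on one unlucky start.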

**Usage**

Let's generate sample data to use.

> df=data.frame(x=sample(1:800,100),y=sample(1:500,100))

> head(df)
    x   y
1 485 448
2 292  37
3  46  67
4 582 293
5 218  63
6 580 196

Then, we cluster the 'df' data into 3 groups.

> df.km=kmeans(df,3)
> df.km # shows kmeans function results
K-means clustering with 3 clusters of sizes 39, 35, 26

Cluster means:
         x        y
1 654.8205 210.5385
2 191.0286 117.3143
3 313.1923 389.0769

Clustering vector:
  [1] 3 2 2 1 2 1 3 3 1 1 2 1 1 3 2 2 2 2 2 2 2 1 3 2 2 3 1 3 3 2 3 1 3
 [34] 3 1 3 2 2 3 2 2 3 1 1 1 3 1 1 1 3 1 2 1 3 1 1 1 3 3 3 2 2 1 2 1 2
 [67] 2 2 2 2 2 1 2 1 1 2 1 1 1 3 1 3 3 2 2 1 1 2 1 1 3 1 1 1 1 1 2 3 2
[100] 3

Within cluster sum of squares by cluster:
[1] 977327.4 681782.5 565713.9
 (between_SS / total_SS = 70.7 %)

Available components:
[1] "cluster"      "centers"      "totss"        "withinss"
[5] "tot.withinss" "betweenss"    "size"         "iter"
[9] "ifault"
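The components listed at the end of the output can be read with `$`. The sketch below regenerates comparable data with an assumed seed (so its numbers will differ from the output above) and recomputes the between_SS / total_SS percentage by hand.

```r
set.seed(1)                                   # assumed seed, for repeatability
df <- data.frame(x = sample(1:800, 100), y = sample(1:500, 100))
df.km <- kmeans(df, 3)

df.km$size                                    # points per cluster
df.km$centers                                 # one row of (x, y) means per cluster
table(df.km$cluster)                          # same counts as df.km$size

# Between-cluster sum of squares as a percentage of the total sum of squares
ratio <- 100 * df.km$betweenss / df.km$totss
round(ratio, 1)
```

A higher percentage means more of the spread in the data lies between the clusters rather than inside them.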

**Visualizing in a graph**

Next, we plot the 'df' data, coloring each point by the cluster that df.km assigned to it.

> plot(df[c("x","y")],col=df.km$cluster)
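Passing the cluster labels straight to `col` colors the points with entries 1, 2, and 3 of the current palette. If specific colors are preferred, the labels can be used as an index into a color vector, as in this sketch. The seed, the three color names, and the variable names `cluster.cols` and `point.cols` are assumptions for illustration.

```r
set.seed(1)                                    # assumed seed
df <- data.frame(x = sample(1:800, 100), y = sample(1:500, 100))
df.km <- kmeans(df, 3)

# Hypothetical color choice: one named color per cluster label 1, 2, 3
cluster.cols <- c("steelblue", "tomato", "darkgreen")
point.cols <- cluster.cols[df.km$cluster]      # index by label: one color per row

plot(df[c("x", "y")], col = point.cols, pch = 19)
```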

Finally, we add the center point of each cluster to the graph.


> points(df.km$centers,col=1:3,pch=c(6,7,8),cex=2)

In this post, we have learned how to use the kmeans function to cluster a dataset and visualize the result in a plot.

Thank you for reading!
