asked 169k views
4 votes
K-Means Algorithm

Given a dataset {0,2,4,6,24,26}, initialize the k-means clustering algorithm with 2 cluster centres c1=3 and c2=15.

1. What are the values of c1 and c2 after one iteration of k-means? [5 marks]

2. What are the values of c1 and c2 after the second iteration of k-means? [5 marks]

3. Describe two advantages of K-means clustering over hierarchical clustering. [2 marks]

4. List down three (3) stopping criteria for the K-Means Algorithm. [3 marks]

2 Answers

3 votes

Final answer:

After one iteration of K-Means on the given dataset, c1 = 3 and c2 = 25 (c1 keeps its initial value because the mean of {0,2,4,6} is 3). These values remain the same after the second iteration. K-means is advantageous over hierarchical clustering for being computationally faster on large datasets and for often producing tighter clusters when the clusters are roughly globular.

Step-by-step explanation:

The K-Means algorithm is a clustering method used to partition a set of data into K distinct non-overlapping subgroups, where each data point belongs to the cluster with the nearest mean. Starting with initial cluster centers c1=3 and c2=15, and given the dataset {0,2,4,6,24,26}, we will perform the iterations required.

Iteration 1

First, assign each data point to the nearest center:

Cluster 1: {0,2,4,6}

Cluster 2: {24,26}

Now, calculate the new centroids:

c1 = average of {0,2,4,6} = 3

c2 = average of {24,26} = 25
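The assignment and update steps above can be sketched in a few lines of Python. This is a minimal illustration for 1-D data only; the function name `kmeans_step` and the tie-breaking and empty-cluster rules are assumptions made for the example, not part of the question.

```python
def kmeans_step(points, centres):
    """Run one k-means iteration on 1-D data: assign, then update."""
    clusters = [[] for _ in centres]
    for x in points:
        # Index of the nearest centre (assumption: ties go to the lower index).
        nearest = min(range(len(centres)), key=lambda i: abs(x - centres[i]))
        clusters[nearest].append(x)
    # New centre = mean of its cluster (assumption: an empty cluster keeps its old centre).
    new_centres = [
        sum(c) / len(c) if c else centres[i]
        for i, c in enumerate(clusters)
    ]
    return clusters, new_centres


data = [0, 2, 4, 6, 24, 26]
clusters, centres = kmeans_step(data, [3, 15])
print(clusters)  # [[0, 2, 4, 6], [24, 26]]
print(centres)   # [3.0, 25.0]
```

The point 6 is the only one that needs a second look: it is 3 away from c1=3 and 9 away from c2=15, so it stays in Cluster 1.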

Iteration 2

Reassign each data point to the nearest new center:

Cluster 1: {0,2,4,6}

Cluster 2: {24,26}

The centroids do not change after the second iteration since the clusters remained the same. Therefore, the values of c1 and c2 remain 3 and 25, respectively.
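To check both iterations at once, the sketch below records the centres after each pass. The helper names `nearest` and `update` are hypothetical, and the code assumes 1-D data with absolute difference as the distance.

```python
def nearest(x, centres):
    """Index of the centre closest to x (ties go to the lower index)."""
    return min(range(len(centres)), key=lambda i: abs(x - centres[i]))


def update(points, centres):
    """Reassign every point, then return the mean of each cluster."""
    new = []
    for i, old in enumerate(centres):
        members = [x for x in points if nearest(x, centres) == i]
        # Assumption: a centre with no members is left where it was.
        new.append(sum(members) / len(members) if members else old)
    return new


data = [0, 2, 4, 6, 24, 26]
history = [[3, 15]]          # initial centres c1, c2
for _ in range(2):           # two iterations, as asked in the question
    history.append(update(data, history[-1]))

print(history)  # [[3, 15], [3.0, 25.0], [3.0, 25.0]]
```

The last two entries are identical, which is what it means for the algorithm to have converged on this dataset.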

Advantages of K-means over Hierarchical Clustering

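K-means is computationally faster for large datasets: each iteration takes time roughly linear in the number of points, whereas standard agglomerative hierarchical clustering compares all pairs of points and so scales at least quadratically.

K-means often produces tighter clusters than hierarchical clustering, particularly when the clusters are roughly globular.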

Stopping Criteria for K-Means Algorithm

No changes in the cluster assignments.

Centroid positions do not change between iterations.

A fixed maximum number of iterations is reached.
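The three criteria can sit side by side in one loop. The following is a rough sketch for 1-D data, not a reference implementation; the function name `kmeans_1d`, the parameters `max_iter` and `tol`, and the returned reason strings are all assumptions chosen for illustration.

```python
def kmeans_1d(points, centres, max_iter=100, tol=1e-9):
    """Return (centres, iterations_run, reason_for_stopping)."""
    centres = list(centres)
    labels = None
    for it in range(1, max_iter + 1):
        # Assignment step: label each point with its nearest centre.
        new_labels = [
            min(range(len(centres)), key=lambda i: abs(x - centres[i]))
            for x in points
        ]
        # Criterion 1: no point changed cluster since the previous pass.
        if new_labels == labels:
            return centres, it, "assignments unchanged"
        labels = new_labels
        # Update step: each centre moves to the mean of its members.
        new_centres = []
        for i, old in enumerate(centres):
            members = [x for x, lab in zip(points, labels) if lab == i]
            new_centres.append(sum(members) / len(members) if members else old)
        shift = max(abs(a - b) for a, b in zip(new_centres, centres))
        centres = new_centres
        # Criterion 2: the centres moved by no more than tol.
        if shift <= tol:
            return centres, it, "centroids unchanged"
    # Criterion 3: the iteration budget ran out.
    return centres, max_iter, "iteration limit reached"


data = [0, 2, 4, 6, 24, 26]
print(kmeans_1d(data, [3, 15]))              # ([3.0, 25.0], 2, 'assignments unchanged')
print(kmeans_1d(data, [3, 25]))              # ([3.0, 25.0], 1, 'centroids unchanged')
print(kmeans_1d(data, [3, 15], max_iter=1))  # ([3.0, 25.0], 1, 'iteration limit reached')
```

Which criterion fires first depends on the starting centres and the budget, as the three calls show.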

answered by User Edward J Beckett (8.3k points)

1 vote

Final answer:

After one iteration of the k-means algorithm, c1 becomes 3 and c2 becomes 25. The values of c1 and c2 remain the same after the second iteration. K-means clustering is advantageous over hierarchical clustering in terms of computational efficiency and scalability. Three stopping criteria for the K-means algorithm are convergence, maximum number of iterations, and user-specified threshold.

Step-by-step explanation:

1. What are the values of c1 and c2 after one iteration of k-means?

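With the initial centres c1=3 and c2=15, the points 0, 2, 4 and 6 are closer to c1 and the points 24 and 26 are closer to c2. After one iteration of the k-means algorithm, the new value of c1 is therefore the mean of the points {0, 2, 4, 6}, which is 3.

The new value of c2 is the mean of the points {24, 26}, which is 25.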

2. What are the values of c1 and c2 after the second iteration of k-means?

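In the second iteration, every point is still closest to the same centre as before (for example, 6 is 3 away from c1=3 but 19 away from c2=25), so the clusters do not change. The new value of c1 is the mean of the points {0, 2, 4, 6}, which is still 3.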

The new value of c2 will be the mean of the points {24, 26}, which is still 25.

3. Describe two advantages of K-means clustering over hierarchical clustering.

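Computational efficiency: Each K-means iteration computes only the distance from every point to each of the k centres, so its running time grows roughly linearly with the number of points. Standard agglomerative hierarchical clustering compares all pairs of points, so its running time grows at least quadratically.

Scalability: K-means needs to store only the data points and the k cluster centres, whereas hierarchical clustering typically also keeps a pairwise distance matrix whose size grows with the square of the number of points. This lets K-means handle much larger datasets.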
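One rough way to see the scaling difference is to count distance evaluations. The sketch below assumes a hierarchical implementation that computes every pairwise distance once, and it counts a single K-means iteration; both function names are hypothetical, and the numbers are counts, not timings.

```python
def pairwise_distance_count(n):
    """Distances between all unordered pairs of n points: n(n-1)/2."""
    return n * (n - 1) // 2


def kmeans_distance_count(n, k):
    """Point-to-centre distances in one k-means iteration: n*k."""
    return n * k


for n in (6, 1_000, 100_000):
    print(n, pairwise_distance_count(n), kmeans_distance_count(n, k=2))
# 6 15 12
# 1000 499500 2000
# 100000 4999950000 200000
```

For the six-point dataset in the question the two counts are nearly equal; the gap only opens up as the number of points grows.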

4. List down three (3) stopping criteria for K-Means Algorithm

Convergence: The algorithm stops when no data point changes cluster between iterations, so the cluster centers stay the same.

Maximum number of iterations: The algorithm stops after a specified number of iterations, even if the convergence criterion is not met.

User-specified threshold: The algorithm stops when the change in cluster center positions falls below a user-defined threshold.

answered by User Jelle Vergeer (8.1k points)