About Clustering Models

A clustering model is an unsupervised learning algorithm that groups similar objects or similar attributes. For example, you can use it to identify an operation in a production process, or to find attributes that behave similarly.

K-Means

K-means clustering is a method that partitions n observations into k clusters, where each observation belongs to the cluster with the nearest mean. This results in a partitioning of the data space into Voronoi cells.
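The assign-to-nearest-mean idea described above can be sketched in a few lines of Python. This is a minimal illustration of the standard iterative procedure (Lloyd's algorithm), not the tool's own implementation; the function name `kmeans` and its parameters `initial`, `max_iter`, and `seed` are hypothetical.

```python
import math
import random


def kmeans(points, k, initial=None, max_iter=100, seed=0):
    """Return (centroids, labels) for a list of numeric tuples.

    `initial` is an optional list of k starting centroids (an assumption of
    this sketch); by default k random observations are used.
    """
    rng = random.Random(seed)
    centroids = list(initial) if initial is not None else rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each observation joins the cluster with the nearest mean.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the means are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, labels


points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
centroids, labels = kmeans(points, k=2, initial=[(1.0, 1.0), (8.0, 8.0)])
```

Because every observation is compared with the cluster means using Euclidean distance, the sketch only makes sense for numerical values.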

To launch this model tool, select Models > K-Means from the menu. Alternatively, click the corresponding icon in the sidebar.

Tip: Type of variable

K-Means models can only be created with numerical attributes.

Tip: Create attribute set

You can create a new attribute set from the input of the model. To do so, click the corresponding icon.

Create a K-Means

The parameters for this method are defined as follows:

Select a Data source from the list.
Enter a name for your model. The default prefix is "CLUSTER-".
Select a Learning set from the list.
Select a Test set from the list.
Select attribute(s) from the list for the Input.
Enter a Cluster number, default 3.
Select Cluster silhouette.
Select Search Cluster number.
Click Compute to generate the clusters.
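This page does not describe how the Cluster silhouette option is computed. As background, the sketch below shows the standard silhouette coefficient, a common way to judge how well separated a clustering is and to compare candidate cluster numbers. Whether the tool uses exactly this definition is an assumption, and the function name `silhouette_score` is hypothetical.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (between -1 and 1)."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")

    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # convention: a point alone in its cluster scores 0
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster
        b = min(
            sum(math.dist(p, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != l
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_score(points, [0, 0, 1, 1])  # compact, well-separated clusters
bad = silhouette_score(points, [0, 1, 0, 1])   # clusters that cut across the groups
```

A value close to 1 indicates compact, well-separated clusters; values near 0 or below suggest that a different cluster number may fit the data better.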

Tip: Visualize K-Means results

To visualize the K-Means results, use the scatter plot: choose the attributes for the x and y axes, then set the condition to Nearest CLUSTER-Name.

Subclu

Subclu is an unsupervised clustering algorithm that finds groups or patterns in the data based on the density of data points. Points that lie alone in low-density regions are marked as outliers. Clusters are built up one dimension at a time: a cluster found in one set of dimensions is expanded into a set of dimensions that differs from it in only one dimension. Therefore, it is not necessary to define the number of clusters as in K-Means.
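To illustrate the density idea, the sketch below runs a DBSCAN-style pass over a single set of dimensions; the published SUBCLU algorithm repeats this kind of step across subspaces. It is a simplified illustration, not the tool's implementation. The names `density_clusters`, `eps`, and `min_pts` are hypothetical, and how they map to the tool's Epsilon and Maximum number of points fields is an assumption.

```python
import math

NOISE = -1  # label used for outliers


def density_clusters(points, eps, min_pts):
    """Return one label per point; outliers receive NOISE (-1)."""

    def neighbours(i):
        # All points within distance eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # low-density point; may later become a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins the cluster, is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # dense point: keep growing the cluster
        cluster += 1
    return labels


points = [(0.0, 0.0), (0.0, 0.05), (0.05, 0.0),
          (1.0, 1.0), (1.0, 1.05), (1.05, 1.0),
          (5.0, 5.0)]
labels = density_clusters(points, eps=0.1, min_pts=3)
```

In this example the number of clusters is never passed in: two dense groups are found from the data, and the isolated point at (5, 5) is marked as an outlier.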

Create Subclu

The parameters for this method are defined as follows:

Select a Data source from the list.
Select a Learning set from the list.
Select attribute(s) from the list for the Input.
Enter a Cluster name prefix. The default prefix is "CLUSTER-".
Enter a Maximum number of points, default 10.
Enter Epsilon, default 0.1.
Select Cluster silhouette: yes or no.
Click Compute to generate the clusters.

Hierarchical Clustering

Hierarchical clustering is a model whose result is viewed as a dendrogram, a tree diagram frequently used to illustrate the arrangement of the clusters that hierarchical clustering produces. For more information, see Dendrograms.
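As a rough illustration of what a dendrogram encodes, the sketch below performs bottom-up (agglomerative) clustering with single linkage and records each merge together with the distance at which it happens. The linkage rule and the function name `agglomerate` are assumptions made for this example; the page does not state which variant the tool uses.

```python
import math


def agglomerate(points):
    """Single-linkage agglomerative clustering.

    Returns the merge history as (members_a, members_b, distance) tuples,
    where members are indices into `points`.
    """
    clusters = [[i] for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(
                    math.dist(points[i], points[j])
                    for i in clusters[a]
                    for j in clusters[b]
                )
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return merges


points = [(0.0,), (1.0,), (5.0,), (5.5,)]
merges = agglomerate(points)
```

Each entry of the merge history corresponds to one join in the tree, and the recorded distance is the height at which a dendrogram would draw that join.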
