PSKC
ikpykit.cluster.PSKC
PSKC(
n_estimators=200,
max_samples="auto",
method="inne",
tau=0.1,
v=0.1,
random_state=None,
)
Bases: BaseEstimator, ClusterMixin
Point-Set Kernel Clustering algorithm using Isolation Kernels.
PSKC is a clustering algorithm that leverages Isolation Kernels to create feature vector representations of data points. It adaptively captures the characteristics of local data distributions by using data-dependent kernels. The algorithm forms clusters by identifying points with high similarity in the transformed kernel space.
The clustering process works iteratively:
1. Select the center point with maximum similarity to the mean.
2. Form a cluster around this center.
3. Remove these points from consideration.
4. Continue until the stopping criteria are met.
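The four steps above can be illustrated with a small pure-Python sketch. It is not the library's implementation: a Gaussian similarity stands in for the Isolation Kernel, a single fixed `threshold` replaces the stopping criteria, and the names `gaussian_similarity` and `greedy_kernel_clustering` are hypothetical.

```python
import math


def gaussian_similarity(a, b, bandwidth=1.0):
    """Stand-in similarity in (0, 1]; PSKC itself uses an Isolation Kernel."""
    return math.exp(-math.dist(a, b) ** 2 / (2 * bandwidth**2))


def greedy_kernel_clustering(points, threshold=0.3, bandwidth=1.0):
    """Toy version of the four steps: pick a center, cluster, remove, repeat."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    remaining = list(range(len(points)))
    labels = [-1] * len(points)
    next_label = 0
    while remaining:  # Step 4 (simplified): stop once every point is assigned.
        # Step 1: the center has the highest total similarity to the remaining
        # points, which is equivalent to the highest similarity to their mean
        # in kernel space.
        center = max(
            remaining,
            key=lambda i: sum(
                gaussian_similarity(points[i], points[j], bandwidth)
                for j in remaining
            ),
        )
        # Step 2: the cluster is every remaining point similar enough to it.
        members = [
            i
            for i in remaining
            if gaussian_similarity(points[center], points[i], bandwidth) > threshold
        ]
        for i in members:
            labels[i] = next_label
        # Step 3: drop the clustered points from further consideration.
        remaining = [i for i in remaining if labels[i] == -1]
        next_label += 1
    return labels


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11)]
print(greedy_kernel_clustering(points))  # [0, 0, 0, 1, 1]
```

Because a point always has similarity 1 to itself, each round assigns at least the center, so the loop terminates.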
Parameters
n_estimators : int, default=200
The number of base estimators (trees) in the isolation ensemble.
max_samples : int or str, default="auto"
- If int, then draw max_samples samples.
- If "auto", then max_samples=min(256, n_samples).
method : {'inne', 'anne'}, default='inne'
The method used for building the isolation kernel.
tau : float, default=0.1
Lower values result in more clusters.
v : float, default=0.1
The decay factor for reducing the similarity threshold. Controls the expansion of clusters.
random_state : int or None, default=None
Controls the pseudo-randomness of the algorithm for reproducibility. Pass an int for reproducible results across multiple function calls.
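To make two of the parameter rules concrete, here is a hedged sketch. `resolve_max_samples` mirrors the documented `"auto"` rule. `threshold_schedule` assumes the similarity threshold is multiplied by `(1 - v)` each round for as long as it stays above `tau`; that decay rule is an assumption made for illustration, not something this page states. Both function names are hypothetical and are not part of ikpykit.

```python
def resolve_max_samples(max_samples, n_samples):
    """Hypothetical helper mirroring the documented max_samples rule."""
    if max_samples == "auto":
        return min(256, n_samples)
    if isinstance(max_samples, int) and max_samples > 0:
        return max_samples
    raise ValueError("max_samples must be a positive int or 'auto'")


def threshold_schedule(initial, tau=0.1, v=0.1):
    """Assumed decay: multiply the threshold by (1 - v) while it exceeds tau."""
    if not 0 < v < 1:
        raise ValueError("v must lie strictly between 0 and 1")
    if tau <= 0:
        raise ValueError("tau must be positive")
    thresholds = []
    threshold = initial
    while threshold > tau:
        thresholds.append(threshold)
        threshold *= 1 - v
    return thresholds


print(resolve_max_samples("auto", 1000))          # 256
print(resolve_max_samples("auto", 100))           # 100
print(threshold_schedule(0.5, tau=0.1, v=0.5))    # [0.5, 0.25, 0.125]
```

Under this assumption, a larger `v` reaches `tau` in fewer rounds, so each cluster gets fewer opportunities to expand.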
Attributes
clusters_ : list
List of KCluster objects representing the identified clusters.
labels_ : ndarray of shape (n_samples,)
Cluster labels for each point in the dataset.
centers : list
Centers of each cluster in the transformed feature space.
n_classes : int
Number of clusters found.
Examples:
>>> from ikpykit.cluster import PSKC
>>> import numpy as np
>>> X = np.array([[1, 2], [1, 4], [10, 2], [10, 10], [1, 0], [1, 1]])
>>> pskc = PSKC(n_estimators=100, max_samples=2, tau=0.3, v=0.1, random_state=24)
>>> pskc.fit_predict(X)
array([0, 0, 1, 1, 0, 0])
References
[1] Kai Ming Ting, Jonathan R. Wells, Ye Zhu (2023). "Point-set Kernel Clustering". IEEE Transactions on Knowledge and Data Engineering, Vol. 35, 5147-5158.
Source code in ikpykit/cluster/_pskc.py
fit
fit(X, y=None)
Fit the model on data X.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | np.array of shape (n_samples, n_features) | The input instances. | required |
Returns:
Name | Type | Description |
---|---|---|
self | object | The fitted estimator. |
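Returning `self` from `fit` is what allows calls to be chained, and it is how scikit-learn's `ClusterMixin` can supply `fit_predict` on top of `fit` and `labels_`. The sketch below shows the pattern with a hypothetical `ToyClusterer` whose labelling rule is a placeholder; it is not PSKC and does not use ikpykit.

```python
class ToyClusterer:
    """Hypothetical stand-in for an estimator; this is not PSKC."""

    def __init__(self, split=5.0):
        self.split = split

    def fit(self, X, y=None):
        # Placeholder rule: label each row by which side of `split` its
        # first feature falls on. A real estimator would learn from X here.
        self.labels_ = [0 if row[0] < self.split else 1 for row in X]
        return self  # returning self is what makes chaining possible

    def fit_predict(self, X, y=None):
        # Same pattern as ClusterMixin: fit, then hand back the labels.
        return self.fit(X).labels_


X = [[1, 2], [1, 4], [10, 2], [10, 10]]
model = ToyClusterer().fit(X)
print(model.labels_)                 # [0, 0, 1, 1]
print(ToyClusterer().fit_predict(X)) # [0, 0, 1, 1]
```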
Source code in ikpykit/cluster/_pskc.py