TIDKC
ikpykit.trajectory.TIDKC
TIDKC(
k,
kn,
v,
n_init_samples,
n_estimators_1=100,
max_samples_1="auto",
n_estimators_2=100,
max_samples_2="auto",
method="anne",
is_post_process=True,
random_state=None,
)
Bases: BaseEstimator, ClusterMixin
Trajectory Isolation Distributional Kernel Clustering (TIDKC).
TIDKC identifies non-linearly separable clusters with irregular shapes and varied densities in trajectory data using distributional kernels. It operates in linear time, does not rely on random initialization, and is robust to outliers.
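To make the idea of a distributional kernel concrete, the sketch below treats each trajectory as a set of points and scores two trajectories by the average point-to-point kernel value over all pairs. It is an illustration only: it uses a Gaussian kernel as a stand-in for the data-dependent isolation kernel that TIDKC fits, and the names `gaussian_kernel`, `distributional_kernel`, and the sample trajectories are hypothetical, not part of ikpykit.

```python
import math


def gaussian_kernel(x, y, gamma=1.0):
    """Stand-in point kernel (assumption); TIDKC itself fits an isolation kernel."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def distributional_kernel(traj_p, traj_q, point_kernel=gaussian_kernel):
    """Average point-kernel similarity over all point pairs of two trajectories."""
    total = sum(point_kernel(x, y) for x in traj_p for y in traj_q)
    return total / (len(traj_p) * len(traj_q))


# Two nearby trajectories and one far away (hypothetical 2-D points).
traj_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
traj_b = [(0.0, 0.1), (1.0, 0.1), (2.0, 0.1)]
traj_c = [(10.0, 10.0), (11.0, 10.0), (12.0, 10.0)]

sim_ab = distributional_kernel(traj_a, traj_b)  # similar trajectories
sim_ac = distributional_kernel(traj_a, traj_c)  # dissimilar trajectories
print(sim_ab > sim_ac)
```

Because the score compares whole point distributions, the two trajectories need not have the same number of points in this sketch.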
Parameters:
Name | Type | Description | Default |
---|---|---|---|
k | int | The number of clusters to form. | required |
kn | int | The number of nearest neighbors to consider when calculating the local contrast. | required |
v | float | The decay factor for reducing the threshold value. | required |
n_init_samples | int | The number of samples to use for initializing the cluster centers. | required |
n_estimators_1 | int | Number of base estimators in the first step ensemble. | 100 |
max_samples_1 | (int, float or auto) | Number of samples to draw for training each base estimator in first step: - If int, draws exactly | "auto" |
n_estimators_2 | int | Number of base estimators in the second step ensemble. | 100 |
max_samples_2 | (int, float or auto) | Number of samples to draw for training each base estimator in second step: - If int, draws exactly | "auto" |
method | (inne, anne) | Isolation method to use. "anne" is the original algorithm from the paper. | "anne" |
is_post_process | bool | Whether to perform post-processing to refine the clusters. | True |
random_state | int, RandomState instance or None | Controls the pseudo-randomness of the selection of the feature and split values for each branching step and each tree in the forest. | None |
Attributes:
Name | Type | Description |
---|---|---|
labels_ | ndarray of shape (n_samples,) | Cluster labels for each point in the dataset. |
iso_kernel_ | IsoKernel | The fitted isolation kernel. |
idkc_ | IDKC | The fitted IDKC clustering model. |
References
[1] Z. J. Wang, Y. Zhu and K. M. Ting, "Distribution-Based Trajectory Clustering," 2023 IEEE International Conference on Data Mining (ICDM).
Examples:
>>> from ikpykit.trajectory import TIDKC
>>> from ikpykit.trajectory.dataloader import SheepDogs
>>> sheepdogs = SheepDogs()
>>> X, y = sheepdogs.load(return_X_y=True)
>>> clf = TIDKC(k=2, kn=5, v=0.5, n_init_samples=10)
>>> predictions = clf.fit_predict(X)
Source code in ikpykit/trajectory/cluster/_tidkc.py
fit
fit(X, y=None)
Fit the trajectory cluster model.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | array-like of shape (n_trajectories, n_points, n_features) | The input trajectories to train on. | required |
y | Ignored | Not used, present for API consistency. | None |
Returns:
Name | Type | Description |
---|---|---|
self | object | Fitted estimator. |
Raises:
Type | Description |
---|---|
ValueError | If method is not valid. |
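The expected input layout `(n_trajectories, n_points, n_features)` can be sketched with plain nested lists. The example below generates a few synthetic 2-D trajectories and checks their shape. The helper names `make_trajectory` and `shape_of` are hypothetical, the data is synthetic, and it is an assumption here that every trajectory has the same number of points.

```python
import random


def make_trajectory(start, step, n_points, noise, rng):
    """Build one synthetic trajectory: a noisy straight line of n_points 2-D points."""
    x, y = start
    dx, dy = step
    points = []
    for i in range(n_points):
        points.append([
            x + i * dx + rng.uniform(-noise, noise),
            y + i * dy + rng.uniform(-noise, noise),
        ])
    return points


def shape_of(trajectories):
    """Return (n_trajectories, n_points, n_features) for a regular nested list."""
    return (len(trajectories), len(trajectories[0]), len(trajectories[0][0]))


rng = random.Random(0)  # fixed seed so the synthetic data is reproducible

# Three trajectories heading east and three heading north (synthetic groups).
X = [make_trajectory((0.0, 0.0), (1.0, 0.0), 20, 0.1, rng) for _ in range(3)]
X += [make_trajectory((0.0, 0.0), (0.0, 1.0), 20, 0.1, rng) for _ in range(3)]

print(shape_of(X))
```

A nested list like `X` is the kind of array-like that would then be passed to `fit`; converting it to a NumPy array first is an equally reasonable choice.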
fit_predict
fit_predict(X, y=None)
Fit the model and predict clusters for X.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | array-like of shape (n_trajectories, n_points, n_features) | The input trajectories. | required |
y | Ignored | Not used, present for API consistency. | None |
Returns:
Name | Type | Description |
---|---|---|
labels | ndarray of shape (n_samples,) | Cluster labels. |
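Cluster labels are arbitrary identifiers, so they cannot be compared to ground-truth labels element by element. One simple way to score the output of `fit_predict` against known classes is cluster purity, sketched below. The `purity` helper and the sample label lists are hypothetical and not part of ikpykit.

```python
from collections import Counter, defaultdict


def purity(y_true, y_pred):
    """Fraction of samples that belong to the majority true class of their cluster."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # Group the true labels by predicted cluster id.
    members = defaultdict(list)
    for true_label, cluster_id in zip(y_true, y_pred):
        members[cluster_id].append(true_label)
    # Count, per cluster, how many samples carry the most common true label.
    majority_total = sum(
        Counter(labels).most_common(1)[0][1] for labels in members.values()
    )
    return majority_total / len(y_true)


# Hypothetical labels: cluster ids are swapped relative to the true classes,
# and one sample of class 1 is placed in the wrong cluster.
y_true = [0, 0, 0, 1, 1, 1]
y_pred = [1, 1, 1, 0, 0, 1]

score = purity(y_true, y_pred)
print(round(score, 3))
```

Purity is invariant to how cluster ids are numbered, but it rewards over-segmentation, so it is best read alongside the chosen `k`.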