IKTOD
ikpykit.timeseries.IKTOD
IKTOD(
n_estimators_1=100,
max_samples_1="auto",
n_estimators_2=100,
max_samples_2="auto",
method="inne",
period_length=10,
contamination="auto",
random_state=None,
)
Bases: OutlierMixin, BaseEstimator
Isolation Kernel-based Time series Subsequence Anomaly Detection.
IKTOD implements a distribution-based approach to detecting anomalous time series subsequences. Unlike traditional time-domain or frequency-domain approaches that rely on sliding windows, IKTOD treats time series subsequences as distributions in the ℝ domain, which enables more effective similarity measurement with linear time complexity.
This approach uses the Isolation Distributional Kernel (IDK) to measure similarities between subsequences, which is reported in [1] to give better detection accuracy than sliding-window-based detectors.
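To make the idea of treating subsequences as groups of points more concrete, the following sketch shows one way a series could be cut into consecutive, non-overlapping subsequences of length `period_length`. It is an illustration only, not IKTOD's implementation: `split_into_subsequences` is a hypothetical helper, and rejecting a series whose length is not an exact multiple of `period_length` is an assumption made for this sketch.

```python
def split_into_subsequences(series, period_length):
    """Split a 1-D series into consecutive, non-overlapping subsequences.

    Hypothetical helper for illustration; not part of ikpykit.
    """
    if len(series) <= period_length:
        # Mirrors the documented condition under which fit raises ValueError.
        raise ValueError("time series length must exceed period_length")
    if len(series) % period_length != 0:
        # Assumption of this sketch only: require an exact multiple.
        raise ValueError("series length must be a multiple of period_length")
    return [
        series[start:start + period_length]
        for start in range(0, len(series), period_length)
    ]


# A toy series of length 40 yields 4 subsequences of 10 points each.
series = [float(t % 10) for t in range(40)]
groups = split_into_subsequences(series, period_length=10)
print(len(groups), len(groups[0]))
```

Each resulting group is then the unit that receives a single score and label, which is why `predict` returns one value per subsequence rather than one per time step.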
Parameters:
Name | Type | Description | Default |
---|---|---|---|
n_estimators_1 | int | Number of base estimators in the first-level ensemble. | 100 |
max_samples_1 | int, float, or "auto" | Number of samples for training each first-level base estimator: - int: exactly | "auto" |
n_estimators_2 | int | Number of base estimators in the second-level ensemble. | 100 |
max_samples_2 | int, float, or "auto" | Number of samples for training each second-level base estimator: - int: exactly | "auto" |
method | {"inne", "anne"} | Isolation method to use: "inne" (original Isolation Forest approach) or "anne" (approximate nearest neighbor ensemble). | "inne" |
period_length | int | Length of the subsequences into which the time series is split. | 10 |
contamination | "auto" or float | Proportion of outliers in the dataset: "auto" (threshold determined as in the original paper) or a float in the range (0, 0.5]. | "auto" |
random_state | int, RandomState instance or None | Controls randomization for reproducibility. | None |
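The effect of a float `contamination` can be pictured as choosing a score threshold so that roughly that fraction of the training subsequences falls below it. The sketch below follows the percentile convention used by scikit-learn style outlier detectors. Whether IKTOD computes `offset_` in exactly this way is an assumption, the helper names are hypothetical, and the scores are made up for illustration. The `"auto"` setting is not covered here.

```python
def percentile(values, q):
    """Return the q-th percentile (0-100), interpolating linearly between ranks."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def threshold_from_contamination(train_scores, contamination):
    """Hypothetical helper: pick a threshold from training scores."""
    if not 0.0 < contamination <= 0.5:
        raise ValueError("contamination must be in the range (0, 0.5]")
    return percentile(train_scores, 100.0 * contamination)


# Made-up scores for four subsequences; lower means more anomalous.
train_scores = [0.82, 0.79, 0.31, 0.85]
offset = threshold_from_contamination(train_scores, contamination=0.1)
flagged = [score < offset for score in train_scores]
print(flagged)  # [False, False, True, False]
```

With four subsequences and `contamination=0.1`, only the lowest-scoring subsequence ends up below the threshold in this sketch.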
Attributes:
Name | Type | Description |
---|---|---|
ikgad_ | IKGAD | Trained Isolation Kernel Group Anomaly Detector. |
offset_ | float | Decision threshold for outlier detection. |
is_fitted_ | bool | Indicates if the model has been fitted. |
References
[1] Ting, K. M., Liu, Z., Zhang, H., Zhu, Y. (2022). A New Distributional Treatment for Time Series and An Anomaly Detection Investigation. Proceedings of The Very Large Data Bases (VLDB) Conference.
Examples:
>>> from ikpykit.timeseries import IKTOD
>>> import numpy as np
>>> # Time series with length 40 (4 periods of length 10)
>>> X = np.sin(np.linspace(0, 8*np.pi, 40)).reshape(-1, 1)
>>> # Add anomaly
>>> X[25:30] = X[25:30] + 5.0
>>> detector = IKTOD(max_samples_1=2, max_samples_2=2, contamination=0.1, random_state=42)
>>> detector = detector.fit(X)
>>> detector.predict(X)
array([ 1, 1, -1, 1])
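Because `predict` returns one label per subsequence, it can be handy to translate the labels back into positions in the original series. The sketch below does this under the assumption that subsequences are consecutive, non-overlapping, and start at index 0. `anomalous_ranges` is a hypothetical helper, not part of ikpykit, and the labels are copied from the example output above rather than recomputed.

```python
def anomalous_ranges(labels, period_length):
    """Return (start, stop) index ranges, stop exclusive, for subsequences labelled -1."""
    return [
        (i * period_length, (i + 1) * period_length)
        for i, label in enumerate(labels)
        if label == -1
    ]


labels = [1, 1, -1, 1]  # labels taken from the example output
ranges = anomalous_ranges(labels, period_length=10)
print(ranges)  # [(20, 30)]
```

The flagged range 20 to 30 contains the injected anomaly at indices 25 to 29.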
Source code in ikpykit/timeseries/anomaly/_iktod.py
fit
fit(X)
Fit the IKTOD model.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | array-like of shape (n_samples, n_features) | Input time series data, where n_samples is the length of the time series and n_features is the number of variables (1 for a univariate series). | required |
Returns:
Name | Type | Description |
---|---|---|
self | object | Fitted estimator. |
Raises:
Type | Description |
---|---|
ValueError | If the time series length is less than or equal to period_length. |
Source code in ikpykit/timeseries/anomaly/_iktod.py
predict
predict(X)
Predict if subsequences contain outliers.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | array-like of shape (n_samples, n_features) | Time series data to evaluate. | required |
Returns:
Name | Type | Description |
---|---|---|
labels | ndarray of shape (n_subsequences,) | +1 for inliers and -1 for outliers, one label per subsequence. |
Source code in ikpykit/timeseries/anomaly/_iktod.py
decision_function
decision_function(X)
Compute decision scores for subsequences.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | array-like of shape (n_samples, n_features) | Time series data to evaluate. | required |
Returns:
Name | Type | Description |
---|---|---|
scores | ndarray of shape (n_subsequences,) | Decision scores. Negative scores represent outliers, positive scores represent inliers. |
Source code in ikpykit/timeseries/anomaly/_iktod.py
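The documented sign convention suggests how the three outputs relate to one another. The sketch below assumes the usual scikit-learn style relationship, in which the decision score is the raw score minus the fitted threshold `offset_` and a negative decision score maps to the label -1. This relationship is an assumption about IKTOD, the helper names are hypothetical, and the numbers are made up.

```python
def decision_scores(raw_scores, offset):
    """Shift raw scores so that the decision threshold sits at zero."""
    return [score - offset for score in raw_scores]


def labels_from_decisions(decisions):
    """Map negative decision scores to -1 (outlier) and the rest to +1 (inlier)."""
    return [-1 if value < 0 else 1 for value in decisions]


# Made-up raw scores for four subsequences; lower means more anomalous.
raw_scores = [0.8, 0.7, 0.3, 0.9]
offset = 0.5  # stand-in for a fitted offset_ value
decisions = decision_scores(raw_scores, offset)
labels = labels_from_decisions(decisions)
print(labels)  # [1, 1, -1, 1]
```

Under this reading, `score_samples` is useful for ranking subsequences, while `decision_function` and `predict` apply the threshold.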
score_samples
score_samples(X)
Compute anomaly scores for subsequences.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | array-like of shape (n_samples, n_features) | Time series data to evaluate. | required |
Returns:
Name | Type | Description |
---|---|---|
scores | ndarray of shape (n_subsequences,) | Anomaly scores where lower values indicate more anomalous subsequences. |
Source code in ikpykit/timeseries/anomaly/_iktod.py