This class implements the Early Drift Detection Method (EDDM), which detects concept drift in online learning by monitoring the distance between consecutive classification errors rather than the error rate itself. EDDM is designed to detect gradual drift early, while still responding to abrupt changes.
Details
EDDM is a statistical process control method. Because it tracks the spacing between errors, it is more sensitive to slowly developing changes and can raise an early warning of deterioration before the error rate increases significantly.
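To make the monitored quantity concrete, here is a minimal base-R sketch of the statistic computed in batch form: the mean distance between errors plus two standard deviations. The function name `eddm_statistic` and the toy streams are hypothetical and are not part of this package; the sketch assumes that 1 marks an error and that the first distance is counted from the start of the stream.

```r
# Batch sketch of the EDDM statistic (hypothetical helper, not package code).
# `errors` is a 0/1 vector in which 1 is assumed to mark a prediction error.
eddm_statistic <- function(errors) {
  positions <- which(errors == 1)       # indices at which errors occur
  distances <- diff(c(0, positions))    # gap to the previous error
  mean_dist <- mean(distances)
  # Population standard deviation (divides by n, not n - 1)
  sd_dist <- sqrt(mean((distances - mean_dist)^2))
  mean_dist + 2 * sd_dist
}

stable   <- rep(c(0, 0, 0, 1), times = 25)  # one error every 4 instances
drifting <- rep(c(0, 1), times = 50)        # one error every 2 instances

m2s_stable   <- eddm_statistic(stable)
m2s_drifting <- eddm_statistic(drifting)

# A ratio well below 1 means errors have moved closer together.
ratio <- m2s_drifting / m2s_stable
```

When errors become more frequent, the distances shrink and the statistic falls relative to its earlier value, which is the signal the detector looks for.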
References
Early Drift Detection Method. Manuel Baena-Garcia, Jose Del Campo-Avila, Raúl Fidalgo, Albert Bifet, Ricard Gavalda, Rafael Morales-Bueno. In Fourth International Workshop on Knowledge Discovery from Data Streams, 2006.
Implementation: https://github.com/scikit-multiflow/scikit-multiflow/blob/a7e316d1cc79988a6df40da35312e00f6c4eabb2/src/skmultiflow/drift_detection/eddm.py
Public fields
eddm_warning: Warning threshold setting.
eddm_outcontrol: Out-of-control threshold setting.
m_num_errors: Current number of errors encountered.
m_min_num_errors: Minimum number of errors to initialize drift detection.
m_n: Total instances processed.
m_d: Distance to the last error from the current instance.
m_lastd: Distance to the previous error from the last error.
m_mean: Mean of the distances between errors.
m_std_temp: Temporary standard deviation accumulator for the distances.
m_m2s_max: Maximum mean plus two standard deviations observed.
delay: Delay count since the last detected change.
estimation: Current estimated mean distance between errors.
warning_detected: Boolean indicating if a warning has been detected.
change_detected: Boolean indicating if a change has been detected.
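The fields above suggest an incremental update of the mean and spread of the error distances. The following base-R sketch shows one way such an update could work, loosely following the referenced scikit-multiflow implementation. It is not this package's code: `eddm_init`, `eddm_step`, and the `min_num_errors` argument are hypothetical names, and the state is a plain list rather than an R6 object.

```r
# Hypothetical helper: a fresh detector state as a plain list.
eddm_init <- function() {
  list(m_n = 1, m_num_errors = 0, m_d = 0, m_lastd = 0,
       m_mean = 0, m_std_temp = 0, m_m2s_max = 0,
       warning_detected = FALSE, change_detected = FALSE)
}

# Hypothetical helper: process one outcome and return the updated state.
eddm_step <- function(s, is_error, min_num_instances = 30, min_num_errors = 30,
                      eddm_warning = 0.95, eddm_outcontrol = 0.9) {
  if (s$change_detected) s <- eddm_init()  # start over after a drift
  s$m_n <- s$m_n + 1
  if (!is_error) return(s)                 # statistics change only on errors

  s$warning_detected <- FALSE
  s$m_num_errors <- s$m_num_errors + 1
  s$m_lastd <- s$m_d
  s$m_d <- s$m_n - 1
  distance <- s$m_d - s$m_lastd

  # Welford-style running mean and sum of squared deviations
  old_mean <- s$m_mean
  s$m_mean <- old_mean + (distance - old_mean) / s$m_num_errors
  s$m_std_temp <- s$m_std_temp + (distance - s$m_mean) * (distance - old_mean)
  m2s <- s$m_mean + 2 * sqrt(s$m_std_temp / s$m_num_errors)

  if (s$m_n < min_num_instances) return(s)
  if (m2s > s$m_m2s_max) {
    s$m_m2s_max <- m2s                     # new maximum, nothing to flag
  } else if (s$m_num_errors > min_num_errors) {
    p <- m2s / s$m_m2s_max
    if (p < eddm_outcontrol) {
      s$change_detected <- TRUE
    } else if (p < eddm_warning) {
      s$warning_detected <- TRUE
    }
  }
  s
}

# Toy stream: an error every 4 instances, then an error every 2 instances.
stream <- c(rep(c(0, 0, 0, 1), times = 50), rep(c(0, 1), times = 150))
state <- eddm_init()
warn_at <- integer(0)
drift_at <- integer(0)
for (i in seq_along(stream)) {
  state <- eddm_step(state, stream[i] == 1)
  if (state$change_detected) {
    drift_at <- c(drift_at, i)
  } else if (state$warning_detected) {
    warn_at <- c(warn_at, i)
  }
}
```

Keeping a running sum of squared deviations in `m_std_temp` avoids storing every distance, so each update takes constant time and memory.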
Methods
Method new()
Initializes the EDDM detector with specific parameters.
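For orientation, the two thresholds can be read against the decision rule in the referenced paper. The notation below follows the paper and is not part of this package's API: $p'_i$ is the running mean distance between errors and $s'_i$ its standard deviation. Mapping `eddm_warning` to $\alpha$ and `eddm_outcontrol` to $\beta$ is an assumption based on the matching default values.

```latex
\frac{p'_i + 2 s'_i}{p'_{\max} + 2 s'_{\max}} < \alpha \;\Rightarrow\; \text{warning},
\qquad
\frac{p'_i + 2 s'_i}{p'_{\max} + 2 s'_{\max}} < \beta \;\Rightarrow\; \text{drift},
\qquad \alpha = 0.95,\ \beta = 0.90
```

Because $\beta < \alpha$, the ratio has to fall further below its historical maximum to signal a drift than to signal a warning.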
Usage
EDDM$new(min_num_instances = 30, eddm_warning = 0.95, eddm_outcontrol = 0.9)
Examples
set.seed(123)  # fix the seed for reproducibility
# Simulated stream of 0/1 outcomes: about 30% ones before the change
data_part1 <- sample(c(0, 1), size = 100, replace = TRUE, prob = c(0.7, 0.3))
# Shift the distribution: about 70% ones after the change
data_part2 <- sample(c(0, 1), size = 100, replace = TRUE, prob = c(0.3, 0.7))
# Concatenate the two parts into a single stream
data_stream <- c(data_part1, data_part2)
eddm <- EDDM$new()
# Feed the stream one element at a time and report the detector's state
for (i in seq_along(data_stream)) {
  eddm$add_element(data_stream[i])
  if (eddm$change_detected) {
    message(paste("Drift detected!", i))
  } else if (eddm$warning_detected) {
    message(paste("Warning detected!", i))
  }
}
#> Warning detected! 108
#> Warning detected! 109
#> Warning detected! 110
#> Warning detected! 111
#> Warning detected! 112
#> Warning detected! 113
#> Warning detected! 114
#> Warning detected! 115
#> Drift detected! 116
#> Warning detected! 157
#> Warning detected! 158
#> Warning detected! 159
#> Warning detected! 160
#> Warning detected! 161
#> Warning detected! 162
#> Drift detected! 163
