Anomaly Detection Handler
The Anomaly Detection handler implements supervised, semi-supervised, and unsupervised anomaly detection algorithms using the pyod, catboost, xgboost, and sklearn libraries. The models were chosen based on the results in the following benchmark paper.
Additional information
- If there is no labelled data, we use an unsupervised learner with the syntax CREATE ANOMALY DETECTION MODEL <model_name>, without specifying the target to predict. MindsDB then adds a column called outlier when generating results.
- If we have labelled data, we use the regular model creation syntax. Backend logic chooses between a semi-supervised algorithm (currently XGBOD) and a supervised algorithm (currently CatBoost).
- If multiple models are provided, we create an ensemble and use majority voting.
- See the anomaly detection proposal document for more information.
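To make the ensemble step concrete, here is a minimal sketch of majority voting over per-model outlier labels. It is not the handler's actual code. The function name majority_vote, the 0/1 label encoding, and the rule that ties count as inliers are all assumptions made for illustration.

```python
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine per-model 0/1 outlier labels row by row.

    predictions[i][j] is model i's label for row j (1 = outlier, 0 = inlier).
    A row is flagged only when strictly more than half of the models flag it,
    so ties resolve to inlier (an assumption of this sketch).
    """
    if not predictions:
        raise ValueError("at least one model's predictions are required")
    n_rows = len(predictions[0])
    if any(len(p) != n_rows for p in predictions):
        raise ValueError("all models must label the same number of rows")
    n_models = len(predictions)
    # zip(*predictions) walks the rows, yielding one tuple of votes per row.
    return [1 if 2 * sum(votes) > n_models else 0 for votes in zip(*predictions)]


# Hypothetical labels from three detectors over the same five rows.
model_a = [0, 1, 0, 1, 0]
model_b = [0, 1, 1, 0, 0]
model_c = [1, 1, 0, 0, 0]

ensemble = majority_vote([model_a, model_b, model_c])
print(ensemble)  # [0, 1, 0, 0, 0]
```

Only the second row is flagged, because it is the only one that at least two of the three detectors mark as an outlier. Using an odd number of models avoids ties altogether.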
Example usage
To run example queries, use the data from this CSV file.
Unsupervised detection
Semi-supervised detection
Supervised detection
Specific model
Specific anomaly type
Ensemble
Additional Media:
Demo 1:
https://www.loom.com/share/0996e5faa3f7415bacd51a6e8e161d5e?sid=9bacd29a-975b-4a94-b081-de2255b93607