.. _example_covariance_plot_outlier_detection.py:


==========================================
Outlier detection with several methods.
==========================================

When the amount of contamination is known, this example illustrates three
different ways of performing :ref:`outlier_detection`:

- based on a robust estimator of covariance, which assumes that the data are
  Gaussian distributed and performs better than the One-Class SVM in that
  case;

- using the One-Class SVM and its ability to capture the shape of the data
  set, hence performing better when the data is strongly non-Gaussian, e.g.
  with two well-separated clusters;

- using the Isolation Forest algorithm, which is based on random forests and
  is therefore better suited to high-dimensional settings, although it also
  performs quite well in the low-dimensional examples below.

The ground truth about inliers and outliers is given by the point colors,
while the orange-filled area indicates which points are reported as inliers
by each method.

Here, we assume that we know the fraction of outliers in the datasets.
Thus, rather than using the ``predict`` method of the objects, we set the
threshold on the ``decision_function`` to separate out the corresponding
fraction.



.. rst-class:: horizontal


    *

      .. image:: images/plot_outlier_detection_001.png
            :scale: 47

    *

      .. image:: images/plot_outlier_detection_002.png
            :scale: 47

    *

      .. image:: images/plot_outlier_detection_003.png
            :scale: 47




**Python source code:** :download:`plot_outlier_detection.py`

.. literalinclude:: plot_outlier_detection.py
    :lines: 30-

**Total running time of the example:** 17.71 seconds
(0 minutes 17.71 seconds)
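
The thresholding step described above (cutting the ``decision_function`` scores at the percentile that matches the known fraction of outliers, instead of calling ``predict``) can be sketched as follows. This is a minimal illustration and not the example script itself: the synthetic data, the sample size, ``outliers_fraction = 0.25``, ``gamma=0.1`` and the fixed random seeds are all assumptions chosen for the sketch.

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope
from sklearn.ensemble import IsolationForest
from sklearn.svm import OneClassSVM

# Illustrative settings (assumptions, not values from the example script).
rng = np.random.RandomState(42)
n_samples = 200
outliers_fraction = 0.25
n_outliers = int(outliers_fraction * n_samples)
n_inliers = n_samples - n_outliers

# Inliers form one Gaussian blob; outliers are uniform noise on a wider square.
X = np.vstack([
    0.3 * rng.randn(n_inliers, 2),
    rng.uniform(low=-6, high=6, size=(n_outliers, 2)),
])

detectors = {
    "Robust covariance": EllipticEnvelope(
        contamination=outliers_fraction, random_state=0),
    "One-Class SVM": OneClassSVM(
        nu=outliers_fraction, kernel="rbf", gamma=0.1),
    "Isolation Forest": IsolationForest(
        contamination=outliers_fraction, random_state=0),
}

flagged = {}
for name, detector in detectors.items():
    detector.fit(X)
    # For all three estimators, a higher score means "more normal".
    scores = np.ravel(detector.decision_function(X))
    # Place the cut at the percentile matching the known outlier fraction.
    threshold = np.percentile(scores, 100 * outliers_fraction)
    # Points scoring at or below the cut are reported as outliers.
    flagged[name] = scores <= threshold
    print(f"{name}: {int(flagged[name].sum())} points flagged as outliers")
```

Because the cut is taken from the empirical score distribution, each detector flags roughly the same number of points whatever the scale of its scores, which makes the three methods directly comparable.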