.. _example_applications_plot_model_complexity_influence.py:


==========================
Model Complexity Influence
==========================

This example demonstrates how model complexity influences both prediction
accuracy and computational performance.

Regression models are benchmarked on the Boston Housing dataset, and
classification models on the 20 Newsgroups dataset.

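For each class of models, model complexity is varied through a relevant
model parameter (``l1_ratio`` for ``SGDClassifier``, ``nu`` for ``NuSVR``,
``n_estimators`` for ``GradientBoostingRegressor``), and its influence is
measured on both computational performance (prediction latency) and
predictive power (MSE for regression, Hamming loss for classification).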
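
The measurement loop described above can be sketched as follows. This is a
minimal illustration, not the script shipped with this example: the
``benchmark`` helper, its ``complexity_of`` argument, and the synthetic data
from ``make_regression`` (used in place of Boston Housing) are all assumptions
made for the sketch.

.. code-block:: python

    import time

    from sklearn.datasets import make_regression
    from sklearn.ensemble import GradientBoostingRegressor
    from sklearn.metrics import mean_squared_error
    from sklearn.model_selection import train_test_split


    def benchmark(estimator, complexity_of, X_train, y_train, X_test, y_test,
                  n_repeats=10):
        """Fit ``estimator``; return (complexity, MSE, mean prediction time).

        ``complexity_of`` is a hypothetical callback that maps a fitted
        estimator to a single number describing its complexity.
        """
        estimator.fit(X_train, y_train)
        # Average the prediction latency over several calls to reduce noise.
        start = time.perf_counter()
        for _ in range(n_repeats):
            y_pred = estimator.predict(X_test)
        latency = (time.perf_counter() - start) / n_repeats
        return complexity_of(estimator), mean_squared_error(y_test, y_pred), latency


    # Synthetic regression data (an assumption, not the Boston Housing data).
    X, y = make_regression(n_samples=500, n_features=13, noise=10.0,
                           random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=0)

    results = []
    for n_estimators in [10, 50, 100]:
        model = GradientBoostingRegressor(n_estimators=n_estimators,
                                          random_state=0)
        # For boosted ensembles, the number of trees serves as the complexity.
        complexity, mse, latency = benchmark(
            model, lambda est: est.n_estimators,
            X_train, y_train, X_test, y_test)
        results.append((complexity, mse, latency))
        print("Complexity: %d | MSE: %.4f | Pred. Time: %fs"
              % (complexity, mse, latency))

Passing the complexity measure as a callback keeps the loop generic: a
different model family only needs a different ``complexity_of`` function and
error metric.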


.. rst-class:: horizontal


    *

      .. image:: images/plot_model_complexity_influence_001.png
            :scale: 47

    *

      .. image:: images/plot_model_complexity_influence_002.png
            :scale: 47

    *

      .. image:: images/plot_model_complexity_influence_003.png
            :scale: 47


**Script output**::

  Benchmarking SGDClassifier(alpha=0.001, average=False, class_weight=None, epsilon=0.1,
         eta0=0.0, fit_intercept=True, l1_ratio=0.25,
         learning_rate='optimal', loss='modified_huber', n_iter=5, n_jobs=1,
         penalty='elasticnet', power_t=0.5, random_state=None, shuffle=True,
         verbose=0, warm_start=False)
  Complexity: 4454 | Hamming Loss (Misclassification Ratio): 0.2501 | Pred. Time: 0.025987s
  
  Benchmarking SGDClassifier(alpha=0.001, average=False, class_weight=None, epsilon=0.1,
         eta0=0.0, fit_intercept=True, l1_ratio=0.5, learning_rate='optimal',
         loss='modified_huber', n_iter=5, n_jobs=1, penalty='elasticnet',
         power_t=0.5, random_state=None, shuffle=True, verbose=0,
         warm_start=False)
  Complexity: 1624 | Hamming Loss (Misclassification Ratio): 0.2923 | Pred. Time: 0.019091s
  
  Benchmarking SGDClassifier(alpha=0.001, average=False, class_weight=None, epsilon=0.1,
         eta0=0.0, fit_intercept=True, l1_ratio=0.75,
         learning_rate='optimal', loss='modified_huber', n_iter=5, n_jobs=1,
         penalty='elasticnet', power_t=0.5, random_state=None, shuffle=True,
         verbose=0, warm_start=False)
  Complexity: 873 | Hamming Loss (Misclassification Ratio): 0.3191 | Pred. Time: 0.015938s
  
  Benchmarking SGDClassifier(alpha=0.001, average=False, class_weight=None, epsilon=0.1,
         eta0=0.0, fit_intercept=True, l1_ratio=0.9, learning_rate='optimal',
         loss='modified_huber', n_iter=5, n_jobs=1, penalty='elasticnet',
         power_t=0.5, random_state=None, shuffle=True, verbose=0,
         warm_start=False)
  Complexity: 655 | Hamming Loss (Misclassification Ratio): 0.3252 | Pred. Time: 0.013293s
  
  Benchmarking NuSVR(C=1000.0, cache_size=200, coef0=0.0, degree=3, gamma=3.0517578125e-05,
     kernel='rbf', max_iter=-1, nu=0.1, shrinking=True, tol=0.001,
     verbose=False)
  Complexity: 69 | MSE: 31.8133 | Pred. Time: 0.000406s
  
  Benchmarking NuSVR(C=1000.0, cache_size=200, coef0=0.0, degree=3, gamma=3.0517578125e-05,
     kernel='rbf', max_iter=-1, nu=0.25, shrinking=True, tol=0.001,
     verbose=False)
  Complexity: 136 | MSE: 25.6140 | Pred. Time: 0.000719s
  
  Benchmarking NuSVR(C=1000.0, cache_size=200, coef0=0.0, degree=3, gamma=3.0517578125e-05,
     kernel='rbf', max_iter=-1, nu=0.5, shrinking=True, tol=0.001,
     verbose=False)
  Complexity: 243 | MSE: 22.3315 | Pred. Time: 0.001228s
  
  Benchmarking NuSVR(C=1000.0, cache_size=200, coef0=0.0, degree=3, gamma=3.0517578125e-05,
     kernel='rbf', max_iter=-1, nu=0.75, shrinking=True, tol=0.001,
     verbose=False)
  Complexity: 350 | MSE: 21.3679 | Pred. Time: 0.001736s
  
  Benchmarking NuSVR(C=1000.0, cache_size=200, coef0=0.0, degree=3, gamma=3.0517578125e-05,
     kernel='rbf', max_iter=-1, nu=0.9, shrinking=True, tol=0.001,
     verbose=False)
  Complexity: 404 | MSE: 21.0915 | Pred. Time: 0.002007s
  
  Benchmarking GradientBoostingRegressor(alpha=0.9, init=None, learning_rate=0.1, loss='ls',
               max_depth=3, max_features=None, max_leaf_nodes=None,
               min_samples_leaf=1, min_samples_split=2,
               min_weight_fraction_leaf=0.0, n_estimators=10, presort='auto',
               random_state=None, subsample=1.0, verbose=0, warm_start=False)
  Complexity: 10 | MSE: 29.7323 | Pred. Time: 0.000123s
  
  Benchmarking GradientBoostingRegressor(alpha=0.9, init=None, learning_rate=0.1, loss='ls',
               max_depth=3, max_features=None, max_leaf_nodes=None,
               min_samples_leaf=1, min_samples_split=2,
               min_weight_fraction_leaf=0.0, n_estimators=50, presort='auto',
               random_state=None, subsample=1.0, verbose=0, warm_start=False)
  Complexity: 50 | MSE: 8.3021 | Pred. Time: 0.000211s
  
  Benchmarking GradientBoostingRegressor(alpha=0.9, init=None, learning_rate=0.1, loss='ls',
               max_depth=3, max_features=None, max_leaf_nodes=None,
               min_samples_leaf=1, min_samples_split=2,
               min_weight_fraction_leaf=0.0, n_estimators=100,
               presort='auto', random_state=None, subsample=1.0, verbose=0,
               warm_start=False)
  Complexity: 100 | MSE: 7.0518 | Pred. Time: 0.000296s
  
  Benchmarking GradientBoostingRegressor(alpha=0.9, init=None, learning_rate=0.1, loss='ls',
               max_depth=3, max_features=None, max_leaf_nodes=None,
               min_samples_leaf=1, min_samples_split=2,
               min_weight_fraction_leaf=0.0, n_estimators=200,
               presort='auto', random_state=None, subsample=1.0, verbose=0,
               warm_start=False)
  Complexity: 200 | MSE: 6.1267 | Pred. Time: 0.000464s
  
  Benchmarking GradientBoostingRegressor(alpha=0.9, init=None, learning_rate=0.1, loss='ls',
               max_depth=3, max_features=None, max_leaf_nodes=None,
               min_samples_leaf=1, min_samples_split=2,
               min_weight_fraction_leaf=0.0, n_estimators=500,
               presort='auto', random_state=None, subsample=1.0, verbose=0,
               warm_start=False)
  Complexity: 500 | MSE: 6.3365 | Pred. Time: 0.000993s
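
One way to read the trade-off from this output is to express each run relative
to the simplest model. The sketch below parses three of the
``GradientBoostingRegressor`` summary lines copied from the output above; the
regular expression and variable names are illustrative assumptions, not part
of the example script.

.. code-block:: python

    import re

    # Summary lines copied verbatim from the script output above.
    output = """\
    Complexity: 10 | MSE: 29.7323 | Pred. Time: 0.000123s
    Complexity: 100 | MSE: 7.0518 | Pred. Time: 0.000296s
    Complexity: 500 | MSE: 6.3365 | Pred. Time: 0.000993s
    """

    pattern = re.compile(
        r"Complexity: (\d+) \| MSE: ([\d.]+) \| Pred\. Time: ([\d.]+)s")
    rows = [(int(c), float(m), float(t)) for c, m, t in pattern.findall(output)]

    # Compare every run against the least complex model (first row).
    base_complexity, base_mse, base_time = rows[0]
    for complexity, mse, pred_time in rows:
        print(f"complexity {complexity:>3}: "
              f"MSE x{mse / base_mse:.2f}, "
              f"latency x{pred_time / base_time:.2f}")

In this run, going from 10 to 500 estimators costs roughly eight times the
prediction latency, while the MSE at 500 estimators (6.3365) is slightly worse
than at 200 (6.1267), so the extra complexity no longer improves accuracy.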


**Python source code:** :download:`plot_model_complexity_influence.py <plot_model_complexity_influence.py>`

.. literalinclude:: plot_model_complexity_influence.py
    :lines: 16-

**Total running time of the example:** 60.29 seconds
(1 minute 0.29 seconds)