About us

This is a community effort, and as such many people have contributed to it over the years.

History

This project was started in 2007 as a Google Summer of Code project by David Cournapeau. Later that year, Matthieu Brucher started work on this project as part of his thesis.

In 2010 Fabian Pedregosa, Gael Varoquaux, Alexandre Gramfort and Vincent Michel of INRIA took leadership of the project and made the first public release on February 1, 2010. Since then, releases have appeared on a roughly three-month cycle, and a thriving international community has led the development.

People

Citing scikit-learn

If you use scikit-learn in a scientific publication, we would appreciate citations to the following paper:

Scikit-learn: Machine Learning in Python, Pedregosa et al., JMLR 12, pp. 2825-2830, 2011.

Bibtex entry:

@article{scikit-learn,
 title={Scikit-learn: Machine Learning in {P}ython},
 author={Pedregosa, F. and Varoquaux, G. and Gramfort, A. and Michel, V.
         and Thirion, B. and Grisel, O. and Blondel, M. and Prettenhofer, P.
         and Weiss, R. and Dubourg, V. and Vanderplas, J. and Passos, A. and
         Cournapeau, D. and Brucher, M. and Perrot, M. and Duchesnay, E.},
 journal={Journal of Machine Learning Research},
 volume={12},
 pages={2825--2830},
 year={2011}
}

If you want to cite scikit-learn for its API or design, you may also want to consider the following paper:

API design for machine learning software: experiences from the scikit-learn project, Buitinck et al., 2013.

Bibtex entry:

@inproceedings{sklearn_api,
  author    = {Lars Buitinck and Gilles Louppe and Mathieu Blondel and
               Fabian Pedregosa and Andreas Mueller and Olivier Grisel and
               Vlad Niculae and Peter Prettenhofer and Alexandre Gramfort
               and Jaques Grobler and Robert Layton and Jake VanderPlas and
               Arnaud Joly and Brian Holt and Ga{\"{e}}l Varoquaux},
  title     = {{API} design for machine learning software: experiences from the scikit-learn
               project},
  booktitle = {ECML PKDD Workshop: Languages for Data Mining and Machine Learning},
  year      = {2013},
  pages = {108--122},
}

Artwork

High quality PNG and SVG logos are available in the doc/logos/ source directory.

Funding

INRIA actively supports this project. It has provided funding for Fabian Pedregosa (2010-2012), Jaques Grobler (2012-2013) and Olivier Grisel (2013-2015) to work on this project full-time. It also hosts coding sprints and other events.

The Paris-Saclay Center for Data Science funded a developer to work on the project full-time for one year (2014-2015).

The NYU Moore-Sloan Data Science Environment funds Andreas Mueller (2014-2015) to work on this project. It also funds several students to work on the project part-time.

The following students were sponsored by Google to work on scikit-learn through the Google Summer of Code program.

  • 2007 - David Cournapeau
  • 2011 - Vlad Niculae
  • 2012 - Vlad Niculae, Immanuel Bayer
  • 2013 - Kemal Eren, Nicolas Trésegnie
  • 2014 - Hamzeh Alsalhi, Issam Laradji, Maheshakya Wijewardena, Manoj Kumar

Google also provided funding for sprints and events around scikit-learn. If you would like to participate in the next Google Summer of Code program, please see this page.

The NeuroDebian project, which provides Debian packaging and contributions, is supported by Dr. James V. Haxby (Dartmouth College).

The PSF helped find and manage funding for our 2011 Granada sprint. More information can be found here.

tinyclues funded the 2011 international Granada sprint.

Donating to the project

If you are interested in donating to the project or to one of our code sprints, you can use the PayPal button below or the NumFOCUS Donations Page (if you use the latter, please indicate that you are donating to the scikit-learn project).

All donations will be handled by NumFOCUS, a non-profit organization managed by a board of SciPy community members. NumFOCUS’s mission is to foster scientific computing software, in particular in Python. As the fiscal home of scikit-learn, it ensures that money is available when needed to keep the project funded and available while complying with tax regulations.

Donations received for the scikit-learn project will mostly go towards covering travel expenses for code sprints, as well as towards the project's organization budget [1].




Notes

[1] Regarding the organization budget in particular, we might use some of the donated funds to pay for other project expenses such as DNS, hosting or continuous integration services.

The 2013 Paris international sprint

telecom tinyclues afpy FNRS

IAP VII/19 - DYSCO

For more information on this sprint, see here.

Infrastructure support

  • We would like to thank Rackspace for providing us with a free Rackspace Cloud account to automatically build the documentation and the example gallery from the development version of scikit-learn using this tool.
  • We would also like to thank Shining Panda for free CPU time on their Continuous Integration server.