Machine Learning Blog
2013
Machine learning is driven by case studies and datasets. An overview of publicly available sources:
Time-series:
• Economic: http://www.economicswebinstitute.org/ecdata.htm
• Industrial: http://homes.esat.kuleuven.be/~smc/daisy/daisydata.html
• TSDL: http://robjhyndman.com/TSDL/
• UK data: http://data.gov.uk/about
• EEG: http://sccn.ucsd.edu/~arno/fam2data/publicly_available_EEG_data.html
• Mike West: http://www.stat.duke.edu/~mw/ts_data_sets.html
• UWO: http://www.stats.uwo.ca/faculty/aim/epubs/datasets/default.htm
Data mining:
• MLdata: http://mldata.org/
• UCI data: http://archive.ics.uci.edu/ml/index.html
• INEX: http://inex.otago.ac.nz/
• Web Spam Challenge: http://webspam.lip6.fr/
• Clopinet: http://clopinet.com/challenges/
• KDnuggets: http://www.kdnuggets.com/datasets/competitions.html
• Delicious: http://www.delicious.com/pskomoroch/dataset, http://www.datawrangling.com/some-datasets-available-on-the-web
• Datamob: http://datamob.org
• Ranking: http://learningtorankchallenge.yahoo.com/, http://research.microsoft.com/en-us/projects/mslr/
• ed.ac.uk: http://www.inf.ed.ac.uk/teaching/courses/dme/html/datasets0405.html
• Million Song: http://labrosa.ee.columbia.edu/millionsong/
• Nokia: http://research.nokia.com/mdc
• Yandex: http://imat-relpred.yandex.ru/en
• Kaggle: http://www.kaggle.com/
• Mindboggle: http://mindboggle.info/index.html
• CAMRa: http://2011.camrachallenge.com/
• Statistical Machine Translation: http://www.statmt.org/
BioMed:
• Statlib: http://lib.stat.cmu.edu/datasets/
• StatSci: http://www.statsci.org/datasets.html
• Klein book: http://www.mcw.edu/biostatistics/Faculty/Faculty/JohnPKleinPhD/SurvivalAnalysisBook/DataSetsBothEditions.htm
• PhysioBank: http://physionet.caregroup.harvard.edu/physiobank/database/
• PhysioNet: http://www.physionet.org/challenge/
• GLIMs: http://www.sci.usq.edu.au/staff/dunn/Datasets/tech-glms.html
2/4/13
ML - datasets