Analyzing Bias in Machine Learning Research

Project Description

Communicating research to colleagues, peers, and the community at large is the bread and butter of researchers. In doing so, they are responsible for following basic principles such as transparency and accuracy. Bias is any deviation from the truth in data collection, analysis, interpretation, or publication.

Recently, a diverse group of stakeholders representing academia, funding agencies, publishers, and librarians started an effort to design clear and measurable principles for scientific data management and stewardship, called the FAIR Data Principles. FAIR stands for Findability, Accessibility, Interoperability, and Reusability of data. As more researchers follow these principles, it is becoming increasingly feasible to analyze other scientists' papers. With access to the data and methods that were used to prove particular hypotheses, one can repeat and/or reproduce the original experiment from a paper.

This project will focus on analysing bias in machine learning research, based on a small number of papers (small enough to be feasible for a 10-week internship). The papers will be carefully selected from major machine learning journals and conferences, and from various science and engineering fields.

Necessary Prerequisites:

Desirable Skills / Qualifications:

Expected Outcomes:

Primary Mentor: Jarek Nabrzyski, Center for Research Computing at Notre Dame

Secondary Mentor(s): Jessica Young, Center for Social Research at Notre Dame