2018 Analyzing Bias
Analyzing Bias in Machine Learning Research
Communicating research to colleagues, peers, and the community at large is the bread and butter of researchers. In doing so, they are responsible for following basic principles such as transparency and accuracy. Bias is any deviation from the truth in data collection, analysis, interpretation, or publication.
Recently, a diverse group of stakeholders representing academia, funding agencies, publishers, and librarians started an effort to design clear and measurable principles for scientific data management and stewardship, called the FAIR Data Principles. FAIR stands for Findability, Accessibility, Interoperability, and Reusability of data. As more researchers follow these principles, it is becoming increasingly feasible to analyze other scientists' papers. With access to the data and methods used to support a particular hypothesis, one can repeat and/or reproduce the original experiment from a paper.
This project will focus on analyzing bias in machine learning research, based on a small number of papers (a number feasible for a 10-week internship). The papers will be carefully selected from major machine learning journals and conferences, and from various science and engineering fields.
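As one illustration of how bias can enter at the analysis stage, the sketch below simulates a simple form of p-hacking: measuring several metrics and reporting only the best one. It is a minimal example under stated assumptions, not the method this project will use. The function names, sample sizes, and the choice of a z-test with a known standard deviation are all hypothetical and chosen only to keep the example self-contained.

```python
import math
import random


def two_sided_p(sample_a, sample_b, sigma=1.0):
    """Two-sided z-test p-value for equal means, assuming a known std dev sigma."""
    n_a, n_b = len(sample_a), len(sample_b)
    diff = sum(sample_a) / n_a - sum(sample_b) / n_b
    standard_error = sigma * math.sqrt(1.0 / n_a + 1.0 / n_b)
    z = diff / standard_error
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


def false_positive_rate(n_metrics, n_trials=1000, n_samples=20, alpha=0.05, seed=0):
    """Fraction of simulated studies that report a 'significant' result.

    Both groups are always drawn from the same distribution, so there is no
    real effect and every significant result is a false positive. A study
    counts as significant if ANY of its n_metrics comparisons has p < alpha,
    which mimics reporting only the best-looking metric.
    """
    rng = random.Random(seed)
    significant_studies = 0
    for _ in range(n_trials):
        p_values = []
        for _ in range(n_metrics):
            group_a = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
            group_b = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
            p_values.append(two_sided_p(group_a, group_b))
        if min(p_values) < alpha:
            significant_studies += 1
    return significant_studies / n_trials


honest = false_positive_rate(n_metrics=1)   # one pre-specified comparison
hacked = false_positive_rate(n_metrics=10)  # best of ten comparisons
print(f"one metric:  {honest:.3f}")
print(f"ten metrics: {hacked:.3f}")
```

With a single pre-specified comparison, the false positive rate should stay close to the nominal 0.05. With ten independent comparisons and no correction, the theoretical rate is 1 - 0.95^10, or roughly 0.40, which is why selective reporting of metrics is a common source of bias.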
- Basic statistical data analysis skills are required
- Basic Python programming skills are required
Desirable Skills / Qualifications:
- Understanding p-hacking methods
- Bi-weekly presentations to other internship/REU students. Notre Dame’s Center for Research Computing has an active REU program, with over 20 students participating every summer.
- Each analyzed paper will be delivered in the form of a WholeTale container.
- Support from the WholeTale dev team will be provided as needed
- Final poster presentation
Primary Mentor: Jarek Nabrzyski, Center for Research Computing at Notre Dame
Secondary Mentor(s): Jessica Young, Center for Social Research at Notre Dame