Use Whole Tale to create and publish your own transparent and reproducible research.

Explore existing reproducible research created using Whole Tale.

Learn more about Whole Tale, an open source platform for reproducible research.

What is Whole Tale?

Whole Tale is an NSF-funded Data Infrastructure Building Block (DIBBS) initiative to build a scalable, open source, web-based, multi-user platform for reproducible research. The platform enables the creation, publication, and execution of tales: executable research objects that capture the data, code, and complete software environment used to produce research findings.

A beta version of the system is available at https://dashboard.wholetale.org.

The Whole Tale platform has been designed based on community input, primarily through working groups and collaborations with researchers.

The Whole Tale project is involved in several initiatives to train researchers in reproducible practices and to support the use of Whole Tale in the classroom.

Why Whole Tale?

Virtually all published discoveries today have data and computational components. There is a mismatch between traditional scientific dissemination practices and modern computational research practice, and this mismatch leads to reproducibility concerns. The Whole Tale platform supports computational reproducibility by enabling researchers to create and package the code, data, and information about the workflow and computational environment needed to review and reproduce the results of computational analyses reported in published research. Whole Tale does this by supporting explicit citation of externally referenced data and by capturing the artifacts and provenance information needed to understand, inspect, and execute the computational processes and workflows used, so that they can be reviewed and reproduced at the time of publication.

What is a Tale?

A tale is an executable research object that combines data (references), code (computational methods), computational environment, and narrative (traditional science story). Tales are captured in a standards-based format complete with metadata. A tale stores explicit references to the data and code used in computational experiments, both for reproducibility purposes and to permit the citation of the specific versions used in any subsequent research. A tale can be submitted to (i.e., published in) an external research repository and assigned a persistent identifier by the repository. The Whole Tale platform allows users to interactively create and edit tales and to re-run a tale to reproduce and verify the results obtained by the original tale creator.
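As a rough illustration of the components listed above, a tale can be pictured as a small record that holds references rather than copies. The sketch below is a hypothetical model only: the class names, field names, and placeholder identifiers are assumptions made for illustration and do not describe Whole Tale's actual format or API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataReference:
    """Hypothetical pointer to externally hosted data (a reference, not a copy)."""
    identifier: str  # e.g. a DOI or URL; placeholder values only
    version: str     # the specific version used, so it can be cited later


@dataclass
class Tale:
    """Illustrative model of a tale; all field names are assumptions."""
    title: str
    narrative: str    # the traditional science story
    environment: str  # description of the computational environment
    data: List[DataReference] = field(default_factory=list)
    code: List[str] = field(default_factory=list)  # paths to scripts or notebooks
    persistent_id: str = ""  # assigned by an external repository on publication

    def is_published(self) -> bool:
        """A tale counts as published once a repository has assigned an identifier."""
        return bool(self.persistent_id)

    def citations(self) -> List[str]:
        """List the exact data versions this tale depends on."""
        return [f"{ref.identifier} (version {ref.version})" for ref in self.data]


# Usage with made-up placeholder values.
example = Tale(
    title="Example tale",
    narrative="Short description of the study.",
    environment="Jupyter Notebook",
    data=[DataReference("doi:10.0000/example", "1.0")],
    code=["analysis.ipynb"],
)
print(example.citations())
print(example.is_published())
```

Keeping the data as versioned references mirrors the point made above: the specific versions used remain citable in later research.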

Featured Tales

LIGO Tutorial

LIGO Detected Gravitational Waves from Black Holes. On September 14, 2015 at 5:51 a.m. Eastern Daylight Time (09:51 UTC), the twin Laser Interferometer Gravitational-wave Observatory (LIGO) detectors, located in Livingston, Louisiana, and Hanford, Washington, USA, both measured ripples in the fabric of spacetime - gravitational waves - arriving at the Earth from a cataclysmic event in the distant universe.

Informatics-aided bandgap engineering for solar materials

Reproducing "Informatics-aided bandgap engineering for solar materials". This notebook shows how to replicate the main findings of a 2014 paper by Dey et al. that used machine learning to predict the band gap energies of solar cell materials.

Predicting the Properties of Inorganic Materials with Machine Learning

properties of materials. The main focus of this paper was the construction of a general-purpose method to link the composition of a material (i.e., the fractions of each element) to its properties, which the authors found can be used for applications such as identifying candidate solar cell materials. The notebooks within this tale recreate the validation tests from the paper and show how the models were used to discover new materials.

Replication of a classical ecological niche model

MaxEnt modeling for species distribution. In 2006, Phillips, Anderson and Schapire revisited estimation methods for probability density functions in order to improve predictions of species presence in a given location. This Tale replicates the logic and general results of the paper using the same datasets in a Jupyter Notebook. This Tale uses the scikit-learn Python package for data analysis; a second Tale is under development using intros-MaxEnt, a package aimed at ecologists with better control structures.

Long-term Ecological Research Panoche Hills Ecological Reserve

This Tale explores Ephedra californica as a foundation species within the San Joaquin Desert, California. These data are specific to Panoche Hills Ecological Reserve. Metadata are provided with the published data, and most measures are self-evident and follow the research team's standard protocols. However, canopy decadence is not scored on the typical Likert net 1-10 scale. It can vary from 1-10, but the score is estimated by visually breaking each canopy into 5 segments (the four cardinal directions and the top) and scoring each segment as 0 for dead, 1 for partially alive, and 2 for all alive.
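The canopy decadence scoring described above can be sketched in a few lines. This is a minimal sketch, not the research team's code: it assumes that the five segment scores are summed to give the overall score (the text gives the per-segment values but does not state how they are combined), and the function and segment names are hypothetical.

```python
# Hypothetical segment names: the four cardinal directions plus the top.
SEGMENTS = ("north", "east", "south", "west", "top")
VALID_SCORES = {0, 1, 2}  # 0 = dead, 1 = partially alive, 2 = all alive


def canopy_decadence_score(segment_scores):
    """Combine per-segment scores into one canopy score.

    Assumption: the overall score is the sum of the five segment scores.
    """
    missing = set(SEGMENTS) - set(segment_scores)
    if missing:
        raise ValueError(f"missing segments: {sorted(missing)}")
    for name in SEGMENTS:
        if segment_scores[name] not in VALID_SCORES:
            raise ValueError(f"score for {name} must be 0, 1, or 2")
    return sum(segment_scores[name] for name in SEGMENTS)


# Made-up example values for a single shrub.
shrub = {"north": 2, "east": 2, "south": 1, "west": 0, "top": 2}
print(canopy_decadence_score(shrub))  # prints 7
```

Under the summing assumption, a fully alive canopy scores 10.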

Why-Not-What-If-Prov of ASP computation within ASP using PWE

We demonstrate the use of an ASP encoding for recording the provenance of operations in the computation of a simple ASP.


If you are using Whole Tale or planning to try it, join our Slack workspace or subscribe to the wholetale email list to hear what other users are doing, ask questions, and follow or join the community discussion.

If you would like to suggest new features or participate in the discussion of implementation issues, please follow the project on GitHub.