Reproducibility-by-Design
Experiments are the most important tool for testing the validity of scientific hypotheses. Validation requires multiple independent reproductions of an experiment, which makes reproducibility a critical feature of scientific experiments. Reproducibility is therefore an integral part of the design of the SLICES research infrastructure (RI). The ACM defines reproducibility as a three-stage process:
- Repeatability, when the same team recreates results on the same RI
- Reproducibility, when a different team recreates results on the same RI
- Replicability, when a different team recreates results using their own infrastructure
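The three stages can be read as a small decision rule over two questions: who recreates the results, and on which infrastructure. The following is only an illustrative sketch; the function name and its parameters are hypothetical. It assumes that the third stage involves a team other than the original one, as the phrase "their own infrastructure" suggests, and it treats the remaining combination (same team, different infrastructure) as not covered by the list.

```python
from typing import Optional


def classify_recreation(same_team: bool, same_infrastructure: bool) -> Optional[str]:
    """Map a successful recreation of results to one of the three stages.

    Hypothetical helper for illustration only; it is not part of any
    SLICES or pos tooling.
    """
    if same_infrastructure:
        # Same RI: the stage depends only on who recreates the results.
        return "repeatability" if same_team else "reproducibility"
    if not same_team:
        # A different team on its own infrastructure.
        return "replicability"
    # Same team on a different infrastructure is not covered by the list.
    return None
```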
Creating reproducible experiments requires additional effort from researchers, who must properly document and prepare their experimental artifacts. This effort is one of the reasons why experimental data is sometimes not released or is poorly documented. SLICES addresses this issue by providing researchers with tools that reduce the effort needed to create reproducible experiments.
We identified two key factors for creating reproducible experiments: live images and automation. The pos workflow relies entirely on Linux live images. For each experiment, all experiment nodes boot a Linux live image, which provides the same clean environment after every reboot. Because the nodes are reset at the beginning of each experiment, researchers know that the system configuration must be automated. The experiment workflow also requires the experiment itself to be fully automated, i.e., each step of the experiment is part of an experiment script. Because of this high degree of automation, the researchers themselves can recreate an experiment by simply re-running its scripts, achieving repeatability. If access to the RI is shared, other researchers can recreate the experiment in the same way, achieving reproducibility.
Adhering to the pos workflow creates experiments that are inherently repeatable and reproducible. We call this property reproducibility-by-design because researchers reach the second stage (reproducibility) without spending additional effort. Creating replicable experiments is beyond the control of the individual researcher. However, experiments that adhere to the pos workflow document all necessary information within their scripts, providing the basis for independent replication.
If you are interested in the pos workflow and how it can be integrated into testbeds, have a look at our website [1] and our paper [2], which provide more information.
[1] http://gallenmu.github.io/pos-artifacts
[2] S. Gallenmüller, D. Scholz, H. Stubbe, and G. Carle, “The pos Framework: a Methodology and Toolchain for Reproducible Network Experiments,” in ACM CoNEXT ’21, https://dl.acm.org/doi/10.1145/3485983.3494841