Tutorial reproducibility
Please fill out this survey on your background and skills so that we know who you are. It will help us improve the presentation and make it more relevant for you.
https://cornell.yul1.qualtrics.com/jfe/form/SV_bBqbJ9cSSJdOBw2
Results from Survey 1
Last updated: 05 September 2025, 20:40
Education
Degree | Frequency | Frequency (Complete) | Percent |
---|---|---|---|
graduate student (Ph.D.) | 15 | 15 | 88.24 |
graduate student (Masters) | 1 | 1 | 5.88 |
faculty member (tenured) | 1 | 1 | 5.88 |

Note: Percent is calculated as the proportion of completed responses.
Operating System
OS | Frequency | Percent |
---|---|---|
Linux | 1 | 5.88 |
Mac | 14 | 82.35 |
Windows | 2 | 11.76 |

Note: Completed responses only. Multiple mentions possible.
Command Line Usage
Feature | Frequency | Percent |
---|---|---|
Command line used | 0 | 0 |
System with more than 6 CPUs | 0 | 0 |

Note: Completed responses only.
One of the following (or a linear combination):
- Reproducibility when some data are confidential
- Advanced self-checking of reproducibility
- Preserving raw survey data early in the research lifecycle (ethically!)
Survey 2
To identify what we should be speaking about on Day 2, please fill out this other survey:
https://cornell.yul1.qualtrics.com/jfe/form/SV_cNkhKL69K2Ob7o2
Results from Survey 2
Last updated: 05 September 2025, 20:40
No data yet. Check back later.
The last day serves to review the various materials, handle any questions not addressed (in detail) on the previous days, and discuss experiences and difficulties applying these principles in your work.
Topics may include:
- Advanced self-checking of reproducibility (the more technical parts)
- Preserving raw survey data early in the research lifecycle (ethically!)
- Reproducibility for LLM and AI
Guidance
Some additional guidance can be found on the website of the Social Science Data Editors (URLs subject to change):
Examples of replication packages
With confidential data
- https://doi.org/10.3886/E154241V2: contains more than code, but faces the problem that variables in the IRS data cannot be revealed. Its workaround differs from the one used in this tutorial.
- https://doi.org/10.3886/E162581V1
Using containers:
- Kline et al. (2024) “A Discrimination Report Card”: primary replication package, with container specification, image on Docker Hub, and preserved image on Zenodo.
- Herbert et al. (2024) “Reproduce to validate”: primary replication package, container specification and preserved image on Borealis.ca.
Extra info
- This document’s source: https://github.com/larsvilhuber/tutorial-reproducibility-2025
- Licensed under