
Reproducibility in research – Ensuring the transparency and credibility of your work
- Part 1: May 28, 2026 (9:00 to 12:00)
- Part 2: May 28, 2026 (12:30 to 15:00)
Location: SFU Harbour Centre, 515 West Hastings Street, room 2945 McLean Management Studies Lab (located on the second floor)
Registration required (see registration link).
Sponsored by an Alfred P. Sloan Foundation grant to Cornell University (Lars Vilhuber) and by the Research Chair in Intergenerational Economics, ESG UQAM (Marie Connolly). Part of the Canadian Economics Association meetings pre-program.



- 9:00 Welcome
- 9:05 Walkthrough
- 9:15 Goals
- 9:30 Technical setup, possible team formation
- 9:45 🔒Hands-on Exercise: A very imperfect example
- 10:00 Discussion of the “Very imperfect example”
- 10:30 Break (15 minutes)
- 10:45 Day 1:
- 12:00 Lunch Break (30 minutes)
- 12:30 How to run Stata! or R! (reproducibly)
- 12:45 Extra: How to install Stata packages
- 13:00 Topic A (see Survey)
- 13:45 Topic B (see Survey)
- 14:30 Hands-on: Improving the replication package (very imperfect -> a lot better)
- 14:50 Hands-on: Testing it all
- 14:55 Wrap up
We want to get to know you a bit better, so we can adapt this and future workshops to your needs. If you haven’t done so already, please do fill out this short survey.
Results
Last updated: 28 May 2026, 16:17
All responses between 30 April 2026 and 29 May 2026.
Education
| Degree | Frequency | Frequency (Complete) | Percent |
|---|---|---|---|
| graduate student (Ph.D.) | 3 | 3 | 30 |
| graduate student (Masters) | 4 | 4 | 40 |
| faculty member (tenured) | 2 | 2 | 20 |
| Other | 1 | 1 | 10 |

Note: Percent is calculated as the proportion of completed responses.

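
As a small illustration of the note above, the Percent column can be recomputed from the completed-response frequencies. This is a minimal sketch in Python, not the script that produced the table: the dictionary simply copies the frequencies from the Education table, and the function name `percent_of_completed` is hypothetical.

```python
# Completed-response frequencies, copied by hand from the Education table.
completed = {
    "graduate student (Ph.D.)": 3,
    "graduate student (Masters)": 4,
    "faculty member (tenured)": 2,
    "Other": 1,
}


def percent_of_completed(counts):
    """Return each category's share of all completed responses, in percent."""
    total = sum(counts.values())
    return {label: 100 * n / total for label, n in counts.items()}


percentages = percent_of_completed(completed)

# Print one line per category, rounded to whole percent as in the table.
for label, pct in percentages.items():
    print(f"{label}: {pct:.0f}")
```

Comparing the printed values with the published column is a quick, informal consistency check on the table.
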
Operating System by Educational Category
| OS | Academic | Other | Overall |
|---|---|---|---|
| Mac | 11.1% | 100.0% | 20.0% |
| Windows | 88.9% | 0.0% | 80.0% |

Note: Completed responses only. Multiple mentions possible. Cells show percent of OS mentions by group.

Intermediate or Advanced Programming Language Usage
| Programming Language | Frequency | Percent |
|---|---|---|
| R | 2 | 20 |
| Stata | 7 | 70 |

Note: Completed responses only. Multiple mentions possible.

Command Line Usage by Educational Category
| Feature | Academic | Other | Overall |
|---|---|---|---|
| Command line used often | 22.2% | 0.0% | 20.0% |

Note: Completed responses only. Cells show percent of feature mentions by group.

For Part 2, we have a choice:
- Reproducibility when some data are confidential
- Preserving raw survey data
- Advanced self-checking of reproducibility
To identify what we should be speaking about in the afternoon, please fill out this other survey:
https://cornell.yul1.qualtrics.com/jfe/form/SV_cNkhKL69K2Ob7o2
Results
| Preferred_topic | Frequency | Percent |
|---|---|---|
| Advanced self-checking of reproducibility | 17 | 56.67 |
| Preserving raw survey data | 1 | 3.33 |
| Reproducibility when some data are confidential | 12 | 40.00 |
- Discussions on concerns about ethics in data dissemination (privacy, scraping, copyright, etc.)
- Reproducibility for LLM and AI
Guidance
Some additional guidance can be found on the website of the Social Science Data Editors (URLs subject to change):
Examples of replication packages
With confidential data
- https://doi.org/10.3886/E154241V2 contains not only code, but also faces the problem that variables in the IRS data cannot be revealed. Their workaround is not the same as the one in this tutorial.
- https://doi.org/10.3886/E162581V1
Using containers:
- Kline et al. (2024) “A Discrimination Report Card”: primary replication package, with container specification, image on Docker Hub, and preserved image on Zenodo.
- Herbert et al. (2024) “Reproduce to validate”: primary replication package, container specification and preserved image on Borealis.ca.
Extra info
- This document’s source: https://github.com/larsvilhuber/summer-school-sfu-ubc-2026
- Licensed under

Coming.
