Detailed information about the course

Reproducible Data Analysis With R


17-21 April 2023

Workshop language is English

Dr Sergio Vignali, UNIBE


The computational part of a research project is considered reproducible when other scientists (including ourselves in the future) can obtain identical results using the same code, data, workflow and software. Research results are often based on complex statistical analyses that rely on a variety of software. This makes it difficult to guarantee reproducibility, which is increasingly considered a requirement for assessing the validity of scientific claims.

During this one-week course, participants will be introduced to a suite of tools they can use in combination with R to make the computational part of their own research reproducible.

On day 1 the students learn about the most important aspects that make research reproducible, which go beyond simply sharing R code. These include problems arising from the use of different package versions, R versions, and operating systems. The concept of a research compendium is introduced and proposed as a general framework to organise any research project.

Day 2 provides a comprehensive introduction to version control using Git and GitHub, which are fundamental tools for keeping track of code changes and for collaborating with other people on the same project. Students will also be introduced to literate programming using RMarkdown to create reproducible reports.

On day 3 the participants learn how to use RMarkdown to write their own article so that the outputs of the R analysis (i.e. results, tables, and figures) are bound together with the text. Students will also learn how to use templates to fulfil the requirements of different journals.

Day 4 is dedicated to data pipelines and workflows using GNU Make, a tool for running complex analyses efficiently, particularly when the analysis involves interdependencies between several files.

Finally, the last day is dedicated to Docker, a popular tool to create reproducible computational environments.

On each day, students will get an introduction to a different tool and practise its use together with the teacher on provided examples. Each of these tools is crucial to making a research analysis reproducible. In the afternoon sessions, participants can put what they have learned into practice and apply the newly acquired methods to their own analysis under the supervision of the instructor. The goal is that, by the end of the course, each student will be able to create a fully reproducible research compendium.

Participants are required to have some previous experience with R and must bring their own laptop with R and RStudio installed. They should also be able to install additional software on their own computer during the course.






Reimbursements for CUSO StarOmics students: train ticket, 2nd class, half-fare, from your institution to the place of the activity.

New since 2021: reimbursement of your travel tickets can be requested online through your MyCUSO account.

See HERE for the procedure.

For any questions concerning reimbursement, please contact the CUSO StarOmics coordinator, Corinne Dentan.

Other CUSO students: please contact the coordinator of your programme.



Deadline for registration