Detailed information about the course


Reproducible and Collaborative Data Analysis With R


2-6 Sept 2024

Workshop language is English

Andrin Dürst, UNIBE


Dr. Sergio Vignali, University of Bern


The computational part of a research project is considered reproducible when other scientists (including ourselves in the future) can obtain identical results using the same code, data, workflow and software. Research results are often based on complex statistical analyses that rely on a variety of software. This makes it difficult to guarantee reproducibility, which is increasingly considered a requirement for assessing the validity of scientific claims. During this one-week course, participants will be introduced to a suite of tools they can use in combination with R to make the computational part of their own research reproducible.

On day 1, students learn about the most important aspects that make research reproducible, which go beyond simply sharing R code. These include problems arising from the use of different package versions, R versions, and operating systems. The concept of a research compendium is introduced and proposed as a general framework for organising any research project.

Day 2 provides a comprehensive introduction to version control using Git and GitHub, which are fundamental tools for keeping track of code changes and for collaborating with other people on the same project. Students will also be introduced to literate programming using Quarto, the scientific and technical publishing system released by RStudio (now Posit) as the successor to R Markdown.

On day 3, participants learn how to use Quarto to write their own article so that the output of the R analysis (i.e. results, tables, and figures) is bound together with the text. Students will also learn how to use templates to meet the requirements of different journals.

Day 4 is dedicated to data pipelines and workflows using GNU Make, a tool for running complex analyses efficiently, particularly when the analysis involves interdependencies between several files.

Finally, the last day is dedicated to Docker, a popular tool to create reproducible computational environments.

On each day, students will get an introduction to a different tool and practise its use together with the teacher on provided examples. Each of these tools is crucial to making a research analysis reproducible. The goal is that, by the end of the course, each student will be able to create a fully reproducible research compendium.




