Preface
The most exciting phrase to hear in science, the one that heralds new discoveries, is not ‘Eureka!’ but ‘That’s funny…’
In this course, we will use the following tools:
ILIAS: the learning management platform at the UoC. You should all be registered there already.
Campuswire: a chat platform to reduce the number of emails and to allow a more natural exchange between participants and lecturer. You should have received an invitation email. If not, please email me.
Intended learning outcomes (ILOs)
At the end of this course you should be able to
- Import/read data into R.
- Prepare data for analysis.
- Visualize data.
- Explain and apply statistical methods learnt in this course.
- Combine code and report in a reproducible way.
- Apply selected methods learnt in this course to a new data set and write a reproducible report.
Literature
We will mainly use the book ModernDive: Statistical Inference via Data Science (Ismay and Kim 2021). From time to time, I will also recommend R for Data Science (Wickham and Grolemund 2021) and OpenIntro Statistics (Diez, Çetinkaya-Rundel, and Barr 2019). For your report, you will carry out an additional literature search depending on your topic.
Why these lecture notes
This is a living document that will be updated during the course. It is not comprehensive, but it should help you navigate the introduction to R and statistics smoothly.
I will use boxes of different colours for:
- Information and tips
- Learning outcomes
- Important points
- Definitions
- Exercises inside a chapter
Acknowledgements
This document draws on the free material provided by
ModernDive: Ismay and Kim (2021) and their free Problem Sets authored by Jenny Smetzer, William Hopper, Albert Y. Kim, and Chester Ismay (https://moderndive.github.io/moderndive_labs/index.html)
R for Data Science (r4ds): Wickham and Grolemund (2021)
Data Science in a Box (https://datasciencebox.org/) and the free book by Diez, Çetinkaya-Rundel, and Barr (2019)
One cannot thank these people enough for their contributions to the community!
Credit: https://xkcd.com/2400/
Reproducibility
This book was written in RStudio using Bookdown and compiled in R version 4.3.0 (2023-04-21). You will need the following packages to reproduce the examples and to work through the exercises:
package | version | source |
---|---|---|
dabestr | 0.3.0 | Github (ACCLAB/dabestr@8775899f7eba743a6a32bd2fdab5f57e79401fd6) |
emojifont | 0.5.5 | CRAN (R 4.2.0) |
fontawesome | 0.3.0 | CRAN (R 4.2.1) |
gapminder | 0.3.0 | CRAN (R 4.2.2) |
infer | 1.0.3 | CRAN (R 4.2.2) |
lubridate | 1.8.0 | CRAN (R 4.2.0) |
moderndive | 0.5.3 | CRAN (R 4.2.0) |
tidyverse | 1.3.1 | CRAN (R 4.2.0) |
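As a minimal sketch, the packages in the table above could be installed as follows. The names `cran_pkgs` and `missing_pkgs()` are hypothetical helpers introduced here for illustration, and the GitHub step assumes the remotes package is available. Note that `install.packages()` fetches the current CRAN releases, which may be newer than the versions listed in the table. The install calls are commented out because they require an internet connection.

```r
# CRAN packages listed in the table above
cran_pkgs <- c("emojifont", "fontawesome", "gapminder", "infer",
               "lubridate", "moderndive", "tidyverse")

# Hypothetical helper: return the packages in `pkgs` that are not yet installed
missing_pkgs <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# Install only what is missing (uncomment to run; needs internet access)
# install.packages(missing_pkgs(cran_pkgs))

# dabestr is listed with a GitHub source; this assumes 'remotes' is installed
# and fetches the latest commit rather than the exact one in the table
# remotes::install_github("ACCLAB/dabestr")
```

Checking for missing packages first avoids reinstalling packages you already have.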
The complete session information from the most recent build of the book:
## R version 4.3.0 (2023-04-21)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 22.04.2 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/atlas/libblas.so.3.10.3
## LAPACK: /usr/lib/x86_64-linux-gnu/atlas/liblapack.so.3.10.3; LAPACK version 3.10.0
##
## locale:
## [1] LC_CTYPE=de_DE.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=de_DE.UTF-8 LC_COLLATE=de_DE.UTF-8
## [5] LC_MONETARY=de_DE.UTF-8 LC_MESSAGES=de_DE.UTF-8
## [7] LC_PAPER=de_DE.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C
##
## time zone: Europe/Berlin
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] DT_0.22 forcats_0.5.1 stringr_1.4.0 dplyr_1.0.9
## [5] purrr_0.3.4 readr_2.1.2 tidyr_1.2.0 tibble_3.1.7
## [9] ggplot2_3.3.6 tidyverse_1.3.1 kableExtra_1.3.4 fontawesome_0.3.0
##
## loaded via a namespace (and not attached):
## [1] gtable_0.3.0 xfun_0.31 bslib_0.3.1 htmlwidgets_1.5.4
## [5] tzdb_0.3.0 vctrs_0.5.1 tools_4.3.0 generics_0.1.2
## [9] fansi_1.0.3 highr_0.9 pkgconfig_2.0.3 dbplyr_2.1.1
## [13] desc_1.4.1 readxl_1.4.0 assertthat_0.2.1 webshot_0.5.4
## [17] lifecycle_1.0.3 compiler_4.3.0 munsell_0.5.0 htmltools_0.5.2
## [21] sass_0.4.1 yaml_2.3.5 pillar_1.7.0 crayon_1.5.1
## [25] jquerylib_0.1.4 ellipsis_0.3.2 cachem_1.0.6 sessioninfo_1.2.2
## [29] tidyselect_1.2.0 rvest_1.0.2 digest_0.6.29 stringi_1.7.6
## [33] bookdown_0.26 rprojroot_2.0.3 fastmap_1.1.0 grid_4.3.0
## [37] colorspace_2.0-3 cli_3.4.1 magrittr_2.0.3 utf8_1.2.2
## [41] tufte_0.12 broom_1.0.1 withr_2.5.0 scales_1.2.0
## [45] backports_1.4.1 lubridate_1.8.0 modelr_0.1.8 rmarkdown_2.14
## [49] httr_1.4.2 cellranger_1.1.0 hms_1.1.1 memoise_2.0.1
## [53] evaluate_0.15 knitr_1.39 haven_2.5.0 viridisLite_0.4.0
## [57] rlang_1.0.6 downlit_0.4.0 glue_1.6.2 DBI_1.1.2
## [61] xml2_1.3.3 reprex_2.0.1 svglite_2.1.0 rstudioapi_0.13
## [65] jsonlite_1.8.0 R6_2.5.1 systemfonts_1.0.4 fs_1.5.2
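To compare your own setup with the listing above, you could print the same kind of information on your machine. This is only a sketch using base R functions. The variable name `info` is an arbitrary choice, and `"stats"` is used as the example package because it ships with every R installation.

```r
# Collect R version, platform, locale, and attached/loaded packages
info <- sessionInfo()
print(info)

# The version of R itself as a single string
print(R.version.string)

# The installed version of a single package
print(packageVersion("stats"))
```

If an example from these notes behaves differently for you, differences in package versions between your output and the listing above are a good first thing to check.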
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.