Introduction to data analysis and linear regression in R

This repository contains content for a 5-day course for new PhD students (and other interested people), run within the School of Biological, Earth & Environmental Sciences (BEES) at the University of New South Wales.

Details for this session are as follows:

  • Dates:
    • Monday 03 February to Tuesday 04 February (9.00am to 5.00pm) - Data Manipulation and Visualisation
    • Monday 10 February to Tuesday 11 February - Introduction to design and analysis + Introduction to linear modelling
  • Audience: New HDR or Hons students in BEES
  • Venue: BEES Teaching Lab 3, Ground Floor D26
  • What to bring: your laptop
  • Presenters
    • Daniel Falster (BEES)
    • Will Cornwell (BEES)
    • Dony Indiarto (BEES)
    • Eve Slavich (Stats Central)
    • Gordana Popovic (Stats Central)

Aims & Content

Day 1 – Introduction to R (for beginners) [ Will Cornwell ]

Getting started with R

  • Introduction to RStudio
  • Introduction to coding in R
  • Getting data in and out of R
  • R objects and classes
  • Packages

Days 2-3 Project management, data manipulation & data visualisation [ Daniel Falster, Will Cornwell, Dony Indiarto ]

Topics

  • Projects: Organising and managing data
  • Reproducible research with R Markdown
  • Data manipulation with the tidyverse
  • Data visualisation with ggplot
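
As a rough taste of the ggplot topic, the sketch below builds a faceted scatter plot. It is only an illustration, not course material: it assumes the ggplot2 package is installed and uses the `mtcars` dataset that ships with R in place of the course data.

```r
library(ggplot2)

# mtcars is a built-in dataset, used here as a stand-in for course data
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +                                   # one point per car
  facet_wrap(~cyl) +                               # one panel per cylinder count
  labs(x = "Weight (1000 lb)", y = "Miles per gallon") +
  theme_classic()                                  # swap in a different theme

# Printing the object draws the plot
print(p)
```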

Lesson plan (Day 1)

  • 9:30 Intro (Dan)

  • 9:45 Getting organised: Projects, path names, folders (Dan)

  • 10:30 Rmd files (Dan)

  • 11:00 MORNING TEA

  • 11:15 Reading data with readr (Dan)

  • 11:45 Data manipulation with dplyr (Dan)

    • filter, select, mutate, rename, arrange, summarise
    • pipes
  • 12:30 LUNCH

  • 13:30 Imagine your plot (Will)

  • 14:30 Intro to data visualisation with ggplot (Dony)

  • 15:15 AFTERNOON TEA

  • 15:30 Exercises
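
For orientation, the dplyr verbs and the pipe listed in this plan can be chained as in the sketch below. It is a minimal illustration rather than the course's own exercise: it assumes the dplyr package is installed and uses the built-in `mtcars` dataset; the names `cars_summary`, `weight` and `weight_kg` are made up for the example.

```r
library(dplyr)

# mtcars ships with R; it stands in here for the course data
cars_summary <- mtcars %>%
  filter(cyl %in% c(4, 6)) %>%            # keep rows: 4- and 6-cylinder cars
  select(mpg, cyl, wt) %>%                # keep three columns
  rename(weight = wt) %>%                 # rename a column
  mutate(weight_kg = weight * 453.6) %>%  # wt is in units of 1000 lb (about 453.6 kg)
  arrange(desc(mpg)) %>%                  # sort rows, highest mpg first
  summarise(n = n(), mean_mpg = mean(mpg)) # collapse to a one-row summary

cars_summary
```

Each verb takes a data frame and returns a data frame, which is what lets the pipe pass the result of one step straight into the next.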

Lesson plan (Day 2)

  • 9:30 Tidy Data concept (Dony)

    • pivots
  • 10:00 Advanced data manipulation with dplyr (Dan)

    • group_by (summarise, mutate)
    • join
  • 11:00 MORNING TEA

  • 11:15 Advanced data visualisation with ggplot (Will)

    • (extend plots from Day 1 in various ways)
    • facets
    • styles: themes, scales, labels, palettes
    • multiple plot layouts with patchwork
  • 12:30 LUNCH

  • 13:30 Data wrangling & visualisation challenge (Dan)

  • 15:15 AFTERNOON TEA

  • 15:30 Extensions

    • ggplot in talks (Rose O'Dea)
    • ggplot extensions (Will)
    • Reproducible research (Dan)
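
The tidy data and advanced dplyr topics in this plan (pivots, `group_by` with `summarise` and `mutate`, joins) might look roughly like the sketch below. It assumes the dplyr and tidyr packages are installed; the tables `counts_wide` and `site_info` and their values are hypothetical, invented purely for illustration.

```r
library(dplyr)
library(tidyr)

# Hypothetical wide table: one row per site, one column per year
counts_wide <- tibble(
  site = c("A", "B"),
  `2019` = c(10, 4),
  `2020` = c(12, 7)
)

# pivot_longer() reshapes to tidy form: one row per site-year observation
counts_long <- counts_wide %>%
  pivot_longer(cols = -site, names_to = "year", values_to = "count")

# group_by() + summarise() collapses each group to a single row
site_totals <- counts_long %>%
  group_by(site) %>%
  summarise(total = sum(count))

# group_by() + mutate() keeps every row and adds a per-group value
counts_long <- counts_long %>%
  group_by(site) %>%
  mutate(prop = count / sum(count)) %>%
  ungroup()

# left_join() adds columns from a second table by matching on a key
site_info <- tibble(site = c("A", "B"), habitat = c("forest", "heath"))
counts_joined <- left_join(counts_long, site_info, by = "site")
```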

Days 3-4 Introduction to design and analysis and linear modelling [ Eve Slavich, Gordana Popovic ]

Introduction to statistics

  • Which method do you use when?
  • Statistical inference
  • Two-sample t-test

Introduction to Experimental design

  • Sample sizes
  • Treatments

Linear regression

  • Linear regression
  • Equivalence of two-sample t and linear regression
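
The equivalence of the two-sample t-test and linear regression can be checked directly in base R. The sketch below is an illustration only, using the built-in `mtcars` dataset rather than course data. It relies on the equal-variance form of the t-test (`var.equal = TRUE`); the default Welch test would not match the regression exactly.

```r
# Two-sample t-test assuming equal variances: mpg by transmission type (am = 0 or 1)
tt <- t.test(mpg ~ am, data = mtcars, var.equal = TRUE)

# The same comparison as a linear model with a two-level predictor
fit <- lm(mpg ~ am, data = mtcars)
lm_row <- summary(fit)$coefficients["am", ]

t_from_test <- unname(tt$statistic)
t_from_lm   <- unname(lm_row["t value"])
p_from_test <- tt$p.value
p_from_lm   <- unname(lm_row["Pr(>|t|)"])

# The t statistics agree up to sign (the two functions subtract the group
# means in opposite orders) and the p-values are identical
all.equal(abs(t_from_test), abs(t_from_lm))
all.equal(p_from_test, p_from_lm)

# The regression slope is the difference between the two group means
slope     <- unname(coef(fit)["am"])
mean_diff <- unname(diff(tt$estimate))
all.equal(slope, mean_diff)
```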

Linear models

  • Multiple regression
  • Analysis of variance (and equivalence to multiple regression)
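
In the same spirit, a one-way analysis of variance gives the same F statistic as a regression on the grouping factor. The sketch below is a minimal base R check, assuming the built-in `PlantGrowth` dataset (plant weights under a control and two treatments) as a stand-in for course data.

```r
# One-way ANOVA: plant weight by treatment group (ctrl, trt1, trt2)
aov_fit <- aov(weight ~ group, data = PlantGrowth)

# The same model fitted as a regression on dummy-coded group indicators
lm_fit <- lm(weight ~ group, data = PlantGrowth)

# F statistic for the group effect from the ANOVA table
f_from_aov <- summary(aov_fit)[[1]][["F value"]][1]

# Overall F statistic reported for the regression
f_from_lm <- unname(summary(lm_fit)$fstatistic["value"])

all.equal(f_from_aov, f_from_lm)
```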

Weirder linear models

  • Blocked and paired designs
  • ANCOVA
  • Factorial experiments
  • Interactions in regression
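
As a small illustration of interactions in regression, R's formula syntax `a * b` expands to both main effects plus their interaction `a:b`. The sketch below uses the built-in `mtcars` dataset, chosen here only as an example, to show that the two ways of writing the model are the same fit.

```r
# Shorthand: wt * am expands to wt + am + wt:am
fit_star <- lm(mpg ~ wt * am, data = mtcars)

# The same model written out in full
fit_long <- lm(mpg ~ wt + am + wt:am, data = mtcars)

# Both fits estimate an intercept, two main effects and one interaction term
names(coef(fit_star))
all.equal(coef(fit_star), coef(fit_long))
```

The `wt:am` coefficient is the difference in the slope of `mpg` on `wt` between the two transmission types.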

Installation instructions

The course assumes you have the R software and the development environment RStudio installed on your computer.

R can be downloaded from the Comprehensive R Archive Network (CRAN).

The Desktop version of RStudio can be downloaded from the Posit (formerly RStudio) website.