A self-paced online workshop developed by staff at Griffith University Library.
OpenRefine is an open-source tool for exploring, cleaning, organising, combining and transforming data. You can use it to standardise messy or inconsistent data. OpenRefine is particularly powerful when working with large datasets, as it can accommodate spreadsheets and files containing millions of rows or lines of text.
In this self-paced online workshop, you will learn basic data cleaning techniques such as:
- exploring tabular data through facets and filters
- implementing ‘tidy data’ principles
- cleaning, organising and preparing data for analysis
- extracting and using a script to automate wrangling on similar data
Download the software and dataset, then work through the activities and videos that guide you through the lessons. Allow around two and a half hours to complete the workshop.
All materials in these lessons are licensed CC BY.
Content for this lesson was adapted from Data Carpentry & Library Carpentry lessons.
Griffith University - CRICOS Provider Number 00233E.