Activity: Clean a sample dataset
- Format: individually or in small groups. This can also be run as a large-group walkthrough.
- Select a small dataset that needs to be cleaned, or find a cleaned, structured dataset and add some issues to it. If you choose the second option, note that you have altered the original dataset.
- Ask attendees to clean the dataset, documenting their steps along the way. This can be done with a specialized tool such as OpenRefine, or with a spreadsheet application such as Google Sheets or Excel.
- Discussion questions:
  - What were some trends that you noticed in the dataset that you had to address in cleaning?
  - What were some of the challenges you encountered? Where did those challenges originate (dataset, tool, skills, etc.)?
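For facilitators who take the second option above (adding issues to an already clean dataset), the preparation can be scripted so that every alteration is recorded. The following is only a minimal sketch under stated assumptions: the function name `add_issues`, the `city` and `population` columns, and the sample rows are all hypothetical, and the four issue types are just common examples, not a required set.

```python
import random


def add_issues(rows, text_column, blank_column, seed=0):
    """Return (messy_rows, change_log) built from a list of clean dict rows.

    All names here are illustrative; adapt the columns to your own dataset.
    """
    rng = random.Random(seed)            # fixed seed keeps the result reproducible
    messy = [dict(row) for row in rows]  # copy so the original rows stay untouched
    log = []

    # 1. Surround one text value with stray whitespace.
    i = rng.randrange(len(messy))
    messy[i][text_column] = "  " + messy[i][text_column] + " "
    log.append(f"row {i + 1}: added whitespace around {text_column}")

    # 2. Change the capitalization of one text value.
    j = rng.randrange(len(messy))
    messy[j][text_column] = messy[j][text_column].upper()
    log.append(f"row {j + 1}: upper-cased {text_column}")

    # 3. Blank out one value.
    k = rng.randrange(len(messy))
    messy[k][blank_column] = ""
    log.append(f"row {k + 1}: blanked {blank_column}")

    # 4. Append an exact duplicate of one row.
    d = rng.randrange(len(messy))
    messy.append(dict(messy[d]))
    log.append(f"row {d + 1}: duplicated as row {len(messy)}")

    return messy, log


# Hypothetical sample data, used only for illustration.
clean = [
    {"city": "Lima", "population": "100"},
    {"city": "Quito", "population": "200"},
    {"city": "Bogota", "population": "300"},
]
messy, log = add_issues(clean, text_column="city", blank_column="population")
for entry in log:
    print(entry)
```

The returned log doubles as the record of alterations made to the original dataset, and it can be compared with what attendees find during the discussion.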
Total recommended time
Materials
- Data cleaning software
- Simple, interesting dataset that needs to be cleaned
- Computers for each participant or group
- If virtual, breakout rooms for groups
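If a group uses Python as its data cleaning software instead of OpenRefine or a spreadsheet, cleaning while documenting could look roughly like the sketch below. It is an illustration under assumptions, not a prescribed method: the function name `clean_rows`, the column names, and the sample rows are hypothetical, and the three checks (tidying text, flagging blanks, dropping exact duplicates) are only examples of issues a dataset might contain.

```python
def clean_rows(rows, text_column):
    """Return (cleaned_rows, notes); notes documents each change or flag."""
    cleaned, notes, seen = [], [], set()
    for number, row in enumerate(rows, start=1):
        row = dict(row)  # work on a copy of each row
        value = row[text_column]

        # Collapse extra whitespace and normalize capitalization.
        # str.title() is a simplification and mishandles names like "McAllen".
        tidy = " ".join(value.split()).title()
        if tidy != value:
            notes.append(f"row {number}: {text_column} {value!r} -> {tidy!r}")
            row[text_column] = tidy

        # Flag blank cells rather than guessing a value.
        for column, cell in row.items():
            if cell == "":
                notes.append(f"row {number}: {column} is blank, left for review")

        # Drop rows that exactly repeat an earlier cleaned row.
        key = tuple(sorted(row.items()))
        if key in seen:
            notes.append(f"row {number}: dropped as exact duplicate")
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned, notes


# Hypothetical messy rows, used only for illustration.
messy = [
    {"city": "  lima ", "population": "100"},
    {"city": "Lima", "population": "100"},
    {"city": "QUITO", "population": ""},
]
cleaned, notes = clean_rows(messy, text_column="city")
for note in notes:
    print(note)
```

The `notes` list plays the role of the documentation attendees are asked to keep, and it gives each group something concrete to bring to the discussion questions.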