Date: 20 September 2023 @ 16:00 - 17:00

Timezone: UTC

Language of instruction: English

Topic: "Data Wrangling with Tidyverse"

Speaker: Tyson Whitehead, SHARCNET

--- 

Tidyverse is a cohesive set of packages for doing data science in R. We have demonstrated its graphics portion (ggplot2) in prior talks. In this one we will demonstrate the data-munging portions (dplyr, forcats, tibble, readr, stringr, tidyr, and purrr) by restoring the data hierarchy implicit in the layout of a 500-page reference PDF file, given only the words on each page and their bounding boxes.

The Compute Ontario Colloquia are weekly Zoom presentations on Advanced Research Computing, High Performance Computing, Research Data Management, and Research Software topics, delivered by staff from three Compute Ontario consortia (CAC, SciNet, SHARCNET) and guest speakers. The series began in January 2023 and superseded similar series previously delivered by individual consortia (e.g. General Interest Seminars by SHARCNET or User Group Meeting TechTalks by SciNet). The colloquia are one hour long and include time for questions. No registration is required. Presentations are usually recorded and uploaded to the hosting consortium's video channel (colloquia hosted by SHARCNET go to our YouTube channel).

