
Note: all times are shown in the timezone in which each event occurs.

Date: 15 July 2026 @ 10:00 - 11:00

Timezone: Pacific Daylight Time

Language of instruction: English

Workshop: Programmatic Data De-identification with R

This practical workshop, delivered by the UBC Library Research Data Management team, introduces programmatic approaches to de-identifying sensitive research data in R. Through hands-on exercises using a realistic survey dataset, participants will apply a structured workflow, from assessing privacy risks to exporting a shareable, de-identified dataset.

Participants will learn how to:

  • Identify privacy risks in research data, including direct identifiers, dates, geographic variables, and free-text fields.
  • Apply de-identification methods in R using dplyr, including removal, generalization, suppression, anonymization, and pseudonymization.
  • Run quality assurance checks to confirm a dataset is sufficiently de-identified before sharing.
  • Export a de-identified dataset and a data key file, and understand best practices for securely storing each.
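
As a rough illustration of the workflow listed above, and not the workshop's actual material, the following R sketch applies pseudonymization, generalization, removal, and suppression with dplyr, runs a simple quality check, and exports the two files. The dataset, column names, pseudonym format, minimum group size (k_min), and output file names are all assumptions made up for this example.

```r
library(dplyr)

# Hypothetical survey data: every column name and value here is invented.
survey <- tibble(
  respondent_id = c("A17", "B42", "C08", "D93", "E55", "F21"),
  name          = c("Ana", "Ben", "Chi", "Dev", "Eli", "Fay"),
  birth_date    = as.Date(c("1990-03-14", "1991-07-02", "1990-11-30",
                            "1991-01-19", "1955-06-08", "1990-09-25")),
  postal_code   = c("V6T 1Z4", "V6T 2B5", "V6T 1Z1",
                    "V6T 4K9", "V5K 0A1", "V6T 1Z3"),
  comments      = c("none", "call me at home", "none", "n/a", "none", "none")
)

set.seed(2026)  # fixed seed only so the sketch is reproducible
k_min <- 2      # assumed minimum group size; choose per your own risk assessment

# Pseudonymization: map each original ID to a randomly assigned code.
# This table is the data key and must be stored apart from the shared data.
data_key <- survey %>%
  mutate(pseudo_id = sprintf("P%03d", sample(n()))) %>%
  select(respondent_id, pseudo_id)

deidentified <- survey %>%
  left_join(data_key, by = "respondent_id") %>%
  mutate(
    birth_year = as.integer(format(birth_date, "%Y")),  # generalization: date to year
    region     = substr(postal_code, 1, 3)              # generalization: first 3 characters
  ) %>%
  # Removal: keep only the pseudonym and generalized columns, dropping the
  # direct identifiers and the free-text field.
  select(pseudo_id, birth_year, region) %>%
  # Suppression: blank out combinations shared by fewer than k_min respondents.
  add_count(birth_year, region, name = "group_size") %>%
  mutate(
    birth_year = if_else(group_size < k_min, NA_integer_, birth_year),
    region     = if_else(group_size < k_min, NA_character_, region)
  ) %>%
  select(-group_size)

# Quality assurance: no original identifier columns remain, and every
# unsuppressed combination of values meets the minimum group size.
stopifnot(!any(c("respondent_id", "name", "birth_date",
                 "postal_code", "comments") %in% names(deidentified)))
qa <- deidentified %>%
  filter(!is.na(region)) %>%
  count(birth_year, region)
stopifnot(all(qa$n >= k_min))

# Export the shareable dataset and the data key as separate files.
out_dir <- tempdir()
write.csv(deidentified, file.path(out_dir, "survey_deidentified.csv"), row.names = FALSE)
write.csv(data_key, file.path(out_dir, "survey_data_key.csv"), row.names = FALSE)
```

In this sketch the shareable file and the data key are written to the same temporary folder purely for convenience; in practice the key would be kept in a separate, access-controlled location, since anyone holding both files can re-identify respondents.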

To participate fully, you will need to install the latest versions of R and RStudio on your computer before the workshop.

Note: This workshop provides a practical introduction to programmatic data de-identification. Participants are encouraged to consult their institutional privacy, legal, or compliance experts for guidance on specific datasets.

 

Location: ONLINE

(A Zoom link will be sent to registrants 3 hours before the event starts.)

Contact: https://libcal.library.ubc.ca/profile/32798

Keywords: Data, Digital Scholarship, Research Commons, Research Data Management

Organizer: Eugene Barsky

