Date: 22 February 2023 @ 17:00 - 18:00

Timezone: UTC

Language of instruction: English

Topic: "Accelerated DataFrame with Dask-cuDF on multiple GPUs"

Speaker: Jinhui Qin, SHARCNET

Video link

Recording

--- 

cuDF is a GPU DataFrame library in Python. It provides a Pandas-like API with accelerated performance for DataFrame operations on a single GPU. However, the size of the datasets it can handle is limited by the memory available on that single GPU. Since Dask provides a framework for scalable computing, Dask-cuDF integrates cuDF with Dask so that a large DataFrame workload can be scaled across multiple GPUs. This webinar introduces Dask-cuDF with demo examples on a multi-GPU node on the national clusters.
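
To give a rough feel for the partitioning idea behind this, here is a minimal conceptual sketch in plain Python. It does not call cuDF or Dask at all; the function names (`split_into_partitions`, `partial_sums`, `combine`) and the sample rows are hypothetical and exist only for illustration. In Dask-cuDF each partition would be a cuDF DataFrame held on a GPU, whereas here each partition is just a list of `(key, value)` tuples. The sketch computes a grouped mean by reducing every partition independently and then merging the small partial results.

```python
from collections import defaultdict


def split_into_partitions(rows, n_partitions):
    """Split a list of rows into at most n_partitions contiguous chunks."""
    size = max(1, -(-len(rows) // n_partitions))  # ceiling division, never 0
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def partial_sums(partition):
    """Per-partition step: running [sum, count] for each key.

    Each call only touches its own partition, so in a real multi-GPU
    setting these calls could run on different devices.
    """
    acc = defaultdict(lambda: [0.0, 0])
    for key, value in partition:
        acc[key][0] += value
        acc[key][1] += 1
    return acc


def combine(partials):
    """Merge the per-partition [sum, count] pairs and return mean per key."""
    total = defaultdict(lambda: [0.0, 0])
    for part in partials:
        for key, (s, c) in part.items():
            total[key][0] += s
            total[key][1] += c
    return {key: s / c for key, (s, c) in total.items()}


# Hypothetical sample data: (key, value) rows standing in for a DataFrame.
rows = [("a", 1.0), ("b", 2.0), ("a", 3.0), ("b", 4.0), ("a", 5.0), ("c", 6.0)]

partitions = split_into_partitions(rows, n_partitions=2)
means = combine(partial_sums(p) for p in partitions)
print(means)
```

The point of the pattern is that only the small per-key summaries need to be brought together, not the full data, so no single device has to hold the whole dataset.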

The Compute Ontario Colloquia are weekly Zoom presentations on Advanced Research Computing, High Performance Computing, Research Data Management, and Research Software topics, delivered by staff from three Compute Ontario consortia (CAC, SciNet, SHARCNET) and guest speakers. The series began in January 2023 and superseded similar series previously delivered by individual consortia (e.g. General Interest Seminars by SHARCNET or User Group Meeting TechTalks by SciNet). The colloquia are one hour long and include time for questions. No registration is required. Presentations are usually recorded and uploaded to the hosting consortium's video channel (colloquia hosted by SHARCNET go to our YouTube channel).

Keywords: GPU, HPC, Python, Programming, Statistics, Data Analysis

