RAPIDS: GPU Data Science

Abstract: Python has seen terrific progress as the data science language of choice. With the introduction of Pandas, users could interact with data in Python in a way that feels intuitive. In addition, open-source packages such as Scikit-Learn have democratized and accelerated data science.

RAPIDS seeks to have a similar impact on the Python data science community by accelerating data science with GPUs. RAPIDS is an open-source suite of tools for GPU data science. Launched in October, RAPIDS includes cuDF, a library for reading data onto the GPU and interacting with it in a Pandas-like way; cuML, a library for machine learning that follows the Scikit-Learn API; and cuGraph, a graph analytics library similar to NetworkX. The RAPIDS libraries use Dask to scale to multiple GPUs across nodes and Numba to JIT-compile User Defined Functions (UDFs), allowing users to tackle large-scale data science problems while leveraging Python as a high-performance language running on the GPU. In addition, RAPIDS interoperates with numerous other libraries and deep learning frameworks to simplify end-to-end data science workflows.
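As a rough illustration of the Pandas-like interface described above, the following minimal sketch runs the same DataFrame code on cuDF when it is installed and on Pandas otherwise. The column names, values, and the helper `mean_fare_by_vendor` are hypothetical, and the sketch assumes only that cuDF mirrors the Pandas calls used here (`DataFrame`, `groupby`, `mean`, `sort_index`); it is not taken from the talk itself.

```python
# Assumption: cuDF mirrors the pandas calls used below. Fall back to pandas
# when cuDF is not installed, so the sketch also runs on a CPU-only machine.
try:
    import cudf as xdf  # GPU DataFrame library from RAPIDS
except ImportError:
    import pandas as xdf  # CPU fallback exposing the same calls


def mean_fare_by_vendor(frame_lib):
    """Build a small table and aggregate it with pandas-style calls."""
    df = frame_lib.DataFrame(
        {
            "vendor": ["a", "b", "a", "b"],  # hypothetical column names
            "fare": [10.0, 20.0, 30.0, 40.0],
        }
    )
    # groupby(...).mean() is available in both pandas and cuDF
    return df.groupby("vendor")["fare"].mean().sort_index()


result = mean_fare_by_vendor(xdf)
# A cuDF Series exposes to_pandas(); a pandas Series is already on the host.
host = result.to_pandas() if hasattr(result, "to_pandas") else result
means = host.to_dict()
```

Only the import line differs between the CPU and GPU paths, which is the kind of drop-in similarity the abstract refers to.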

This talk will walk through the libraries in RAPIDS and show examples of how simple they are to use in various machine learning and deep learning workflows. In addition, we will show how to install RAPIDS in various ways (container, pip, conda, and from source) in cloud or local environments. Finally, we will talk about the evolution of the libraries, new functionality coming soon, and the long-term direction of RAPIDS.

Bio: Joshua Patterson, Director of AI Infrastructure at NVIDIA, leads engineering for RAPIDS.AI and is a former White House Presidential Innovation Fellow. Prior to NVIDIA, Josh worked with leading experts across the public sector, the private sector, and academia to build a next-generation cyber defense platform. His current passions are graph analytics, machine learning, and large-scale system design. Josh also loves storytelling with data and creating interactive data visualizations. Josh holds a B.A. in economics from the University of North Carolina at Chapel Hill and an M.A. in economics from the University of South Carolina Moore School of Business.