Introduction to Building a Distributed Neural Network on Apache Spark with BigDL and Analytics Zoo

Abstract: In this training session you will get hands-on experience developing neural networks with Intel BigDL and Analytics Zoo on Apache Spark. You will learn how to use Spark DataFrames and build deep learning pipelines by implementing practical examples.
Target Audience: AI developers and aspiring data scientists who are experienced in Python and Spark, as well as big data and analytics professionals interested in neural networks.

Training outline:

1. Introduction to Deep Learning on Spark, BigDL and Analytics Zoo
We will begin with a brief introduction to Apache Spark and the Machine Learning/Deep Learning ecosystem around Spark. Then we will introduce Intel BigDL and Analytics Zoo, two deep learning libraries for Apache Spark. We will go into the architectural details of how distributed training happens in BigDL. We will cover the model training process, including how the model, weights and gradients are distributed, calculated, updated and shared with Apache Spark.

2. Setting Up a Sample Environment
3. Quick and simple image recognition use case with BigDL
4. Transfer Learning for Image Classification Models
5. Anomaly Detection or Recommendation system with Intel Analytics Zoo
6. Model Serving
7. How to build an end-to-end pipeline and put a model into production
8. Discussion of practical experience using Spark and Hadoop for machine learning and deep learning projects
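
As a preview of the distributed training discussion in item 1, the sketch below simulates synchronous data-parallel training in plain Python: every partition computes a gradient from the same weights, the gradients are averaged, and one update produces the weights shared in the next iteration. It is a toy illustration under stated assumptions, not BigDL's API or its actual implementation. All function names (`local_gradient`, `average_gradients`, `train`), the linear model, and the synthetic data are hypothetical, and no Spark cluster is involved.

```python
# Toy simulation of synchronous data-parallel training (no Spark or BigDL required).
# All names here are hypothetical and exist only for illustration.

def local_gradient(weights, partition):
    """Gradient of the mean squared error for y ~ w*x + b on one data partition."""
    w, b = weights
    n = len(partition)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in partition) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in partition) / n
    return grad_w, grad_b


def average_gradients(gradients):
    """Combine per-partition gradients into one global gradient."""
    count = len(gradients)
    return tuple(sum(g[i] for g in gradients) / count for i in range(2))


def train(partitions, learning_rate=0.05, iterations=2000):
    weights = (0.0, 0.0)  # every partition sees the same weights in each iteration
    for _ in range(iterations):
        # 1. Distribute: each partition computes a gradient with the shared weights.
        gradients = [local_gradient(weights, p) for p in partitions]
        # 2. Aggregate: the per-partition gradients are averaged.
        grad_w, grad_b = average_gradients(gradients)
        # 3. Update and share: one update yields the weights for the next iteration.
        weights = (weights[0] - learning_rate * grad_w,
                   weights[1] - learning_rate * grad_b)
    return weights


# Synthetic data for y = 3x + 1, split into two equally sized partitions.
data = [(x / 10.0, 3 * (x / 10.0) + 1) for x in range(20)]
partitions = [data[:10], data[10:]]
w, b = train(partitions)
print(round(w, 2), round(b, 2))
```

Because the two partitions are the same size, the averaged gradient equals the gradient over the full data set, so the learned values should land close to w = 3 and b = 1. In a real cluster the list comprehension in step 1 would be replaced by tasks running in parallel on Spark executors.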

Bio: Bala Chandrasekaran is a Technical Staff Engineer at Dell Technologies, where he is responsible for building machine learning and deep learning infrastructure solutions. He has over 15 years of experience in high-performance computing, virtualization infrastructure, cloud computing and big data.