This video shows how to perform interactive IIS log analysis and visualization using Python (PySpark), a Jupyter notebook, and a custom Python library on an Azure HDInsight Linux Spark cluster.

Video Table of Contents:

[00:41] Goal

[01:22] Agenda

[02:00] Prerequisites

[02:44] What is Spark

[03:43] Azure HDInsight Spark (Linux)

[04:53] Management Dashboard Snapshot

[05:49] What is Jupyter Notebook

[07:26] What is RDD

[08:13] RDD Operations

[08:50] Python Spark API (PySpark) and Libraries

[10:35] Code Walk-through Overview

[12:06] Demo

[23:15] Sample Code GitHub Repository

[23:40] Useful Resources and References
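
For readers who want a feel for the parsing step before watching the demo, here is a minimal sketch in plain Python. It is not the code from the video or its GitHub repository. The field layout in `FIELDS`, the helper names `parse_line` and `status_counts`, and the sample log lines are all hypothetical and made up for illustration; real IIS logs declare their own columns in a `#Fields:` header line.

```python
from collections import Counter

# Hypothetical column layout, assumed for this sketch only.
# A real IIS W3C log lists its actual columns in a "#Fields:" header line.
FIELDS = ["date", "time", "s-ip", "cs-method", "cs-uri-stem", "sc-status"]


def parse_line(line):
    """Return a dict for a data line, or None for comment, blank, or malformed lines."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    parts = line.split(" ")
    if len(parts) != len(FIELDS):
        return None
    return dict(zip(FIELDS, parts))


def status_counts(lines):
    """Count requests per HTTP status code across an iterable of raw log lines."""
    records = (parse_line(raw) for raw in lines)
    return Counter(rec["sc-status"] for rec in records if rec is not None)


# Made-up sample lines, not taken from any real log.
sample = [
    "#Fields: date time s-ip cs-method cs-uri-stem sc-status",
    "2016-01-01 00:00:01 10.0.0.1 GET /index.html 200",
    "2016-01-01 00:00:02 10.0.0.1 GET /missing 404",
    "2016-01-01 00:00:03 10.0.0.1 POST /api 200",
]

print(status_counts(sample))
```

On a Spark cluster, a function like `parse_line` could in principle be passed to `RDD.map`, with `RDD.filter` dropping the `None` results, so the same logic runs in parallel across log files. The video's own walk-through may structure this differently.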