Overview

Course Description

This course provides a broad introduction to the techniques used to analyze Big-Data.

The course will also present a variety of industry-leading tools for handling and working with Big-Data.

Most of the tech stack adopted throughout the course is based on the Apache Spark framework.

Teachers

  • Andrea Tagarelli
  • Antonio Caliò (Teaching Assistant)

Roadmap

  1. Introduction to Scala and its Build System

    • Scala Crash Course
    • Overview of the main Scala build system: sbt
  2. The Spark ecosystem

    • Introduction to the Spark ecosystem
    • Example projects in Scala
  3. Useful Big-Data Libraries

    • Spark-SQL
    • Spark-Streaming
    • Spark-MLlib
    • Spark-GraphX
  4. Integration with other systems and deployment

    • Apache Kafka
    • Deployment on a real cluster