Overview
Course Description
This course provides a broad introduction to the techniques used to analyze Big Data.
It also presents a variety of industry-leading tools for handling and working with Big Data.
Most of the tech stack adopted throughout the course is based on the Apache Spark framework.
Teachers
- Andrea Tagarelli
- Antonio Caliò (Teaching Assistant)
Roadmap
- Introduction to Scala and its Build System
  - Scala Crash Course
  - Overview of the main Scala build system: sbt
- The Spark ecosystem
  - Introduction to the Spark ecosystem
  - Example projects in Scala
- Useful Big Data Libraries
  - Spark SQL
  - Spark Streaming
  - Spark MLlib
  - Spark GraphX
- Integration with other systems and deployment
  - Apache Kafka
  - Deployment on a real cluster