Course Syllabus for

Programming Models and Practice for Big Data
Programmeringsmodeller och metoder för att hantera stora datamängder

EDA025F, 7.5 credits

Valid from: Autumn 2015
Decided by: FN1/Anders Gustafsson
Date of establishment: 2015-09-08

General Information

Division: Computer Science (LTH)
Course type: Third-cycle course
Teaching language: English


This course teaches doctoral students how to analyze and design programs for big data. It covers big data architectures, languages, and ecosystems, with a focus on Spark. The techniques presented in the course are expected to have a high impact in a variety of fields, such as data analysis, customer recommendation, trend prediction, and pattern recognition.


Knowledge and Understanding

For a passing grade the doctoral student must

Competences and Skills

For a passing grade the doctoral student must show the ability to operate big data architectures and to design and write programs using Spark.

Judgement and Approach

For a passing grade the doctoral student must show the ability to select and assess architectures and algorithms for big data problems.

Course Contents

The course consists of four full-day sessions that will address:
1/ Cloud architectures, Spark concepts, and Spark programming.
2/ Intermediate and advanced Spark.
3/ Supervised machine learning with Spark: MLlib and MLlib programming.
4/ Unsupervised machine learning.

Course Literature

Instruction Details

Types of instruction: Lectures, laboratory exercises, exercises, project

Examination Details

Examination format: Written assignments. The assessment consists of programs and reports to be handed in.
Grading scale: Failed, pass

Admission Details

Assumed prior knowledge: Good programming skills in Java, Scala, or Python. Knowledge of statistics

Course Occasion Information

Contact and Other Information

Course coordinator: Pierre Nugues
