Engineering Distributed Infrastructure 1000-2M20IRIO
The course consists of lectures, exercises and a project.
Lectures and exercises will cover the following topics:
- Compute as a service: an introduction to the architecture of cloud computing
- Managing distributed resources at scale: scheduling and autoscaling
- Communication within and outside the cloud: methods and APIs
- Designing for scalability
- Designing and managing systems for reliability and maintainability
- Monitoring
- Testing
- Managing distributed data
The project consists of designing, building and deploying a distributed application on a public cloud. We will suggest a few ideas for applications. The project is done in teams of 3 students. Each team will be assigned a Googler tutor who will help scope the project, review design documents and grade the final solution.
Course coordinators
Type of course
Prerequisites (description)
Learning outcomes
Knowledge:
Students understand the issues of large-scale distributed computing systems.
Skills/Expertise:
Students can design and develop a complex, low-level component of a distributed computing infrastructure.
Assessment criteria
- Active participation in the exercises
- Project (in a 3-person team)
- Final exam
To pass, you need to participate actively in the exercises, complete the project and pass the final exam.
Bibliography
Software Engineering at Google, Titus Winters, Tom Manshreck, Hyrum Wright, 2020, O'Reilly Media.
Site Reliability Engineering, Betsy Beyer, Chris Jones, Jennifer Petoff, Niall Richard Murphy (Eds.), 2016, O'Reilly Media.
Additional bibliography (papers, technical documentation) will be given during the lectures.