Meeting Time: M/Th 2:10pm - 4:00pm; W 2:10pm - 6:00pm
Location: Dickinson Computer Science Lab (Dickinson 235)
Office Hours: M 1-2pm, W 11am-12pm, Dickinson 211
Course Web Site: http://cs.bennington.edu/courses/f2013/cs4125.01/
In this class, we will, as a group, build a working distributed system from scratch, such as a web search engine, distributed file system, or peer-to-peer network. By building such a system, students will learn key theoretical and practical fundamentals of distributed systems, such as concurrency, replication, commit models, fault-expectancy, self-organization and management, load-balancing, capacity planning, and physical and environmental considerations. These principles lie at the core of the designs of well-known systems such as those built by Google, Bing, Facebook, Yahoo, Twitter and others. The class will evolve from working through the design of the system, to developing it, planning its deployment, and releasing it into the wild.
It is assumed that you have had at least one semester of programming-intensive computer science prior to taking this class. There are no assumptions made about programming languages, design skills, operating system or networking concepts, etc.
This class will expose you to the following skills, technologies and languages:
This class will take on the format of an intensive workshop centered around exploring the theory and implementation of distributed systems. This will take the following form:
Discussions: we will discuss the readings assigned for a given class session for some amount of time. Please read, take notes, and be prepared to discuss, question and critique the readings.
Lectures: some material that we will cover is well-suited to a brief lecture to help elucidate the key points and design. Often, lecture and discussion will be paired together, with the lecture serving to ease us into discussion of advanced material.
Labs: nearly every week, we have a lab session scheduled. The labs are hands-on activities designed to help you implement and test various theories and concepts. We generally will use the Wednesday session for lab work, though one or two labs may spill over to the following Thursday.
Project sessions: as the class progresses, we will spend an increasing amount of time working on the central project: an implementation of the Google File System, a distributed storage system. These sessions will take various formats, and you will assume a variety of roles during them. By the end of class, you will have had the experience of working on a large, complex systems project.
Reading computer science research can be time-consuming. For the average 12-page research paper, plan to spend at least 3 hours reading, annotating, and re-reading the paper. The more you become accustomed to reading research papers, the easier (and quicker) this process becomes, but it is deceptively time-intensive. Plan accordingly.
The programming projects in this class are designed to be extremely enjoyable to work on, and should present an average level of conceptual difficulty. Plan to spend 4-8 hours on a programming assignment on average. Remember to start early, as programming under pressure rarely increases productivity. Debugging problems in distributed systems can also be quite tricky and time-consuming.
The group project for class will use a combination of in-class and outside-of-class time. We will shift from individual assignments to periodic group project tasks and milestones as the class progresses. This will challenge your time management and collaboration skills, and may require considerable time, particularly in the last third of the semester.
One of the very best ways of learning about distributed systems - as well as computer science, in general - is to build something. In this class, we will work together to build an implementation of a distributed storage system, the Google File System. Our implementation will be special in the sense that it is built over a cluster of Raspberry Pi single-board computers, with attached SATA hard disks. We may also add a few bells and whistles as we go.
We will begin by reading and discussing the Google File System paper. You will then work together to break the ideas down into a design. The design will then be validated and turned into individual tasks that you will delegate to each other for implementation. As a complete implementation grows near, you will also design a set of tests and measurements inspired by those in the GFS paper. You will perform those tests and measurements, and collect data on how your system performs (and make adjustments as needed).
The final output of your group project will be a poster presentation and demo at the final Science Workshop poster/demo day at the end of the semester (date: Friday TBD, 1-2pm; participation is required). You are not totally alone in this process - I will be there every step along the way to help make sure you can make adequate progress.
There is no textbook required for this class. You will be reading and discussing current research literature and excerpts from a variety of sources related to distributed systems. Readings will be provided either in hard-copy form, or via the course web site.
- You will attend every class. More than two absences (excused or unexcused) will jeopardize your standing in the course.
- You will check in all required assignments prior to the start of the class in which they are due.
- You will be a productive and positive collaborator with your colleagues on the group project.
- You will be an attentive and positive contributor to class discussion and activities.
- You will seek out help promptly if you are struggling or falling behind.
- You will submit your own ideas and work. Academic dishonesty will not be tolerated, and will be passed along without exception to college authorities.
- Class participation and attendance (40%).
- Assignments and exercises (30%).
- Group project & presentation (30%).
If you are struggling in class, or would like to investigate a topic in greater depth, come see me. My office hours are listed at the top of this syllabus. I enjoy and look forward to meeting with you. Some general guidance on making sure we are able to meet:
- I strongly prefer email (email@example.com). I am on it way too much, so you'll likely get a reply within 12 hours unless I am extremely busy.
- If you would like to meet with me, please consult my schedule (located at http://cs.bennington.edu/people/acencini) and propose a date and time that is not generally booked.
- If you plan to drop by during my office hours, it doesn't hurt to email in advance - I like to know if you are planning to show up, and can also let you know if there might be a wait.
- If you need to meet me outside of my office hours, 18 hours' notice is strongly suggested.
Subject to change. Readings and assignments will be disseminated in class.
Week      Date         Topic
Week 1    9/4/2013     Introduction, skills, hardware [Reading 1]
          9/5/2013     Distributed Systems - Design [Reading 2]
Week 2    9/9/2013     Naming & DNS [Reading 3] [Optional Reading A] [Optional Reading B]
          9/11/2013    RPC Lab [Reading 4]
          9/12/2013    Time & Synchronization [Reading 5]
Week 3    9/16/2013    Communication - TCP/UDP
          9/18/2013    Communication - TCP/UDP [GFS PAPER!]
          9/19/2013    Distributed Storage [ASSIGNMENT 1]
Week 4    9/23/2013    Threads, IPC, and Advanced Python
          9/25/2013    GitHub, Threads and IPC - Lab (Python)
          9/26/2013    Threads and IPC
Week 5    9/30/2013    Threads in C [Reading]
          10/2/2013    Thread Lab Conclusion (C) [ASSIGNMENT 2]
          10/3/2013    Lamport Clocks, GFS Discussion II
Week 6    10/7/2013    GFS Design Discussion III
          10/9/2013    Amazon Web Services (AWS) Lab [Assignment 2 Due]
          10/10/2013   Transactions and Consistency [Reading - Up to section 7.5]
Week 7    10/14/2013   2-Phase Commit
          10/16/2013   Transactions Lab
          10/17/2013   2PC Competition 1
Week 8    10/23/2013   P2P - Tor [Reading]
          10/24/2013   2PC Competition 2
Week 9    10/28/2013   System Monitoring & Management
          10/30/2013   P2P - Chord [Reading]
Week 10   11/4/2013    Fault Tolerance & Resiliency
          11/7/2013    P2P - Chord [Reading] [Reading] [ASSIGNMENT 3]
Week 11   11/11/2013   P2P - Bittorrent/BitTyrant
          11/13/2013   Storage / Project Lab (M1) [Reading]
          11/14/2013   Data Center Design [Reading] [Reading]
Week 12   11/18/2013   Data Center Design [Reading] [Reading] [Reading]
          11/20/2013   Hadoop Lab
          11/21/2013   Web Search [Reading]
Week 13   11/25/2013   Green Computing [Reading] [Reading]
Week 14   12/2/2013    Consensus [Reading] [Reading]
          12/4/2013    Green Computing [Reading]
          12/5/2013    Virtualization [Reading]
Week 15   12/9/2013    Storage / Project Lab
          12/11/2013   Storage / Project Lab
          12/12/2013   Storage / Project Lab