DISTRIBUTED SYSTEMS
SPRING 2020
Instructor: Andrew Cencini (acencini@bennington.edu)
Credits: 4
Meeting Time: M/Th 3:40pm - 5:30pm; W 2:10pm - 6:00pm (lab)
Location: CATLab (Dickinson 235)
Office Hours: Tu 3-4pm, Th 2-3pm - Dickinson 211
Course Web Site: http://cs.bennington.edu/courses/s2020/cs4280.01/

SUMMARY:
In this class, we will work as a group to build a working distributed system from scratch, such as a web search engine, distributed file system, blockchain/distributed ledger, or peer-to-peer network. By building such a system, students will learn key theoretical and practical fundamentals of distributed systems and software engineering, such as concurrency, replication, commit models, fault-expectancy, self-organization and management, load-balancing, capacity planning, network programming, containerization and microservices, and physical and environmental considerations. These principles lie at the core of well-known systems such as those built by Google, Bing, Facebook, Yahoo, Twitter and others. The class will evolve from reading and discussing research and working on foundational programming projects, to working through the design of the system, developing it, planning its deployment, and releasing it into the wild.

SKILLS:
It is assumed that you have had at least one semester of programming-intensive computer science prior to taking this class. There are no assumptions made about programming languages, design skills, operating system or networking concepts, etc.

This class will expose you to the following skills, technologies and languages:

  • C, Python, RPC, TCP/UDP & sockets, basic networking, Linux, HTML/HTTP, MapReduce, Java, Threads/pthreads, blockchain, Docker/microservices.

    FORMAT:
    This class will take the format of an intensive workshop centered around exploring the theory and implementation of distributed systems. The workshop has the following components:

    Discussions: we will discuss the readings and exercises assigned for a given class session for some amount of time. Please read, take notes, and be prepared to discuss, question and critique the readings.

    Lectures: some material that we will cover is well-suited to a brief lecture to help elucidate the key points and design. Often, lecture and discussion will be paired together, with the lecture serving to ease us into discussion of advanced material.

    Labs: nearly every week, we have a lab session scheduled. The labs are hands-on activities designed to help you implement and test various theories and concepts. We generally will use the Wednesday session for lab work, though one or two labs may spill over to the following regular class session.

    Project sessions: as the class progresses, we will spend an increasing amount of time working on the central group project. These sessions will take various formats, and you will assume a variety of roles during these sessions - by the end of class, you will have had the experience of working on a large, complex distributed systems project.

    WORKLOAD:
    Reading computer science research can be time consuming. For the average 12-page research paper, plan to spend at least 3 hours reading, annotating, and re-reading the paper. The more you become accustomed to reading research papers, the easier (and quicker) this process becomes, but it is deceptively time-intensive. Plan accordingly.

    The programming projects in this class are designed to be extremely enjoyable to work on, and should present an average level of conceptual difficulty. Plan to spend 4-8 hours on a programming assignment on average. Remember to start early, as programming under pressure rarely increases productivity. Debugging problems in distributed systems can also be quite tricky and time-consuming, particularly in comparison to debugging a standalone program.

    The group project for class will use a combination of in-class and outside-of-class time. We will shift from individual assignments to periodic group project tasks and milestones as the class progresses. This will challenge your time management and collaboration skills, and may require considerable time, particularly in the last third of the semester.

    GROUP PROJECT:
    One of the very best ways of learning about distributed systems - as well as computer science, in general - is to build something. In this class, we will work together to build an implementation of a distributed system (to be agreed upon by the group). We may also add a few bells and whistles as we go depending on the skills and interests of members of the class. You are not totally alone in this process - I will be there every step along the way to help make sure you can make adequate progress.

    TEXTBOOK:
    There is no textbook required for this class. You will be reading and discussing current research literature and excerpts from a variety of sources related to distributed systems. Readings will be provided either in hard-copy form, or via the course web site.

    REQUIREMENTS AND ACADEMIC INTEGRITY:

    • You will attend every class. More than two absences (excused or unexcused) will jeopardize your standing in the course.
    • You will check in all required assignments prior to the start of the class in which they are due.
    • You will be a productive and positive collaborator with your colleagues on the group project.
    • You will be an attentive and positive contributor to class discussion and activities.
    • Your participation and presence in class and other activities will foster a safe and welcoming environment for all others in the class.
    • You will seek out help promptly if you are struggling or falling behind.
    • You will submit your own ideas and work. Academic dishonesty will not be tolerated, and will be passed along without exception to the appropriate administrative or judicial entity.

    EVALUATION:

    • Class participation and attendance (30%).
    • Assignments and exercises (30%).
    • Group project & presentation (40%).
    There will be approximately 4 moderate programming assignments to be completed individually, in addition to the larger group project. I will post a signup sheet on my office door shortly after all submissions have been received, and will meet with you in person to review your code and solutions. If you are unable to meet in person, or prefer written feedback, I will provide a marked-up copy of your code with comments and feedback.

    Additional, smaller exercises and labs will also be assigned throughout the term - these will count towards your participation and attendance evaluation.

    GETTING HELP:
    If you are struggling in class, or would like to investigate a topic in greater depth, come see me. My office hours are listed at the top of this syllabus. I truly enjoy and look forward to meeting with you. Some general guidance on making sure we are able to meet:

    • I strongly prefer email (acencini@bennington.edu). Please allow 24 hours for a response, perhaps a bit longer on weekends - though in all cases, I will do my best to get back to you as soon as possible.
    • If you would like to meet with me, please consult my schedule (located at my page) and propose a date and time that is not already booked.
    • I hang up a signup sheet each week outside of my office for office hours. Walk-ins may be possible but are not at all guaranteed. The sheet for the following week usually goes up on Friday after lunch.
    • If you need to meet me outside of my office hours, making an appointment 24 hours or more ahead of time is strongly suggested.
    Additionally, I have a large selection of hardware, software and print materials that may be of interest for coursework or independent projects. Feel free to stop by and inquire about what is available and what may be borrowed or used!

    SCHEDULE:
    Subject to change. Readings and assignments will be disseminated in class.

    Week	Date		Topic
    Week 1	2/19/2020	Introduction [Reading 0] [Reading 1]
    	2/20/2020	Naming & DNS, General Theories [Reading 2] [Reading 3] [Reading 4]
    Week 2	2/24/2020	Microservices & Containerization (Docker) [Reading 5] [Reading 6] [Reading 7]
    	2/26/2020	Microservices & Communication (REST) - Docker Lab [Lab 0] [Dockerfile] [flaskapp.py]
    	2/27/2020	Microservices & Communication (RPC) [Reading 8] [Assignment 1] [Optional Lab - RPC/C] [add.x]
    Week 3	3/2/2020	Time and Synchronization - NTP, Lamport Clocks [Reading 9]
    	3/4/2020	Network Communication - TCP/UDP Lab [Lab 1a] [tcp_client.py] [tcp_server.py] [Lab 1b] [udp_client.py] [udp_server.py] [Lab 1c] [udp_multi_client.py] [udp_multi_server.py]
    	3/5/2020	Assignment 1 Work Session (Andrew Sick)
    Week 4	3/9/2020	Concurrency/Synchronization Lab I - Threads [Lab 2] [Assignment 1 soft due date]
    	3/11/2020	Concurrency, Time and Synchronization - Lamport Clocks [Reading 10] [FUN READING/TIMELY RECAP] [Lab 3] [Assignment 2]
    	3/12/2020	Consistency / Fault Tolerance/Expectance [Reading 11] 
    Week 5	3/16/2020	Concurrency/Synchronization Lab II - Locking/Synchronization [Reading 12 (through p.360 only)] [Reading 12a (optional)]
    	3/18/2020	LONG WEEKEND
    	3/19/2020	LONG WEEKEND
    Week 6	3/23/2020	PREPARATION WEEK
    	3/25/2020	PREPARATION WEEK
    	3/26/2020	PREPARATION WEEK
    Week 7	3/30/2020*	Reboot: Catch up on Readings/Labs thru 3/16
    	4/1/2020*	Reboot: Catch up on Readings/Labs thru 3/16
    	4/2/2020*	Internet-Scale Systems (Hamilton) finish discussion
    Week 8	4/6/2020*	Discuss Two-Phase Commit [Assignment 3]
    	4/8/2020*	Consensus - Paxos [Reading 13] [Reading 13a] [Two-Phase Commit Rodeo?]
    	4/9/2020*	Distributed Programming - MapReduce [Reading 14] [Reading 14a (optional)] [Assignment 2 soft-due]
    Week 9	4/13/2020*	Storage - NFS/GFS [Reading 15] [Reading 15a (optional)] [Reading 16]
    	4/15/2020*	P2P Systems - Bittorrent, Tor, Chord [Reading 17] [Reading 18] [Reading 19]
    	4/16/2020*	Distributed Systems in the Wild - Google, AutoPilot [Reading 20] [Reading 21] 
    Week 10	4/20/2020*	Trendy Topics - BitCoin/BlockChain [Reading 22] [Reading 22a (optional)]
    	4/22/2020	PLAN DAY / FACULTY MEETING - NO CLASS
    	4/23/2020*	Trendy Topics - Serverless Computing
    Week 11	4/27/2020*	Trendy Topics - Edge Computing, Distributed Scheduling
    	4/29/2020*	Project Kick-Off & Work Session
    	4/30/2020*	Project Work Session
    Week 12	5/4/2020*	Project Work Session (Readings 17/18 for today)
    	5/6/2020*	Project Work Session
    	5/7/2020*	Project Work Session
    Week 13	5/11/2020*	Project Work Session (Reading 19 for today)
    	5/13/2020*	Project Work Session
    	5/14/2020*	Project Work Session
    Week 14	5/18/2020*	Project Work Session (Readings 20/21 for today)
    	5/20/2020*	Project Work Session
    	5/21/2020*	Project Work Session
    Week 15	5/25/2020*	Project Work Session (Readings TBA for today - topics: BlockChain, Serverless, maybe Edge)
    	5/27/2020*	Project Wrap-Up / Tearful Goodbye
    
    * Online class due to global pandemic.