Instructor
- who: Michael Swift
- where: Room 7369
- when: Wed. 2:30-3:30, Thu. 1:30-2:30
- email: swift 'at' cs.wisc.edu
Lecture:
- when: Tues./Thur. 11-12:15
- where: Computer Sciences 1263
- list: compsci739-1-s11 'at' lists.wisc.edu
Note
Many of these files are under copyright so they cannot be distributed to the whole internet. As a result, access is limited to hosts on the wisc.edu network. If you want to access these files from another network, such as from home, you have two options:
- Use Google to search for an accessible copy of the file.
- Use WiscVPN to connect to the campus network.
Introduction
- Distributed Systems Background
- Sample System
Scalability
- Request distribution
- Locality-Aware Request Distribution, Vivek Pai, Gaurav Banga, ASPLOS-VIII
- Background: The Power of Two Choices in Randomized Load Balancing. Michael Mitzenmacher. IEEE Trans. on Parallel and Distributed Systems, October 2001.
- Background: How Useful is Old Information? Michael Mitzenmacher. IEEE Trans. on Parallel and Distributed Systems. January 2000.
- Background: Interpreting Stale Load Information. Michael Dahlin. IEEE Trans. on Parallel and Distributed Systems. October 2000.
- Karger, D.; Sherman, A.; Berkheimer, A.; Bogstad, B.; Dhanidina, R.; Iwamoto, K.; Kim, B.; Matkins, L.; Yerushalmi, Y. (1999). "Web caching with consistent hashing". Computer Networks 31 (11): 1203–1213.
- Background: Consistent Hashing and Random Trees: Distributed Caching Protocols for Relieving Hot Spots on the World Wide Web. David Karger, Eric Lehman, Tom Leighton, Matthew Levine, Daniel Lewin, Rina Panigrahy. STOC 1997.
- Background: Algorithms in the Real World: Consistent Hashing. Bruce Maggs
- Large-scale services
Distributed Operating Systems
- Designs
- Processor Pools
- D. Thain, T. Tannenbaum and M. Livny, "Distributed Computing in Practice: The Condor Experience", Concurrency and Computation: Practice and Experience 17, 2-4, February-April 2005, pp. 323-356
- Background: Douglas Thain and Miron Livny, Building Reliable Clients and Servers, in Ian Foster and Carl Kesselman, editors, The Grid: Blueprint for a New Computing Infrastructure, Morgan Kaufmann, 2003, 2nd edition. ISBN: 1-55860-933-4
- Background: Michael Litzkow, Todd Tannenbaum, Jim Basney and Miron Livny. Checkpoint & Migration of UNIX Processes in the Condor Distributed Processing System, University of Wisconsin-Madison Computer Sciences Technical Report #1346, 1997
Consistency
- Time and Ordering
- L. Lamport, Time, Clocks, and the Ordering of Events in a Distributed System, Communications of the ACM, July 1978, pages 558-564.
- Background: Network Time Protocol
- Background: Detection of Mutual Inconsistency in Distributed Systems. D. S. Parker, G. J. Popek, G. Rudisin, A. Stoughton, B. J. Walker, E. Walton, J. M. Chow, D. Edwards, S. Kiser, C. Kline. IEEE Transactions on Software Engineering, Volume 9, Issue 3, May 1983.
- K. M. Chandy and L. Lamport. Distributed snapshots: determining global states of distributed systems. ACM Trans. Comput. Syst., 3(1):63-75, 1985.
- Byzantine Failures
- L. Lamport, R. Shostak, and M. Pease, The Byzantine Generals Problem, ACM Transactions on Programming Languages and Systems, July 1982, pages 382-401.
- Practical Byzantine Fault Tolerance. Miguel Castro and Barbara Liskov. OSDI '99.
- Yoram Moses, Danny Dolev, Joseph Y. Halpern. Cheating husbands and other stories (preliminary version): a case study of knowledge, action, and communication, Proceedings of the fourth annual ACM symposium on Principles of distributed computing, 1985
- Process Groups
Replication
- Demers et al., Epidemic algorithms for replicated database maintenance, PODC 1987.
- Dynamo: Amazon's Highly Available Key-value Store. Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swami Sivasubramanian, Peter Vosshall and Werner Vogels. Proceedings of the 21st ACM Symposium on Operating Systems Principles, Stevenson, WA, October 2007.
Agreement
- The Part-Time Parliament. Leslie Lamport; ACM Transactions on Computer Systems, Vol. 16, No. 2, May 1998
- Leslie Lamport. Paxos Made Simple. ACM SIGACT News (Distributed Computing Column) 32, 4 (Whole Number 121, December 2001) 51-58.
- Tushar Chandra, Robert Griesemer, and Joshua Redstone. Paxos Made Live – An Engineering Perspective. PODC '07: 26th ACM Symposium on Principles of Distributed Computing, 2007.
Storage
- J. J. Kistler and M. Satyanarayanan, Disconnected Operation in the Coda File System, Proceedings of the Thirteenth ACM Symposium on Operating Systems Principles, October 13-16, 1991, pages 213-225.
- E. K. Lee and C. A. Thekkath. Petal: Distributed virtual disks. In Proc. 7th Int. Conf. Architectural Support for Programming Languages and Operating Systems (ASPLOS), pages 84-92, October 1996.
- Chandramohan Thekkath, Timothy Mann, and Edward Lee. Frangipani: A Scalable Distributed File System. Proc. of the 16th ACM Symposium on Operating Systems Principles, October 1997, pages 224-237.
Advanced Topics
- C. Amza, A.L. Cox, S. Dwarkadas, P. Keleher, H. Lu, R. Rajamony, W. Yu, and W. Zwaenepoel, TreadMarks: Shared Memory Computing on Networks of Workstations. IEEE Computer, Vol. 29, No. 2, pp. 18-28, February 1996.
- I. Stoica, R. Morris, D. Karger, M. F. Kaashoek, and H. Balakrishnan. Chord: A scalable peer-to-peer lookup service for internet applications. In SIGCOMM '01: Proceedings of the 2001 conference on Applications, technologies, architectures, and protocols for computer communications, 2001. ACM.
Cloud Computing
- Brian Hayes. Cloud Computing. Communications of the ACM, Volume 51, Issue 7 (July 2008). Pages 9-11.
- Michael Armbrust, Armando Fox, Rean Griffith, Anthony D. Joseph, Randy H. Katz, Andrew Konwinski, Gunho Lee, David A. Patterson, Ariel Rabkin, Ion Stoica and Matei Zaharia. Above the Clouds: A Berkeley View of Cloud Computing. EECS Department, University of California, Berkeley. Technical Report No. UCB/EECS-2009-28, February 10, 2009.
- Luiz André Barroso and Urs Hölzle. The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines. Synthesis Lectures on Computer Architecture, Morgan & Claypool Publishers, 2009.
Data Manipulation Models
- MapReduce: Simplified Data Processing on Large Clusters. Jeffrey Dean and Sanjay Ghemawat. OSDI '04.
- MapReduce and parallel DBMSs: friends or foes? Michael Stonebraker, Daniel Abadi, David J. DeWitt, Sam Madden, Erik Paulson, Andrew Pavlo, Alexander Rasin. Communications of the ACM, Volume 53, Issue 1 (January 2010). Pages 64-71.
- MapReduce: a flexible data processing tool. Jeffrey Dean and Sanjay Ghemawat. Communications of the ACM, Volume 53, Issue 1 (January 2010). Pages 72-77.
- DryadLINQ: A System for General-Purpose Distributed Data-Parallel Computing Using a High-Level Language. Yuan Yu, Michael Isard, Dennis Fetterly, Mihai Budiu, Ulfar Erlingsson, Pradeep Kumar Gunda, and Jon Currey. Symposium on Operating System Design and Implementation (OSDI), San Diego, CA, December 8-10, 2008.
Cloud Infrastructure
Cloud Scheduling
- Quincy: Fair Scheduling for Distributed Computing Clusters. Michael Isard, Vijayan Prabhakaran, Jon Currey, Udi Wieder, Kunal Talwar, and Andrew Goldberg, October 2009.
- Resource overbooking and application profiling in shared hosting platforms. Bhuvan Urgaonkar, Prashant Shenoy and Timothy Roscoe. OSDI 2002.
- Managing energy and server resources in hosting centers. Jeffrey S. Chase, Darrell C. Anderson, Prachi N. Thakar, Amin M. Vahdat and Ronald P. Doyle. SOSP 2001.
- Making Scheduling "Cool": Temperature-Aware Workload Placement in Data Centers, by J. Moore, J. Chase, P. Ranganathan. In the 2005 USENIX Annual Technical Conference (USENIX '05), April 2005.
- Delay Scheduling: A Simple Technique for Achieving Locality and Fairness in Cluster Scheduling. Matei Zaharia, Dhruba Borthakur, Joydeep Sen Sarma, Khaled Elmeleegy, Scott Shenker and Ion Stoica. EuroSys 2010.
Cloud Security