Reading List

    Complexity

    Historical Perspective

  1. Edsger W. Dijkstra. The Structure of the "THE" Multiprogramming System. Communications of the ACM 11(5), May 1968.
  2. Per Brinch Hansen. The Nucleus of a Multiprogramming System. Communications of the ACM 13(4), April 1970.
  3. Dennis M. Ritchie and Ken Thompson. The UNIX Time-Sharing System. Communications of the ACM 17(7), July 1974.
  4. David D. Redell, Yogen K. Dalal, Thomas R. Horsley, Hugh C. Lauer, William C. Lynch, Paul R. McJones, Hal G. Murray, and Stephen C. Purcell. Pilot: An Operating System for a Personal Computer. Communications of the ACM 23(2), February 1980, pp. 81-92.
  5. Lauer, H. C. Observations on the development of an operating system. In Proceedings of the Eighth ACM Symposium on Operating Systems Principles (Pacific Grove, California, United States, December 14 - 16, 1981). SOSP '81.
    Further reading:
  6. W. Wulf, E. Cohen, W. Corwin, A. Jones, R. Levin, C. Pierson, and F. Pollack. HYDRA: The Kernel of a Multiprocessor Operating System. Communications of the ACM 17(6), June 1974, pp. 337-344.
  7. R. Levin, E. Cohen, W. Corwin, F. Pollack, and W. Wulf. Policy/Mechanism Separation in Hydra. Proc. of the 5th Symposium on Operating Systems Principles, November 1975, pp. 132-140.
    Modern OS Structures

  9. Jeffrey Chase, Henry Levy, Michael Feeley, and Edward Lazowska. Sharing and Protection in a Single Address Space Operating System. ACM Trans. on Computer Systems, November 1994.
  10. Dawson R. Engler and M. Frans Kaashoek. Exterminate All Operating System Abstractions. Fifth Workshop on Hot Topics in Operating Systems (HotOS-V), Orcas Island, Washington, May 1995.
  11. Silas Boyd-Wickizer, Haibo Chen, Rong Chen, Yandong Mao, Frans Kaashoek, Robert Morris, Aleksey Pesterev, Lex Stein, Ming Wu, Yuehua Dai, Yang Zhang, and Zheng Zhang. Corey: an operating system for many cores. To appear in OSDI 2008.
  12. Galen Hunt and James Larus. Singularity: Rethinking the Software Stack. Operating Systems Review, Vol. 41, Iss. 2, pp. 37-49, April 2007.
  13. Paul Barham, Boris Dragovic, Keir Fraser, Steven Hand, Tim Harris, Alex Ho, Rolf Neugebauer, Ian Pratt, and Andrew Warfield. Xen and the Art of Virtualization. Proc. of the 19th ACM Symp. on Operating Systems Principles, October 2003.
    Extra Material:

    Performance

    Memory Management

  14. Peter J. Denning. The working set model for program behavior. Communications of the ACM 11(5), May 1968, pp. 323-333.
  15. Daley, R. C. and Dennis, J. B. 1968. Virtual memory, processes, and sharing in MULTICS. Commun. ACM 11, 5 (May. 1968), 306-312.
  16. Michael Young, Avadis Tevanian, Jr., Richard Rashid, David Golub, Jeffrey Eppinger, Jonathan Chew, William Bolosky, David Black, and Robert Baron. The Duality of Memory and Communication in the Implementation of a Multiprocessor Operating System. In 11th Symp. on Operating Systems Principles, pages 63--76, 1987.
  17. Carl Waldspurger. Memory Resource Management in VMware ESX Server. In Proceedings of the 5th Symposium on Operating Systems Design and Implementation, 2002.
    Extra Material:

    1. H. M. Levy and P. H. Lipman. Virtual memory management in the VAX/VMS operating system. Computer, 15(3):35--41, March 1982.
    2. Steven M. Hand. Self-paging in the Nemesis operating system. In Proceedings of the Third Symposium on Operating Systems Design and Implementation, pages 73-86, 1999.

    Communication

  18. Andrew D. Birrell and Bruce Jay Nelson. Implementing Remote Procedure Calls. ACM Trans. on Computer Systems 2(1), February 1984, pp. 39-59.
  19. Brian N. Bershad, Thomas E. Anderson, Edward D. Lazowska, and Henry M. Levy. Lightweight Remote Procedure Call. ACM Trans. on Computer Systems 8(1), February 1990, pp. 37-55.
  20. Thorsten von Eicken, Anindya Basu, Vineet Buch, and Werner Vogels. U-Net: A User-Level Network Interface for Parallel and Distributed Computing. In Proceedings of the 15th ACM Symposium on Operating Systems Principles, Copper Mountain Resort, Colorado, December 1995, pp. 40-53.
    Extra Material:

    1. Michael D. Schroeder, Andrew D. Birrell and Roger M. Needham. Experience with Grapevine: The Growth of a Distributed System. ACM Trans. on Computer Systems, 2(1), February 1984.

    Scheduling

  21. Thomas Anderson, Brian Bershad, Edward Lazowska, and Henry Levy. Scheduler Activations: Effective Kernel Support for the User-Level Management of Parallelism. ACM Trans. on Computer Systems 10(1), February 1992, pp. 53-79.
  22. C. Waldspurger and W. Weihl. Lottery Scheduling: Flexible Proportional-Share Resource Management. Proceedings of the First USENIX Symposium on Operating System Design and Implementation, November 1994.
  23. Banga, G., Druschel, P., Mogul, J. Resource Containers: A New Facility for Resource Management in Server Systems. Proceedings of the Third Symposium on Operating System Design and Implementation (OSDI-III), New Orleans, LA, February 1999, pages 45-58.
    Concurrency

  25. Butler W. Lampson and David D. Redell. Experience with Processes and Monitors in Mesa. Communications of the ACM 23(2), February 1980, pp. 105-117.
  26. Thomas Anderson, Edward Lazowska, and Henry Levy. The Performance Implications of Thread Management Alternatives for Shared-Memory Multiprocessors. IEEE Trans. on Computers 38(12), December 1989, pp. 1631-1644.
  27. Dave Dice, Hui Huang, Mingyao Yang. Asymmetric Dekker Synchronization.
  28. Dave Dice, Mark Moir, and William Scherer. Quickly Reacquirable Locks.
  29. Ulrich Drepper. Futexes Are Tricky. Jan. 2008.
  30. Christopher J. Rossbach, Owen S. Hofmann, Donald E. Porter, Hany E. Ramadan, Aditya Bhandari, and Emmett Witchel. TxLinux: Using and Managing Transactional Memory in an Operating System. In Proc. of the 21st Symposium on Operating Systems Principles (SOSP), Oct. 2007.
    File Systems and Storage

  32. Marshall K. McKusick, William N. Joy, Samuel J. Leffler, and Robert S. Fabry. A Fast File System for UNIX. ACM Trans. on Computer Systems 2(3), August 1984, pp. 181-197.
  33. Mendel Rosenblum and John K. Ousterhout. The Design and Implementation of a Log-Structured File System. ACM Trans. on Computer Systems 10(1), February 1992, pp. 26-52.
  34. Adam Sweeney, Doug Doucette, Wei Hu, Curtis Anderson, Mike Nishimoto, and Geoff Peck. Scalability in the XFS File System. Proceedings of the USENIX 1996 Technical Conference.
  35. John H. Howard, Michael L. Kazar, Sherri G. Menees, David A. Nichols, M. Satyanarayanan, Robert N. Sidebotham, and Michael J. West. Scale and Performance in a Distributed File System. ACM Trans. on Computer Systems 6(1), February 1988, pp. 51-81.
  36. Russel Sandberg, David Goldberg, Steve Kleiman, Dan Walsh, and Bob Lyon. Design and Implementation of the Sun Network Filesystem. Proceedings of the Summer 1985 USENIX Conference, Portland OR, June 1985, pp. 119-130.
  37. David A. Patterson, Garth Gibson, and Randy H. Katz. A Case for Redundant Arrays of Inexpensive Disks (RAID). Proceedings of the 1988 ACM SIGMOD Conference on Management of Data, Chicago IL, June 1988.
  38. E. K. Lee and C. A. Thekkath. Petal: Distributed virtual disks. In Proc. 7th Int. Conf. Architectural Support for Programming Languages and Operating Systems (ASPLOS), pages 84--92, October 1996.
  39. Chandramohan Thekkath, Timothy Mann, and Edward Lee. Frangipani: A Scalable Distributed File System. Proc. of the 16th ACM Symposium on Operating Systems Principles, October 1997, pages 224-237.
  40. Garth A. Gibson, David F. Nagle, Khalil Amiri, Jeff Butler, Fay W. Chang, Howard Gobioff, Charles Hardin, Erik Riedel, David Rochberg and Jim Zelenka. A cost-effective, high-bandwidth storage architecture. Proceedings of the Eighth International Conference on Architectural Support for Programming Languages and Operating Systems, 1998, pp. 92--103.
  41. Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung. The Google File System. In 19th ACM Symposium on Operating Systems Principles, Lake George, NY, October 2003.
    Extra Material:

    1. Sun Microsystems ZFS
    2. Gibson, G., Nagle, D., Amiri, K., Chang, F., Feinberg, E., Gobioff, H., Lee, C., Ozceri, B., Riedel, E., Rochberg, D. and Zelenka, J. File Server Scaling with Network-Attached Secure Disks. In Proceedings of the ACM International Conference on Measurement and Modeling of Computer Systems (Sigmetrics), 1997.

    Reliability


    Principles

  42. Jim Gray. Why Do Computers Stop and What Can Be Done About It? Tandem Tech Report TR-85.7, June 1985.
    Approaches

  44. Robert Haskin, Yoni Malachi and Gregory Chan. Recovery Management in QuickSilver. ACM Trans. on Computer Systems 6(1), February 1988, pp. 82-108.
  45. Werner Vogels, Dan Dumitriu, Ken Birman, Rod Gamache, Mike Massa, Rob Short, John Vert, and Joe Barrera. The Design and Architecture of the Microsoft Cluster Service -- A Practical Approach to High-Availability and Scalability. In Proceedings of the Fault-Tolerant Computing Symposium, 1998.
  46. Y. Saito, B. Bershad and H. Levy. Manageability, Availability and Performance in Porcupine: A Highly Scalable Internet Mail Service. Proc. of the 17th ACM Symp. on Operating Systems Principles, Dec. 1999.
  47. Michael M. Swift, Brian N. Bershad, and Henry M. Levy. Improving the Reliability of Commodity Operating Systems. In Proceedings of the 19th ACM Symposium on Operating Systems Principles, Oct. 2003.
  48. Kinshuk Govil, Dan Teodosiu, Yongqiang Huang, and Mendel Rosenblum. Cellular Disco: resource management using virtual clusters on shared-memory multiprocessors. In Proceedings of 17th Symposium on Operating Systems Principles, 1999.
  49. Robert Wahbe, Steven Lucco, Thomas E. Anderson, and Susan L. Graham. Efficient Software-Based Fault Isolation. In Proc. of the 14th ACM Symposium on Operating Systems Principles (SOSP '93), pages 203--216, December 1993.
  50. Anita Borg, Wolfgang Blau, Wolfgang Graetsch, Ferdinand Herrmann, and Wolfgang Oberle. Fault tolerance under UNIX. ACM Transactions on Computer Systems, 7(1), February 1989.
  51. T. C. Bressoud and F. B. Schneider. Hypervisor-based fault tolerance. In Proceedings of the Fifteenth ACM Symposium on Operating Systems Principles, pp. 1-11, Dec. 1995.
    Extra Material:

    1. Nancy P. Kronenberg, Henry M. Levy and William D. Strecker. VAXclusters: A Closely-Coupled Distributed System. In ACM Trans. Comput. Syst., 4(2), 1986.
    2. Dawson Engler, David Yu Chen, Seth Hallem, Andy Chou, and Benjamin Chelf. Bugs as deviant behavior: a general approach to inferring errors in systems code. In Proceedings of the 18th ACM Symposium on Operating Systems Principles, pages 57--72, Banff, Alberta, Canada, October 21--24, 2001.
    3. Andy Chou, Junfeng Yang, Benjamin Chelf, Seth Hallem, and Dawson Engler. An empirical study of operating systems errors. In Proceedings of the 18th ACM Symposium on Operating Systems Principles (SOSP18), pages 73--88, October 2001.
    4. Dawson Engler, Benjamin Chelf, Andy Chou, and Seth Hallem. Checking System Rules Using System Specific, Programmer-Written Compiler Extensions. Proceedings of the Fourth Symposium on Operating Systems Design and Implementation, San Diego, CA, October 2000.
    5. Frank Schmuck and Jim Wyllie. Experience with Transactions in QuickSilver. Proceedings of the 13th ACM Symposium on Operating Systems Principles (Asilomar, CA; Oct. 1991).

    Security

  52. Saltzer, J. H. Protection and the control of information sharing in Multics. Commun. ACM 17, 7 (Jul. 1974), 388-402.
  53. Robert Morris and Ken Thompson. Password security: A case history. Communications of the ACM, 22(11):594--597, 1979.
  54. Roger M. Needham and Michael D. Schroeder. Using Encryption for Authentication in Large Networks of Computers. Communications of the ACM 21(12), December 1978, pp. 993-998.
  55. J. G. Steiner, C. Neuman and J. I. Schiller. Kerberos: An Authentication Service for Open Network Systems. USENIX '88, Dallas, TX, February 1988, pp. 191-202.
  56. Edward Wobber, Martín Abadi, Mike Burrows, and Butler Lampson. Authentication in the Taos Operating System. Proceedings of the 14th ACM Symposium on Operating System Principles, 1993.
  57. Jonathan M. McCune, Adrian Perrig, and Michael K. Reiter. Bump in the Ether: A Framework for Securing Sensitive User Input. In Proceedings of the USENIX Annual Technical Conference, June 2006.
  58. Tal Garfinkel, Ben Pfaff, Jim Chow, Mendel Rosenblum, and Dan Boneh. Terra: a virtual machine-based platform for trusted computing. In Proceedings of the nineteenth ACM symposium on Operating systems principles, pages 193--206. ACM Press, 2003.
  59. Ross J. Anderson. Why Information Security is Hard -- An Economic Perspective. In Proceedings of the Seventeenth Computer Security Applications Conference, IEEE Computer Society Press (2001), pp. 358--365.
  60. Ken Thompson. Reflections on Trusting Trust. Communications of the ACM, vol. 27, pp. 761--763, August 1984.
    Extra Material:

    1. Butler Lampson. Protection. In Proceedings of the 5th Annual Princeton Conference on Information Sciences and Systems, 1971.
    2. Jerome H. Saltzer and Michael D. Schroeder. The protection of information in computer systems. Proceedings of the IEEE, 63(9):1278--1308, September 1975.
    3. Butler Lampson, Martín Abadi, Mike Burrows, and Edward Wobber. Authentication in distributed systems: Theory and Practice. In Proceedings of the 13th ACM Symposium on Operating System Principles, pages 165-182, 1991.

    Manageability


  61. R. Chandra, N. Zeldovich, C. Sapuntzakis, and M. S. Lam. The Collective: A Cache-Based System Management Architecture. In Proceedings of the Second Symposium on Networked Systems Design and Implementation (NSDI 2005), pages 259-272, May 2005.
  62. Y. Saito, B. Bershad and H. Levy. Manageability, Availability and Performance in Porcupine: A Highly Scalable Internet Mail Service. Proc. of the 17th ACM Symp. on Operating Systems Principles, Dec. 1999.
  63. Armando Fox, Steven Gribble, Yatin Chawathe, Eric Brewer, and Paul Gauthier. Cluster-based Scalable Network Services. Proc. of the 16th ACM Symp. on Operating Systems Principles, October 1997, pp. 78-91.
  65. Jeffrey O. Kephart and David M. Chess. The vision of autonomic computing. IEEE Computer 36(1):41--52, 2003.
  66. David Patterson, Aaron Brown, Pete Broadwell, George Candea, Mike Chen, James Cutler, Patricia Enriquez, Armando Fox, Emre Kiciman, Matthew Merzbacher, David Oppenheimer, Naveen Sastry, William Tetzlaff, Jonathan Traupman, and Noah Treuhaft. Recovery Oriented Computing (ROC): Motivation, Definition, Techniques, and Case Studies. Technical Report CSD-02-1175, U.C. Berkeley, 2002.
  67. Yixin Diao, Joseph L. Hellerstein, Sujay Parekh, Rean Griffith, Gail Kaiser and Dan Phung. Self-Managing Systems: A Control Theory Foundation. 12th IEEE International Conference and Workshops on the Engineering of Computer-Based Systems (ECBS'05), pp. 441-448, 2005.
  68. Aaron B. Brown and David A. Patterson. Undo for operators: Building an undoable e-mail store. In Proc. USENIX Annual Technical Conference, San Antonio, TX, 2003.

    Great thoughts on systems research

  69. Jim Waldo. On System Design. Sun Microsystems, 2006.
  70. Butler W. Lampson. Hints for Computer System Design. ACM Operating Systems Rev. 15, 5 (Oct. 1983), pp. 33-48. Reprinted in IEEE Software 1, 1 (Jan. 1984), pp. 11-28.
  71. Roy Levin and David D. Redell. An Evaluation of the Ninth SOSP Submissions, or, How (and How Not) to Write a Good Systems Paper. ACM SIGOPS Operating Systems Review, Vol. 17, No. 3 (July, 1983), pages 35-40.