Welcome to graduate operating systems

Welcome to graduate operating systems! This course will cover an exciting range of materials from the broad field of operating systems, including basic operating system structure, communication, memory management, reliability, file systems and storage, virtual machines, security, and manageability. We will examine influential historical systems and important current efforts, extracting lessons on both how to build systems and how to evaluate them.

Introductions

Say who you are, why you are here (what in particular you would like to learn about), and an outside interest.

Long view of operating systems:

1. What are they?

2. What do they do?

3. How are they built?

4. Why do we have Linux and Windows?

5. What services do they provide?

6. What are core problems?

7. What are the general techniques for solving these problems?

8. How do you evaluate them?

Class Organization

Four sections:

1. OS organization. What is the right organization: kernels, protection, extensibility

2. Performance. Memory mgmt, scheduling, communication, storage

3. Reliability: clusters, fault tolerance

4. Security: authentication, authorization, trusted computing

Question: are there particular topics people want to spend more time on?

Question: my nature is to look back at the original papers on a topic - more meat. We can sacrifice some breadth of knowledge to read a modern paper if people would like.

Note: Systems seminar Monday afternoons at 4 pm - organizational meeting next week.

Project

In this course, you will be completing two projects: a warm-up project and a main project. More information is available on the project page. The main project will be done in groups of 2 or 3, with the same grade for all participants.

Exams

There will be two midterm exams, each one covering one half of the semester and of equal grading weight. You should be familiar with the papers and be able to answer the questions from the reading section about those papers.

Grading

· 25% Reading and class participation

· 15% First exam

· 15% Second exam

· 10% Warm-up Project

· 35% Project

Late assignments will be docked 10% per day late. If you know you will be gone for a conference or interview, contact me early. In general, I give enough time for assignments that they can be done early.

Reading

The reading schedule for this course will be intense. The readings are grouped into four major categories: complexity, performance, reliability, and security. Within each category, we will read papers that address that issue in the context of different portions of an operating system.

You should form a discussion group for talking about the papers. You should have three or four people in your group, and discuss each paper sometime before class meets.

It is up to you if you want to meet just once a week or twice a week before each class, but you should discuss each paper. When you have formed a group, please send me email with a list of group members.

When multiple papers are due, you need only write one review, but it is up to you which one you write.

As you read and discuss each paper, here are some questions you should consider:

1. What problem are the authors trying to solve?

· Why was the problem important?

· Why was the problem not solved by earlier work?

2. What is the authors' solution to the problem?

· How does their approach solve the problem?

· How is the solution unique and innovative?

· What are the details of their solution?

3. How do the authors evaluate their solution?

· What specific questions do they answer?

· What simplifying assumptions do they make?

· What is their methodology?

· What are the strengths and weaknesses of their solution?

· What is left unknown?

4. What do you think?

· Is the problem still important?

· Did the authors solve the stated problem?

· Did the authors adequately demonstrate that they solved the problem?

5. What future work does this research point to?

You should be prepared to discuss these questions in class.

1. Good approach:

i. Skim paper for major ideas

ii. Read in depth to understand:

1. What was the motivation of the paper?

2. What are the key problems they are solving?

3. What is their solution? How does it really work?

4. How do they evaluate their solution?

iii. Things to think about:

1. What are the assumptions? Are they valid?

2. Many papers are historic; where did they go wrong? Why?

Responsibilities

For every lecture, you will be reading one or two research papers. For each paper, you have four responsibilities:

1. Read the paper

2. Discuss the paper with your group

3. Submit a short writeup about the paper

4. Prepare to summarize the paper in class

Writeup

Before 10:00 am on the day of class, please post your review to the blog at:
http://www.cs.wisc.edu/~cs736-1/blog
Your posting should contain:

1. A one or two sentence summary of the paper

2. A description of the problem they were trying to solve

3. A summary of the contributions of the paper

4. The one or two largest flaws in the paper

The writeup should not be more than a page in length. Late writeups will receive a zero grade.

Writeup Grading

What I’m looking for:

· Does the review include all sections (summary, problem, contributions, flaws, relevance)?

· Are all assertions backed up? (E.g., “X is a bad idea” is not acceptable, but “X is a bad idea because Y” is acceptable.)

· Is the review concise? The summary should be a few sentences and give the essence of the design in the paper, not the problem. (E.g., “This paper is about how to build a multiprocessor operating system” is not acceptable, but “This paper is about building a multiprocessor operating system by layering abstractions that mask the existence of multiple processors” is acceptable)

· Did the student understand the material? Are there factual flaws in the review? For example, if the paper defines a term, does the student use it appropriately? As another example, if students state that a paper is relevant because modern operating systems do things the same way, is that true?

Intro to OS:

1. What is an OS?

1. What is an OS?

1. (ASK CLASS)

2. Wikipedia: “In computing, an operating system (OS) is the system software responsible for the direct control and management of hardware and basic system operations. Additionally, it provides a foundation upon which to run application software such as word processing programs and web browsers.”

3. Dictionary.com: “<operating system> (OS) The low-level software which handles the interface to peripheral hardware, schedules tasks, allocates storage, and presents a default interface to the user when no application program is running.”

4. Computerhope.com: “An Operating System, or OS, is a software program that enables the computer hardware to communicate and operate with the computer software. Without a computer Operating System, a computer would be useless.”

5. “The collection of functions which make available computing services to people who require them”

6. “The purpose of an OS is to provide an environment where a user can execute programs. The primary goal of an OS is thus to make the computer system convenient to use. A secondary goal is to use the computer hardware in an efficient manner.”

7. Alan Creak (prof, Auckland University) “The OS is the collection of functions, mostly implemented in software, that people want to use but which are not implemented anywhere else. In effect, the OS does everything that everyone wants, but no one else is prepared to do.”

8. Bill Wulf (Hydra): “An OS provides a ‘virtual machine’ which is more hospitable than the base hardware for two reasons: (a) it makes available certain ‘virtual resources’ such as files, directories, virtual memory, etc. absent from the base hardware. (b) it masks certain unpleasant hardware features, such as interrupts, from the user and maps them into more acceptable ones, such as P-V synchronization primitives.”

9. Bill Wulf (Hydra): an OS manages the physical resources of the computer, such as primary memory, processor, channels, etc. so as to improve their utilization

10. Peter Denning: “A computer system may be defined in terms of the various supervisory and control functions it provides for the processes created by its users: 1) creating and removing processes, 2) controlling progress of processes, 3) acting on exception conditions arising during the execution of a process, 4) allocating hardware resources among processes, 5) providing access to software resources, e.g. files, editors, compilers, assemblers …, 6) providing protection, access control, and security for information, and 7) providing interprocess communication where required. These functions must be provided by the system because they cannot be handled adequately by the processes themselves. The software that assists the hardware in implementing these functions is the OS.”

11. Major abstractions

i. Resource manager:

1. Efficiently multiplex cpu, memory, devices

ii. Protection

1. Isolate and safely share between programs

iii. Hardware services

1. Simplify access to cpu, memory, devices

1. E.g. automatic swapping vs. overlays

2. Stream/message based device access vs. dma/programmed i/o

iv. Higher-level services

1. Communication

1. Ordered, reliable message delivery vs. unreliable packets (TCP/IP)

2. Storage

1. Hierarchical, named storage vs. blocks (file systems)

3. Protection

1. Safe sharing / communication

4. Users / programs use abstract virtual machine instead of real hardware -> masks hardware differences

v. OS provides interfaces to:

1. Hardware (e.g. device drivers)

2. Applications (e.g. system calls)

3. User (e.g. cmd line, gui)

4. System manager (e.g. registry, scripts, .conf/.ini files)
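To make the "ordered, reliable message delivery vs. unreliable packets" contrast above concrete, here is a minimal sketch of one piece of that abstraction: turning packets that arrive out of order or duplicated into an in-order stream. The function name, the (sequence number, payload) packet shape, and the sample arrivals are all assumptions made for illustration; this is not how any particular TCP/IP stack is implemented, and it does not handle lost packets, which would need retransmission.

```python
def deliver_in_order(packets):
    """Return payloads in sequence order.

    packets: iterable of (seq, payload) pairs as they arrive from the
    network (hypothetical format); sequence numbers start at 0.
    Out-of-order packets are buffered; duplicates are dropped.
    """
    buffered = {}      # packets that arrived early, keyed by sequence number
    next_seq = 0       # next sequence number the application expects
    delivered = []
    for seq, payload in packets:
        if seq < next_seq or seq in buffered:
            continue   # duplicate of something already seen
        buffered[seq] = payload
        # Hand over every packet that is now contiguous with the stream.
        while next_seq in buffered:
            delivered.append(buffered.pop(next_seq))
            next_seq += 1
    return delivered


# Packets arrive reordered, with one duplicate.
arrivals = [(1, "b"), (0, "a"), (1, "b"), (3, "d"), (2, "c")]
result = deliver_in_order(arrivals)
print(result)
```

The application sees only the clean stream; the reordering and duplicate suppression are hidden behind the OS interface, which is the point of the higher-level service.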

2. Why are OS’s hard to build?

1. Managing competing demands under finite resources

2. Managing unknowable / unpredictable demands

3. Widely differing usages (e.g. Linux used on wristwatch, in Google cluster, on ASCI Q supercomputer, on desktop)

4. Balance performance and programmability (for app and kernel devs)

i. Abstract away complexity of hardware

ii. Maintain performance of hardware despite abstraction

3. Key themes in Operating system development

1. Complexity: simplifying, adding structure, or removing complexity from system to make it easier to understand / adapt

2. Performance: making existing applications run faster by removing unnecessary code / providing better access to hardware / better compensation for HW problems

3. Reliability: making it work under failure conditions

4. Security: preventing bad things from happening

5. Manageability: removing human element / reducing human element in operations

2. Brief history of OS

1. ENIAC - programmed by wires

1. Team does all maintenance, programming

2. No languages other than machine code (not even assembly)

2. Early 50s: card punches w/ libraries

1. Subroutines invented

2. device drivers

3. common library routines

3. Batch systems. Goal: maximize utilization

1. Originally: one job at a time from cards

1. Led to low utilization due to switching jobs

2. Solution: batching

1. Submit stacks of jobs, computer will automatically pick up next one and execute

3. later: spool into memory, high speed disk storage

1. Avoid need to batch things to tape

2. Avoid need to read from slow tape to get next job

4. Later: Context switching to overlap I/O with execution

5. Protection to prevent one job from colliding with another

6. The birth of modern OS: scheduling, memory management, protection, synchronization, communication

7. Led to interesting problems:

1. scheduling for throughput, turnaround time, etc.

2. Protection between jobs

3. File systems for storing data (as opposed to tape / cards)

4. Memory mgmt to prevent jobs from colliding

5. Shared libraries for memory savings

6. Paging / virtual memory to simplify programming
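As a rough illustration of why spooling and overlapping I/O with execution raise utilization, the sketch below compares two simplified models of a batch. It assumes each job has one input (I/O) phase followed by one CPU phase, a single I/O channel, and a single CPU; the function names and job numbers are made up for the example and do not describe any historical system.

```python
def sequential_makespan(jobs):
    """One job at a time: the CPU idles during every job's I/O phase."""
    return sum(io + cpu for io, cpu in jobs)


def overlapped_makespan(jobs):
    """Input for the next job is read while the current job computes."""
    io_done = 0    # time at which the I/O channel finishes the current input
    cpu_done = 0   # time at which the CPU finishes the current job
    for io, cpu in jobs:
        io_done += io                             # inputs are read back to back
        cpu_done = max(cpu_done, io_done) + cpu   # wait for input and for the CPU
    return cpu_done


def utilization(jobs, makespan):
    """Fraction of the total elapsed time the CPU spends computing."""
    return sum(cpu for _, cpu in jobs) / makespan


# (io_time, cpu_time) per job, in arbitrary time units
jobs = [(4, 2), (3, 5), (2, 3)]
seq = sequential_makespan(jobs)
ovl = overlapped_makespan(jobs)
print(seq, round(utilization(jobs, seq), 2))
print(ovl, round(utilization(jobs, ovl), 2))
```

In this toy batch the same work finishes sooner and the CPU is busy for a larger share of the time once input is overlapped with computation.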

4. Time sharing

1. For interactive computing, developed at Dartmouth and MIT

2. Key idea: humans don't keep a computer 100% utilized

1. Spend time thinking, talking, drinking coffee, can't type that fast

3. Idea: split time into small pieces, give one piece to each user - illusion that each person has a computer to themselves

4. Relies on connected terminals as compared to a single master console

5. IBM computers couldn't do it for years

6. Led to many key problems

1. Scheduling for interactive users

2. Security: sharing / protection between users
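The time-slicing idea above (split time into small pieces and hand one piece to each user in turn) can be sketched as a round-robin simulation. This is a minimal sketch under assumptions: the user names, CPU demands, and quantum are invented, context-switch cost is ignored, and real time-sharing schedulers are considerably more elaborate.

```python
from collections import deque


def round_robin(demands, quantum):
    """Simulate round-robin time slicing on one CPU.

    demands: dict mapping user name -> total CPU time needed (hypothetical).
    quantum: length of one time slice.
    Returns (schedule, finish) where schedule is a list of
    (user, start, end) slices and finish maps user -> completion time.
    """
    queue = deque(demands.items())
    clock = 0
    schedule = []
    finish = {}
    while queue:
        user, remaining = queue.popleft()
        run = min(quantum, remaining)          # run for one slice or until done
        schedule.append((user, clock, clock + run))
        clock += run
        remaining -= run
        if remaining > 0:
            queue.append((user, remaining))    # back of the line for another turn
        else:
            finish[user] = clock
    return schedule, finish


schedule, finish = round_robin({"alice": 3, "bob": 5, "carol": 2}, quantum=2)
for user, start, end in schedule:
    print(f"{start:2d}-{end:2d} {user}")
```

Every user gets the CPU within a couple of slices of asking for it, which is what creates the illusion of a private machine even though the total work still takes the same CPU time.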

5. Mini computers

1. Started off recapitulating mainframe OS

2. Started having interesting operating systems by 1973 w/ Unix and VMS

1. Once they got powerful enough, enough applications

3. Use of high-level languages for writing operating systems

6. Big systems

1. SAGE (Semi-Automatic Ground Environment) - air defense, real time system for monitoring attacking bombers, built in 50s

1. 55,000 tubes, 3 megawatts, 275 tons per computer, 23 computers

2. 1 million lines of code, 7000 programmers

3. 8-12 billion dollars in 1964, more than Manhattan project

4. Display moving targets on operators' screens

2. Sabre - American Airlines reservation system. Could handle thousands of transactions on a very small machine over dialup lines

3. Multics - vision of a computer utility. One big computer (or more) per city served by dialup lines

1. Designed to serve 100's of users on a 386-class machine

4. OS/360

1. Years late

2. 5000 person years of work

3. Millions of dollars

4. Birth of complexity in OS

7. Micro computers

1. Started having interesting operating systems in 1993 with Windows NT

2. Big ideas:

1. single-user computer (no need for protection / sharing / security)

2. User interface / graphics is key -- need OS support

3. Mini computer OS + micro computer size --> high end workstations: SUN, Silicon graphics, etc. based on Unix

4. Linux popularized Unix features on PC-class machines (but MS sold Xenix, a version of Unix for PCs, much earlier)

8. Networking

1. Driven by: ARPANET in the early 70s, Ethernet in the late 70s

2. Unsolicited packets

3. Very different device model from block, character

4. Led to distributed systems

1. Early goal: virtual mainframe. Tie together network of computers to make one big computer

1. E.g. ps lists all processes on all computers.

2. E.g. process migration to lightly loaded machine

2. Lots of interesting problems due to network:

1. Lack of synchronized time

2. Failure-independence (no fate sharing)

3. Different trust boundaries (e.g. may not trust network)

3. Later ended up being distributed applications (web, file servers, email) + stand alone OS + networking

9. Multiprocessing

1. Multithreading

2. Synchronization in usermode

10. Multimedia

1. Soft real-time scheduling

2. Streaming file/data access

11. Internetworking / world wide web

1. Security

2. Network configuration

12. Embedded computing

1. iPod, cellphone

2. Tickle Me Elmo

13. Virtual machines

1. Invented in 60's to provide time sharing on IBM OS that only did batch

2. Investigated in early 70's on minicomputers

3. Renaissance in 90's from Mendel Rosenblum at Stanford: built VM for MIPS / IRIX, then founded VMware for PC-based virtual machines

14. Real time computing

1. Factory automation

15. Fault tolerant computing

1. Banking / financial services

2. Air traffic control

3. Avionics / Automotive control

16. Sensor networks

1. Small - 8 KB RAM, 8 MHz processor, single battery to power it for days / weeks / months

17. Things to note:

1. Advances in operating systems are often driven by changes in underlying technology

2. OS technology often starts in mainframes, repeated in minicomputers, repeated in personal computers, repeated in PDAs, repeated in sensor networks

1. Multiprogramming

2. virtual memory

3. protection

4. networking