CS 736 – Fall 2006
Lecture 4 – Pilot
i. 64 bits – is it enough? 4 billion disks, 4 billion files?
ii. Context: disks at the time were 4 MB; computers had a life span of 5 years
i. Not for end-user programmer, but for Xerox systems programmers (not open system)
i. 4 types of systems
1. Designed to be used by designers: Alto, Unix
a. Do not try to incorporate all good ideas from the last few years
b. Do not try to scale arbitrarily; just to the size of the user community
2. Planned systems: designed to do everything, big goals: OS/360, Multics, Pilot
3. Replace portion of existing system: Linux
4. Research vehicles, never designed for use but instead to try ideas: Hydra
5. Small, uninteresting: don't push any ideas, no impact
i. Heavy focus on interfaces – allows co-development of implementation and clients
ii. Strong type checking of interfaces including revisions
1. More full recompilations
iii. Access to hardware
1. Need language support for writing an OS
a. Type coercion from HW types (e.g. exception record, IP packet) to language types
b. Need inline assembly for machine features not available through language
i. Single user
1. QUESTION:
a. How did the single-user design impact Pilot?
b. What were key techniques to reduce complexity of the OS?
i. One method for VM allocation / swapping / file I/O
ii. Single-language system
iii. Monitors for synchronization
iv. Manager/kernel pattern
v. No strong protection/security
vi. QUESTION: What about that decision?
2. QUESTION: Is single user a reasonable assumption any more? When? What about only 1 simultaneous user?
ii. Single address space
iii. No hardware protection
iv. Micro-coded processor – could put lots of functionality (e.g. scanning ready queue) into processor
v. Tightly integrated language & OS
1. Single language support – Mesa
2. Many OS features provided at language level – e.g. synchronization, scheduling
3. OS written in Mesa
4. QUESTION: Does this matter?
a. MODEL: all SW written by Xerox, not an open system
b. NOTE: MSR is doing this with Singularity, but at MSIL level (effectively language)
c. QUESTION: How does this impact security?
i. Defensive security: avoid accidental bugs
1. Not handle Trojan horses
2. QUESTION: is this a "flaw" or a "problem"?
a. There were no Trojans or viruses when this was written
b. There wasn't much of an internet
c. Single-user OSes of the time (Apple, IBM PC, Commodore, Atari, Northstar) had even less protection
ii. Cooperative programs
1. Don't need fair distribution of scarce resources
iii. QUESTION: Small kernel or large kernel?
i. Process
1. Like a thread
2. Directly invokes trusted code
ii. File
1. Numbered block-sized space on disk
2. Accessible only through memory mapping
iii. Volume
1. Logical collection of blocks that can be allocated for files
iv. Stream
1. Sequence of bytes
v. Space
1. Range of addresses that can be treated as a unit or subdivided
a. Swapping
i. File system
1. File = set of sequential pages
a. Named with a 64-bit unique identifier
i. = machine ID + Time
ii. QUESTION: Privacy implications?
iii. QUESTION: Performance implications? How long does clock run for? How fast can you create files?
iv. REFERENCE: UUIDs on Windows
v. NOTE: like Unix inodes but w/o directories. Add directories on top
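The performance questions above can be made concrete with a little arithmetic. The sketch below assumes, purely for illustration, an even split into a 32-bit machine ID and a 32-bit timestamp; the notes give neither the actual split nor the clock resolution, and the function names are hypothetical.

```python
# Hypothetical layout (an assumption, not from the notes):
# high 32 bits = machine ID, low 32 bits = timestamp in clock ticks.
MACHINE_BITS = 32
TIME_BITS = 32
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def make_file_id(machine_id: int, timestamp: int) -> int:
    """Pack a machine ID and a timestamp into one 64-bit identifier."""
    if not 0 <= machine_id < 2 ** MACHINE_BITS:
        raise ValueError("machine ID does not fit")
    if not 0 <= timestamp < 2 ** TIME_BITS:
        raise ValueError("clock has wrapped; IDs would no longer be unique")
    return (machine_id << TIME_BITS) | timestamp


def split_file_id(file_id: int) -> tuple:
    """Recover (machine_id, timestamp) from an identifier."""
    return file_id >> TIME_BITS, file_id & (2 ** TIME_BITS - 1)


def years_until_wrap(ticks_per_second: float) -> float:
    """How long the timestamp field lasts at a given clock resolution."""
    return 2 ** TIME_BITS / (ticks_per_second * SECONDS_PER_YEAR)
```

Under these assumptions the trade-off is direct: a finer clock allows more file creations per second per machine but wraps sooner. `split_file_id` also shows the privacy concern, since anyone holding an ID can read off which machine created the file and when.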
b. Flat name space – directories provided at higher level as a naming feature
c. QUESTION: What is consequence?
i. QUESTION: Why?
ii. QUESTION: do you need file names?
iii. QUESTION: how else could you group things besides directories?
d. Has attributes
i. Size – in pages
1. NOTE: not byte oriented, but record oriented
ii. Type = tag of type for invoking recovery
iii. Permanence – distinguishes unnamed temp files from files in directories
iv. Immutability – indicates file never changes, can be cheaply replicated
e. QUESTION: Why this set? Why name files with IDs?
i. Why is naming provided elsewhere?
1. Only needed on file map operation
2. Common operations (swap in / swap out) deal only with file handles
3. Without need for protection, directories are pure naming, might as well handle at a higher layer
ii. ISSUE: how do you do a safe update (e.g. pre-write files elsewhere, then swap, then delete)?
f. QUESTION: what is impact of having flat file space, directories are separate?
i. A: makes things like ls -l slow; NTFS, by contrast, puts attributes in the directory
g. Memory mapped vs. read-write calls
i. QUESTION: What is benefit of mem-mapped? What about read/write calls?
ii. Memory mapped good for code, limited-size data files with lots of random access
iii. Read/write good for large files (bigger than address space) of unknown size
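The two access styles compared above can be sketched with Python's standard `mmap` module. This is only an illustration on a modern OS, not Pilot's interface; the 512-byte page size and the helper names are assumptions made for the demo.

```python
import mmap
import os
import tempfile

PAGE = 512  # hypothetical page size for the demo, not Pilot's


def make_file(pages: int) -> str:
    """Create a temp file in which every byte of page i holds the value i."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        for i in range(pages):
            f.write(bytes([i % 256]) * PAGE)
    return path


def first_byte_mapped(path: str, page_no: int) -> int:
    """Memory-mapped style: the file looks like memory; index it directly."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return m[page_no * PAGE]


def first_byte_read(path: str, page_no: int) -> int:
    """Read/write style: explicit seek and copy into a buffer."""
    with open(path, "rb") as f:
        f.seek(page_no * PAGE)
        return f.read(1)[0]
```

The mapped version needs the whole file to fit in the address space, which is why it suits code and bounded data files; the read version touches only a buffer-sized window and so works for files of any size.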
2. Volume = logical set of blocks
a. Can be a portion of a disk or multiple disks
b. Can be mounted / unmounted
3. Reliability
a. Problem: consistency of FS after crash
b. QUESTION: what is solution?
i. Solution: redundant information for checking failures
1. Label blocks with identifiers
2. File map maps identifiers to file blocks
3. Can reconstruct map by scanning disk
c. COMMENT: general reliability approach here: redundancy can be used to verify / reconstruct information
d. QUESTION: what else? Why make fsck easier? Why not soft updates / journaling?
i. A: disks small, fsck is pretty fast on a small disk
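The reliability scheme above (labelled blocks, a file map that can be rebuilt by scanning) can be sketched as follows. The in-memory "disk" and the label layout `(file_id, page_no)` are assumptions for illustration; the notes do not specify the on-disk format.

```python
from collections import defaultdict


def rebuild_file_map(disk):
    """Scan every block and rebuild {file_id: {page_no: block_no}} from labels.

    `disk` is a list of blocks; a free block is None, a used block is a dict
    whose "label" entry is the redundant (file_id, page_no) pair.
    """
    file_map = defaultdict(dict)
    for block_no, block in enumerate(disk):
        if block is None:
            continue
        file_id, page_no = block["label"]
        file_map[file_id][page_no] = block_no
    return dict(file_map)


def label_matches(disk, file_map, file_id, page_no):
    """Check a map entry against the label stored in the block itself."""
    block_no = file_map[file_id][page_no]
    block = disk[block_no]
    return block is not None and block["label"] == (file_id, page_no)
```

The map is just a performance structure here: losing it costs one full scan, and a stale map entry is caught because the label in the block disagrees with it.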
ii. Virtual Memory
1. QUESTION: What is key abstraction? What does it do? What does it unify?
2. QUESTION: Why do you need this level of control? How might you use it?
3. NOTE: used for self-management of VM, swapping by application (separation of policy/mechanism)
4. NOTE: machine address space is 2^32 words = 8 GB. Probably had 1 MB of memory, so plenty of address space to reduce fragmentation
5. Space
a. Unit of mapping
i. Attach to backing file
ii. Used for file I/O (no read/write)
iii. Read/write file privileges map onto memory access
iv. Only one space in a hierarchy can be mapped
v. SEE COMMENT ABOVE
b. Unit of swapping
i. Can do page-sized for demand paging, or segment-sized
ii. QUESTION: why does demand paging lead to thrashing? What is the alternative?
iii. Can map at one level, but swap at another
c. Hierarchically organized
i. Spaces created by sub-allocating existing spaces
ii. Only one parent of a byte can provide mapping to disk
d. IDEA: supporting hinting ***
i. Can tell the OS what data is grouped together
1. QUESTION: is this good? Should it be mandatory? Can programmers do this? Why not present any more?
2. QUESTION: is this applicable to a multi-user OS?
ii. Can advise OS that space is coming or going with Space.Activate and Space.Deactivate
iii. COMMENT: This is another major idea in OS design:
1. Commonly, the OS doesn't have complete information about the future
2. Hints provide a way to tell the OS what is going to happen
3. A hint doesn't guarantee anything (e.g. it does not affect correctness) but allows the OS to make better decisions
a. e.g. flags to Windows CreateFile indicating access patterns, prefetching APIs
e. QUESTION: why do you need these?
i. Said demand paging led to thrashing
ii. Assume programmer provides hints to system by grouping pages together
6. Problem: Good VM system discourages miserliness with memory; makes it hard to build a small system because not painful enough
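The space rules above (sub-allocation, at most one mapped space covering any address, activate/deactivate as pure hints) can be modelled in a few lines. The class and method names are hypothetical, loosely echoing Space.Activate and Space.Deactivate from the notes; this is not Pilot's actual interface.

```python
class Space:
    """Toy model of a Pilot-style space: a range of virtual addresses."""

    def __init__(self, base, size, parent=None):
        self.base, self.size, self.parent = base, size, parent
        self.children = []
        self.backing_file = None  # set once the space is mapped
        self.active = False       # hint only; never affects correctness

    def subdivide(self, offset, size):
        """Create a subspace by sub-allocating part of this space."""
        if offset < 0 or size <= 0 or offset + size > self.size:
            raise ValueError("subspace must lie inside its parent")
        base = self.base + offset
        for c in self.children:
            if base < c.base + c.size and c.base < base + size:
                raise ValueError("subspaces must not overlap")
        child = Space(base, size, parent=self)
        self.children.append(child)
        return child

    def _ancestors(self):
        node = self.parent
        while node is not None:
            yield node
            node = node.parent

    def _descendants(self):
        for c in self.children:
            yield c
            yield from c._descendants()

    def map(self, file_id):
        """Attach a backing file; only one space covering a byte may be mapped."""
        for s in [self, *self._ancestors(), *self._descendants()]:
            if s.backing_file is not None:
                raise ValueError("an enclosing or enclosed space is already mapped")
        self.backing_file = file_id

    def activate(self):
        """Hint: this space will be needed soon (a real system could prefetch)."""
        self.active = True

    def deactivate(self):
        """Hint: this space is no longer needed (a real system could evict)."""
        self.active = False
```

Siblings map independently, so a client can map at one level of the hierarchy and give swapping hints at another, which is the separation the notes describe.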
iii. Streams / I/O
1. I/O devices present procedural interfaces unique to the device (not direct access to device registers)
2. Transducer converts an I/O device interface into a stream (think Unix pipe)
3. Filters add additional functionality onto pipe (e.g. compression, encryption, translation)
4. Comment: Filters mix poorly in an O-O system
a. Modules naturally have richer, procedural interface
b. Streams require parsing
5. COMMENT: see above. Didn't work well in a procedurally oriented system.
6. QUESTION: They say protection is not required except for devices used by the Pilot kernel. What do they mean?
a. No isolation between processes
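The transducer/filter structure above can be sketched with Python generators. The `Keyboard` device and its two procedures are invented for the example; only `zlib` is a real library.

```python
import zlib


class Keyboard:
    """Hypothetical device with its own procedural interface."""

    def __init__(self, keys):
        self._keys = list(keys)

    def key_ready(self):
        return bool(self._keys)

    def get_key(self):
        return self._keys.pop(0)


def keyboard_transducer(device):
    """Transducer: turn device-specific calls into a uniform byte stream."""
    while device.key_ready():
        yield device.get_key().encode("ascii")


def upper_filter(stream):
    """Filter: stream in, stream out, translating the data on the way."""
    for chunk in stream:
        yield chunk.upper()


def compress_filter(stream):
    """Filter: compress whatever flows through."""
    compressor = zlib.compressobj()
    for chunk in stream:
        out = compressor.compress(chunk)
        if out:
            yield out
    yield compressor.flush()
```

Filters compose like a Unix pipeline, for example `compress_filter(upper_filter(keyboard_transducer(kb)))`. The cost noted above is also visible: everything past the transducer sees only bytes, so any structure the device interface had must be re-parsed downstream.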
iv. Communication
1. layered network / sockets / NetworkStreams
2. QUESTION: do you need better resource allocation as a result?
a. A: not really. Pilot is seen as the client OS, not for running servers with multiple clients
b. Used for client-server, not distributed computing
i. What is basic IPC mechanism / coordination mechanism in Pilot?
1. A: procedure calls, like Hydra
ii. Duality of OS Structures
1. Message Oriented: Unix, nucleus
a. Small, static # of processes, static communication paths
b. Limited direct sharing
c. Identification of address space with a process
d. Specific communication paths, e.g. sockets, channels, ports
e. Synchronization takes place with message queues
f. Shared data structures passed as messages
g. Priorities assigned at system level based on processes
h. Processes operate sequentially on a small number of requests
2. Procedure oriented: Hydra, Pilot
a. Large number of small processes
i. Easy, cheap – no communication channels to set up & tear down
b. Communication by direct sharing and interlocking
c. Context of execution identified with the function, not the process
d. Process has a single goal/task, but may wander all over the system as it calls procedures to do its work
e. Synchronization/queueing on data structures
f. Priority comes from execution context, not process
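The two structures above can be illustrated by writing the same shared counter both ways. This is a minimal sketch using Python threads; the class names are hypothetical and neither version is meant to resemble Unix or Pilot code.

```python
import queue
import threading


class MessageCounter:
    """Message-oriented: one server thread owns the data; clients send requests."""

    def __init__(self):
        self._requests = queue.Queue()
        self._thread = threading.Thread(target=self._serve, daemon=True)
        self._thread.start()

    def _serve(self):
        count = 0  # private to the server; never shared directly
        while True:
            amount, reply = self._requests.get()
            if reply is None:  # shutdown message
                break
            count += amount
            reply.put(count)

    def add(self, amount):
        reply = queue.Queue(maxsize=1)  # per-request reply channel
        self._requests.put((amount, reply))
        return reply.get()

    def stop(self):
        self._requests.put((0, None))
        self._thread.join()


class MonitorCounter:
    """Procedure-oriented: callers run the procedure themselves under a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._count = 0

    def add(self, amount):
        with self._lock:  # monitor entry: synchronization lives on the data
            self._count += amount
            return self._count


def hammer(counter, workers=4, per_worker=100):
    """Drive either counter from several threads and return the final value."""
    def work():
        for _ in range(per_worker):
            counter.add(1)
    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.add(0)
```

Both versions expose the same `add` call and give the same result; they differ in where requests queue (a message queue versus a lock on the data) and in whose thread executes the update.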
iii.
1. Tightly coupled processes:
a. Procedure oriented: procedures operate on shared data with monitors
2. Loosely coupled:
a. Network sockets
i. Used for networked systems or local systems when services may be decoupled
ii. Predates Berkeley Unix sockets; similar organization
b. Abstractions – layers
i. QUESTION: What were layers?
ii. QUESTION: Why these layers?
1. Distinguish between hardware level & level visible to OS
2. Distinguish between datagrams and streams
iii. Layer 0 = link layer (e.g. Ethernet packet)
iv. Layer 1 = routing – global addresses
v. Layer 2 = services – reliable communication
c. NetworkStream Operations
i. QUESTION: What problem is this solving?
1. A: what is the programming model for accepting connections & processing requests
ii. Listen – listen on a local port
iii. Accept – change caller to use a different port
1. Instead of using 4-tuple as in internet
2. COMMENT: missing virtualization here, so needed to change destination address
d. COMMENT: separates task of connecting from processing requests
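The listen/accept behaviour described above, where each accepted connection moves to its own port instead of being distinguished by a 4-tuple, can be modelled without a real network. Everything here (class names, the port numbers, the `pending` list) is a toy invented for illustration, not Pilot's NetworkStream interface.

```python
import itertools


class Connection:
    """One accepted conversation, bound to its own server-side port."""

    def __init__(self, client, local_port):
        self.client = client
        self.local_port = local_port


class Listener:
    """Waits on a well-known port and hands each caller off to a fresh port."""

    def __init__(self, host, port):
        self.host, self.port = host, port
        self.pending = []  # client addresses that have contacted this port

    def accept(self):
        client = self.pending.pop(0)
        private_port = next(self.host.private_ports)
        conn = Connection(client, private_port)
        # The caller must now address private_port; the well-known port
        # stays free for new callers.
        self.host.ports[private_port] = conn
        return conn


class Host:
    """Toy endpoint table: the destination port alone selects the handler."""

    def __init__(self):
        self.ports = {}
        self.private_ports = itertools.count(3000)  # arbitrary start for the sketch

    def listen(self, well_known_port):
        listener = Listener(self, well_known_port)
        self.ports[well_known_port] = listener
        return listener
```

Because the table is keyed by destination port only, two clients cannot share the well-known port for ongoing traffic, which is why accept has to redirect the caller. With 4-tuple demultiplexing the redirect is unnecessary.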
iv. Language support
1. QUESTION: How did Mesa impact the OS?
2. Implement code as co-routines; similar to pipes in Unix
3. Programs consist of multiple processes
a. Cost of fork ~ 30x worse than a procedure call
4. Debugging support via world swap
a. Save whole system state to disk
v. Implementation
1. QUESTION: What is the problem solved in their organization?
a. How do you provide low-level services for the kernel's own use?
b. QUESTION: And what is the solution?
c. NOTE: paged kernel is new!!!
2. Manager / kernel pattern
a. Like policy/mechanism separation but no goals for multiple OSs
b. Low level kernel code allows basic facilities to be available to kernel code
i. Paging, swapping
ii. Handles frequent operations fast
iii. Handles operations within the working set
c. Higher-level manager provides extended services / special cases
i. Invokes kernel to do low-level work
ii. E.g. creating a process, mapping a file
3. Example: swapping caches
a. Cache mappings of pages to swap units for low-level swapping
b. If miss in cache, invoke high-level manager, which can be swapped out
c. Pin pages in cache needed to bring in manager
4. COMMENT: general concept here: caching
a. Cache common case for working set
b. Pin entries needed to bring in full code
c. COMMENT: for complexity, allows the fast path to do just the simple thing.
d. COMMENT: for complexity, allows higher-level code to rely on lower-level services
e. COMMENT: for performance, consult cache. Access full data structure only when necessary & for completeness/correctness
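The caching pattern above can be sketched as a small fast-path table backed by a slow-path manager. The names are hypothetical, and the FIFO eviction policy is an assumption chosen for brevity, not something the notes specify.

```python
class SwapUnitCache:
    """Kernel-side fast path: page -> swap unit, backed by a slower manager."""

    def __init__(self, manager, capacity, pinned=()):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._manager = manager    # slow path; may itself be swapped out
        self._capacity = capacity
        # Pinned entries cover the pages needed to bring the manager in,
        # so looking them up can never require the manager.
        self._pinned = {page: manager(page) for page in pinned}
        self._cache = {}           # dicts keep insertion order (FIFO here)
        self.misses = 0

    def lookup(self, page):
        if page in self._pinned:
            return self._pinned[page]
        if page in self._cache:    # common case: working set is cached
            return self._cache[page]
        self.misses += 1
        unit = self._manager(page)  # rare case: consult the full structure
        if len(self._cache) >= self._capacity:
            del self._cache[next(iter(self._cache))]  # evict oldest entry
        self._cache[page] = unit
        return unit
```

Correctness never depends on the cache contents, only speed does. The single hard requirement is that the pinned set be large enough to reach the manager, otherwise a miss could recurse.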
5. Example processes:
a. Bootstrap with one process
b. Kernel provides primitives for synchronization (wait, condition, notify)
c. Manager provides higher-level operations (fork)
i. Cooperating processes, global policies other than fairness
1. E.g. systems generally give better responsiveness to processes on top of the window stack
ii. No security
1. Just protection in the language so you can't accidentally trash other programs
i. Procedure-based communication, like Hydra
ii. OS implemented in HLL to lowest level (interrupt delivery)
iii. Simple abstractions:
1. Spaces, streams, monitors, processes