Mach
i. One of the first OSes designed to be compatible with another OS (BSD Unix)!
i. Task = execution environment / address space / unit of resource allocation
ii. Thread = unit of CPU utilization
iii. Port == communication channel, a queue for messages protected by capabilities
1. Are basically capabilities you invoke by sending a message instead of dereferencing
2. Q: How do you get a port? From a name server or from your parent
iv. Message = typed collection of data, may contain ports
v. Memory object = collection of data provided for and managed by a server that can be mapped into an address space
i. For everything but messages, implemented by sending/receiving messages
ii. Indirection of messages allows a network to be interposed, either an SMP or a cluster / distributed system
iii. Integrated VM and IPC reduces performance overhead of IPC compared to shared memory; need not copy
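The port/message model above can be sketched as a toy simulation. This is not Mach's actual API (the class and function names here are illustrative); it just shows the key property that ports are message queues you can only touch through capabilities, and that messages may carry port capabilities:

```python
from collections import deque

class Port:
    """Toy model of a Mach port: a message queue protected by capabilities."""
    def __init__(self):
        self.queue = deque()

class Capability:
    SEND, RECEIVE = "send", "receive"
    def __init__(self, port, right):
        self.port, self.right = port, right

def send(cap, message):
    assert cap.right == Capability.SEND, "need a send right"
    cap.port.queue.append(message)

def receive(cap):
    assert cap.right == Capability.RECEIVE, "need a receive right"
    return cap.port.queue.popleft()

# Messages are typed collections of data and may carry port capabilities,
# which is how a task hands out access to new ports (name server / parent).
server_port = Port()
send_right = Capability(server_port, Capability.SEND)
recv_right = Capability(server_port, Capability.RECEIVE)
send(send_right, {"op": "lookup", "reply_to": Capability(Port(), Capability.SEND)})
msg = receive(recv_right)
```

Invoking a capability means sending a message on its port rather than dereferencing a pointer, which is what lets a network be interposed transparently.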
- address maps: sorted linked lists of map entries, each describing a
region, per task: protection + inheritance. Used for PF lookups,
copy/prot operations, allocation/deallocation of address ranges
- memory objects: units of backing storage: specifies resident pages +
where to find non-resident pages. Can be stored outside kernel
- Shadow objects shadow a memory object and contain COW pages
- share maps for explicitly shared memory (not COW) == layer of
indirection for an address map
- resident page table: current attribute for all physical pages
- pmaps: subset of pages visible to HW - can be thrown away any time
for efficiency or space. Is a cache of real info, can be
reconstructed at any time.
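A minimal sketch of the address-map structure described above, assuming illustrative field names (the real kernel keeps richer entries, e.g. inheritance and object offsets): a per-task sorted list of map entries, walked on a page fault to find the region covering an address.

```python
from dataclasses import dataclass

@dataclass
class MapEntry:
    start: int   # region start address
    end: int     # region end address (exclusive)
    prot: str    # current protection, e.g. "rw"
    obj: str     # backing memory object (illustrative name)

def lookup(address_map, addr):
    """Page-fault path: find the map entry covering addr, or None."""
    for e in address_map:          # entries kept sorted by start address
        if e.start <= addr < e.end:
            return e
        if e.start > addr:
            break                  # sorted list, so we can stop early
    return None

task_map = [MapEntry(0x1000, 0x3000, "rw", "heap"),
            MapEntry(0x8000, 0x9000, "r",  "libc_text")]
entry = lookup(task_map, 0x2abc)
```

The same walk serves copy/protection operations and allocate/deallocate of ranges; the pmap is just a cache in front of this machine-independent truth.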
Machine independent structures: not tied to hardware table layout:
inverted, normal, hierarchical, virtual, software TLB
Machine dependent: a subset of information just for hw use.
Physical memory == cache of all memory objects.
i. Allocate / deallocate
ii. Set protection / inheritance status
iii. Create & manage a memory object for other tasks
iv. Optimizations
1. Read/write sharing and COW sharing
a. Whole address space can be sent with no copying!
b. E.g. used for Unix FORK
c. Implemented with shadow map that specifies real map to receive page from on fault
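The shadow-object mechanism in (c) can be sketched as a toy model (class names are illustrative, not Mach's API): the shadow holds only privately written pages, and read faults fall through to the original object, which stays unmodified.

```python
class MemoryObject:
    def __init__(self, pages):
        self.pages = dict(pages)   # page number -> contents

class ShadowObject:
    """COW layer: private copies of written pages, backed by the original."""
    def __init__(self, backing):
        self.backing = backing     # the shadowed memory object
        self.pages = {}            # pages copied on write

    def read(self, pageno):
        # Read fault: prefer the private copy, else the backing object.
        if pageno in self.pages:
            return self.pages[pageno]
        return self.backing.pages[pageno]

    def write(self, pageno, data):
        # Write fault: copy the page into the shadow, then modify it.
        self.pages[pageno] = data

orig = MemoryObject({0: "parent data"})
child = ShadowObject(orig)         # e.g. the child's view after a Unix fork
child.write(0, "child data")
```

This is why a whole address space can be "sent" with no copying: only pages that are actually written ever get duplicated.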
v. Protection
1. Can set current protection – in use by hardware, and maximum – limit to which it can be lowered (e.g. prevent making it writable)
vi. Processor independent
1. BIG IDEA: caching
2. Maintain cache of processor-dependent translations in PMAP; go to independent structures on PMAP fault
3. PMAP corresponds to hardware TLB or page table
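The pmap-as-cache idea can be shown in a few lines (names are illustrative): on a pmap miss, consult the machine-independent map and refill; since the pmap holds no unique information, it can be discarded at any time and rebuilt on the next fault.

```python
# Machine-independent map: the authoritative virtual -> (frame, prot) truth.
mi_map = {0x1000: ("frame7", "rw"), 0x2000: ("frame3", "r")}
pmap = {}                           # hardware-visible cache of translations

def pmap_fault(vaddr):
    """On a miss, refill the pmap from the machine-independent structures."""
    if vaddr in pmap:
        return pmap[vaddr]
    entry = mi_map.get(vaddr)
    if entry is not None:
        pmap[vaddr] = entry         # cache the translation for the HW
    return entry

pmap_fault(0x1000)
pmap.clear()                        # pmap may be thrown away any time...
translation = pmap_fault(0x1000)    # ...and reconstructed on the next fault
```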
i. Make it easy to send large objects
ii. Only copy data when necessary; otherwise re-use same data via sharing
iii. Allow external sources to manage data
iv. QUESTION: How easy is this to use? A: have to pack data onto a single page; still have to marshal/unmarshal. Mapped address may be different in different address spaces.
i. AS contains memory regions (ranges of addresses that are mapped to something)
ii. Mach flexibly controls what they are mapped to, for efficient read-only / COW / read-write sharing
iii. External pagers for backing pages
1. Memory object represents a data object obtained from an external pager
i. BIG Abstraction:
1. Kernel maintains in-memory cache of an object
2. Kernel invokes pager when it moves things in/out of cache
3. Pager invokes kernel when things are unavailable
ii. COMMENT: Think caching
1. Kernel is simple cache for data
2. Complexity handled by pager
iii. COMMENT: think layer of indirection
1. Kernel provides layer between program & pager
2. Kernel makes pager data available in address space
iv. Provides initial data for memory object
v. Controls access to memory object (e.g. when can you r/w)
vi. Provides backing for memory object (e.g. when it is evicted)
vii. Interface:
- vm_allocate_with_pager creates a memory region in a task at an address. Called by
  an application; the memory object argument specifies the pager
- kernel to dm interface: (async)
- init - initialize a memory object
- data_request - request data be filled in
- data_write - write back data
- data_unlock - unlock data - on a permission fault
- data_create -
- dm callbacks to kernel:
- data_provided: supply memory contents
- lock: restricts access to a page - e.g. read only
- flush: invalidates cache, may writeback, kick from cache
- clean_request: force data writeback, but can keep in cache
- cache: kernel should keep objects around if not in use
(e.g. program will be run again soon).
- data_unavailable: notify that no data is available for the page
Note: decoupling of data_request and data_provided; can return more
data than requested (e.g. prefetching)
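The request/provide loop above can be simulated in miniature. The message names come from the notes; the plumbing (classes, synchronous calls) is simplified, since in real Mach these are asynchronous messages. Note how the pager is free to supply more pages than were requested:

```python
class Kernel:
    """Toy kernel: an in-memory cache of a memory object's pages."""
    def __init__(self, pager):
        self.cache = {}             # resident pages of the object
        self.pager = pager

    def read(self, pageno):
        if pageno not in self.cache:
            self.pager.data_request(self, pageno)   # async in real Mach
        return self.cache[pageno]

    def data_provided(self, pageno, contents):
        # Pager callback: fill the kernel's cache with page contents.
        self.cache[pageno] = contents

class Pager:
    """Toy external pager backed by a page store (e.g. disk)."""
    def __init__(self, store):
        self.store = store

    def data_request(self, kernel, pageno):
        # Decoupled from data_provided: may supply extra pages (prefetch).
        for p in (pageno, pageno + 1):
            if p in self.store:
                kernel.data_provided(p, self.store[p])

k = Kernel(Pager({0: "page0", 1: "page1"}))
data = k.read(0)                    # miss -> data_request -> data_provided
```

After the miss on page 0, page 1 is already resident: the prefetching decision lives entirely in the pager, keeping the kernel a simple cache.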
i. Most of kernel memory is treated as a cache – transparent mixing of the file cache with the VM system allows a larger file cache (Unix used just 10% of memory for it at the time)
ii. Fast access to large shared objects - e.g. shared array access
i. File system with whole-file access
1. Model: file system server process + FS DM
2. File APIs RPC to FS server
a. open file: RPC to FS server to create memory region, returns a COW of the region
3. Memory access to file
a. page fault causes pager_data_request to FS DM
b. FS DM calls disk to get data, provides data to kernel (pager_data_provided)
c. Kernel creates a COW copy of the page for the client
4. When closed, can flush back to disk (not shown in example)
ii. Consistent shared memory
1. Idea: allow processes on different systems to share memory
2. Approach:
a. Have a server responsible for a page
b. Ask that server for the page
c. It provides it to as many readers as want it
d. When it gets a call to change protection (pager_data_unlock), it flushes the page from other systems, THEN updates local protection
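The invalidate-before-grant step in (d) is the heart of the consistency protocol, and can be sketched as follows (class and method names are illustrative except pager_data_unlock, which is from the notes):

```python
class DSMServer:
    """Toy pager responsible for one shared page across systems."""
    def __init__(self):
        self.readers = set()        # kernels currently caching the page

    def request_read(self, kernel):
        self.readers.add(kernel)
        kernel.cached = True

    def pager_data_unlock(self, writer):
        # Writer hit a permission fault wanting write access.
        for k in list(self.readers):
            if k is not writer:
                k.cached = False    # flush the page from other systems first
                self.readers.discard(k)
        writer.writable = True      # THEN upgrade the local protection

class Node:
    """Toy per-system kernel state for the shared page."""
    def __init__(self):
        self.cached = False
        self.writable = False

server, a, b = DSMServer(), Node(), Node()
server.request_read(a); server.request_read(b)
server.pager_data_unlock(a)         # a wants to write: b is flushed first
```

Because no other system can hold the page once the write is granted, every reader afterwards sees the writer's update, giving sequentially consistent shared memory.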
iii. Process migration
1. Can move processes to other systems for load balancing
2. Use consistent shared memory to fault pages over as accessed
iv. Transactions
1. Can allow DB to have control over paging of data
2. Can provide transactional memory; by logging writes before updating structures on disk
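One way to read item 2 is write-ahead logging done by the DB's own pager; a minimal sketch under that assumption (all names hypothetical, and a real implementation would make the log persistent and distinguish commit from abort):

```python
log = []                            # simulated write-ahead log
db_pages = {0: "old"}               # on-disk structures (simulated)

def tx_write(pageno, new):
    """Log the old and new values before updating the structure."""
    log.append((pageno, db_pages.get(pageno), new))
    db_pages[pageno] = new

def recover():
    """Crash before commit: undo by replaying the log backwards."""
    for pageno, old, _new in reversed(log):
        db_pages[pageno] = old

tx_write(0, "new")
recover()                           # transaction never committed: roll back
```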
v. Idea: easy to implement things like this
i. What if the pager doesn't respond?
1. A: have a default pager that flushes pages to disk
2. Kernel knows about the default pager, calls it when another pager fails. There are not multiple default pagers.
ii. What if a trusted process shares data from an untrusted pager? Or a pager deadlocks by backing its own data?
1. QUESTION: how relate to hydra? Negotiation?
2. Need to have timeouts on how long a thread will wait for memory to arrive
3. Need to have other thread handle message about timeouts
4. Raises complexity;
5. QUESTION: what alternatives? Could ask message system to ensure memory in cache?
i. Overhead of calling to usermode
ii. Trusted third parties
i. Allow memory to be used for communication, not just local storage
ii. Provide interface for external pagers to get involved in important decisions: where data comes from, when data is invalidated
iii. Efficient communication by sharing memory
iv. Treat kernel as a cache for data from other places; like the kernel-managers in Pilot
i. Mach for IPC, process & thread management, memory management, hardware abstraction