Summary of complexity approaches:

Organization:
- layering: simplifies dependencies, but inflexible
- hierarchy: MCS
- microkernel + protected subsystems: MCS, Hydra: flexible but slow
- process oriented: Unix, MCS
- procedure oriented: THE, Hydra, Pilot

Tools:
- layers of indirection (e.g. memory in THE, capabilities, directories
in Unix, redirection in Unix)
- Refactor abstractions: MCS processes, protection domains, Hydra
etc., Pilot FS
- Policy / Mechanism separation
- Kernel / Manager patterns
- common abstractions: objects (Hydra), capabilities, files (Unix)
- fine grained protection
- rights amplification
- hardware protection
- type safety


Problem at hand: how do we adapt OS behavior to application demand?
Approach 1: make OS smarter - detect some pattern (common case) it can
optimize for. Add general purpose code, e.g. new system calls
Approach 2: allow OS to be extended: apps can direct OS what to do

Requirements:
- untrusted code
- retains trustworthy system

High level approach:
- allow code in kernel
- break kernel activities into fine-grained parts that may be of
interest
- allow apps to register handlers for those pieces to do something

Problems to solve:
- how do you make sure the code doesn't crash the system?
- how do you make sure the code only applies to the correct user /
process?
- How do you make sure the code doesn't leak private information?
- What are the right events to expose?
- What happens when multiple things want to handle an event?
- How do you make this fast?



Spin ideas:
1. Implement kernel + extensions in Modula-3 (like Java)
- certifying compiler digitally signs binaries to make sure they
haven't been tampered with.
- type safety prevents crashes, security breaches

2. Capabilities
- treat pointers to kernel objects as capabilities. Can't be forged
(but don't have permission info)

3. Name-based protection domains
- protection domains != a set of capabilities, but a set of types
(e.g. code that can be run)
- can register auth. procedure to determine whether linking is allowed
- NOTE: does not use capability bits
- NOTE: can't extend something if you can't name it or link to it.
EXAMPLE: limit who can do packet filtering by saying who can access
network name space. Overall, not well motivated.
EXAMPLE: mechanism limits you to linking to specific interfaces, not
overriding kernel mechanisms that are global. E.g. kernel scheduler

OVERALL: do protection early. In the compiler, at link time, so that
at runtime nothing need be done. For example, check authorization at
link time.
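
A minimal sketch of the link-time check, in Python rather than
Modula-3. The Domain class, the "Network" domain, the InstallFilter
symbol and the extension names are all hypothetical, made up for
illustration; this is not SPIN's actual interface. The point it shows:
the authorization procedure runs once, at link(), and the procedure
reference that comes back is used at runtime with no further check.

```python
class Domain:
    """Hypothetical protection domain: a set of named, exported procedures."""

    def __init__(self, name, exports, authorize=None):
        self.name = name
        self.exports = dict(exports)
        # Authorization procedure registered by the domain's owner.
        # Default: anyone who can name the domain may link to it.
        self.authorize = authorize or (lambda requester: True)

    def link(self, requester, symbol):
        # You cannot extend what you cannot name.
        if symbol not in self.exports:
            raise LookupError(f"{self.name}.{symbol} is not exported")
        # The only authorization check happens here, at link time.
        if not self.authorize(requester):
            raise PermissionError(f"{requester} may not link to {self.name}")
        # The returned reference acts like an unforgeable pointer.
        return self.exports[symbol]


def install_filter(rule):
    return f"filter installed: {rule}"


# Only the (hypothetical) firewall extension may see the network name space.
network = Domain(
    "Network",
    {"InstallFilter": install_filter},
    authorize=lambda requester: requester in {"firewall_ext"},
)

install = network.link("firewall_ext", "InstallFilter")
result = install("drop port 23")  # no check on the call path

try:
    network.link("game_ext", "InstallFilter")
    game_linked = True
except PermissionError:
    game_linked = False
```
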

4. Extension model
- Goal: complete flexibility between observation, hinting, replacing
service
- Model: events + handlers
- event = procedure exported from an interface
- handlers = procedures with same type
- e.g. console.print is an event when somebody calls it
- Extensions supply additional handlers for an event
- can be multiple
- can replace original

EXAMPLE:
- page fault is an event
- a handler can decide how to handle it
EXAMPLE: page to network instead of disk
- replace handler for memory swapping - what to do when a page is
reclaimed or accessed
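
The events + handlers model above can be sketched roughly as follows.
This is an illustrative Python model, not SPIN's dispatcher: the Event
class, the "VM.PageFault" name and the two paging procedures are
assumptions made up for the example. It shows the three uses from the
goal: observing an event, adding a handler, and replacing the original.

```python
class Event:
    """Hypothetical event: a named procedure whose handlers share its signature."""

    def __init__(self, name, default):
        self.name = name
        self.handlers = [default]  # the original implementation

    def add_handler(self, handler):
        # An extension supplies an additional handler (observe / hint).
        self.handlers.append(handler)

    def replace_handlers(self, handler):
        # An extension replaces the original implementation.
        self.handlers = [handler]

    def raise_event(self, *args):
        # Calling the procedure raises the event: every handler runs.
        return [handler(*args) for handler in self.handlers]


def page_from_disk(addr):
    return f"disk:{addr:#x}"


def page_from_network(addr):
    return f"net:{addr:#x}"


page_fault = Event("VM.PageFault", page_from_disk)

# Observation: log faults without changing behavior.
log = []
page_fault.add_handler(lambda addr: log.append(addr))
first = page_fault.raise_event(0x1000)

# Replacement: page to the network instead of disk.
page_fault.replace_handlers(page_from_network)
second = page_fault.raise_event(0x2000)
```

Here every handler's result is returned as a list; how multiple
results get merged is a policy choice the sketch leaves open.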

5. Authorization model
- event provider (module that defines the type) can provide
authorization procedure to determine whether handler is allowed
- provider supplies "guard" - a predicate - that tests whether handler
is to run
- example: pid = 4
- socket = 23
- handlers can be synchronous or async, in an order, bounded time
- can execute multiple + merge results

EXAMPLE:
- page fault handler may limit what faults you can handle - guard
ensures only those from your process
- packet filter ensures you only see packets for your process
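
A rough sketch of authorization + guards, assuming a made-up
GuardedEvent class and a fault record with "pid" and "addr" fields
(neither comes from SPIN). The provider's authorization procedure
decides at install time whether a handler is allowed and, if so,
attaches a guard; the guard is a predicate evaluated on every raise to
decide whether that handler runs.

```python
class GuardedEvent:
    """Hypothetical event whose provider imposes a guard on each handler."""

    def __init__(self, authorize):
        self.authorize = authorize  # provider's authorization procedure
        self.bindings = []          # list of (guard, handler) pairs

    def install(self, owner_pid, handler):
        # Authorization runs once, when the handler is installed.
        guard = self.authorize(owner_pid)
        if guard is None:
            raise PermissionError(f"pid {owner_pid} may not handle this event")
        self.bindings.append((guard, handler))

    def raise_event(self, fault):
        # The guard runs per event and filters which handlers fire.
        return [handler(fault) for guard, handler in self.bindings if guard(fault)]


def make_authorizer(allowed_pids):
    def authorize(owner_pid):
        if owner_pid not in allowed_pids:
            return None
        # Guard: only faults raised by the installing process (e.g. pid = 4).
        return lambda fault: fault["pid"] == owner_pid
    return authorize


page_fault = GuardedEvent(make_authorizer({4}))
page_fault.install(4, lambda fault: f"pid 4 handled {fault['addr']:#x}")

mine = page_fault.raise_event({"pid": 4, "addr": 0x1000})
other = page_fault.raise_event({"pid": 7, "addr": 0x2000})

try:
    page_fault.install(9, lambda fault: "never runs")
    denied = False
except PermissionError:
    denied = True
```
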

SECURITY:
- bounded time limits resources
- guards can make sure that only extensions for a process run, in that
process's scheduling domain, so it is charged for extensions -- doesn't
impact whole system

SO: WHO CARES? WHAT CAN YOU DO?

Key result of extensibility: can't extend arbitrary code. What if
scheduling routine has 1 event: schedule(), no get_next_process()?

Core services: what events are defined that you can extend?
Note: this is a hard problem. Need right granularity to be relevant:
not too fine-grained, not too coarse-grained.

General model
- define an interface for an object
- contains two parts:
- "Raised" interface used to request service of the
object. E.g. block a thread, unblock a thread, allocate a page
- "Handled" interface used to implement pieces: reclaimPage,
checkpoint, resume -- these are handled by extensions
- some are both: pageNotPresent exceptions


Extensible Memory:
------------------
- 3 components: physical storage, naming, translation

Physical address service:
- allocate() to get memory, returns physical unmapped page
- can call Reclaim event on the page to get it back, allows return of
an alternate page

VA service:
- allocate a range of virtual addresses in an address space

Translation service:
- install a mapping between phys and virt
- events: badAddress on unused VA, notPresent on used but swapped-out VA

EXAMPLE Uses:
- COW: install read-only translation, on fault request new page +
translation
- Fork: similar
- memmap files: request page, read data from file on notPresent
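
The COW use can be sketched on top of a toy translation service. All
names here (TranslationService, on_protection_fault, cow_handler) are
hypothetical, and the write-to-read-only fault is modeled as its own
event, which is an assumption: the notes only name badAddress and
notPresent. The handler does what the COW bullet says: on fault,
request a new page and install a new translation.

```python
class TranslationService:
    """Toy model: physical frames plus (space, vpage) -> (frame, writable)."""

    def __init__(self):
        self.frames = {}      # frame number -> page contents
        self.mappings = {}    # (space, vpage) -> (frame, writable)
        self.next_frame = 0
        self.on_protection_fault = None  # handler supplied by an extension

    def allocate(self, data=b""):
        # Physical address service: hand out an unmapped page.
        frame = self.next_frame
        self.next_frame += 1
        self.frames[frame] = bytearray(data)
        return frame

    def map(self, space, vpage, frame, writable):
        # Translation service: install a phys <-> virt mapping.
        self.mappings[(space, vpage)] = (frame, writable)

    def read(self, space, vpage):
        frame, _ = self.mappings[(space, vpage)]
        return bytes(self.frames[frame])

    def write(self, space, vpage, data):
        frame, writable = self.mappings[(space, vpage)]
        if not writable:
            # Raise the fault event, then retry with the new translation.
            self.on_protection_fault(self, space, vpage)
            frame, writable = self.mappings[(space, vpage)]
        self.frames[frame][:] = data


def cow_handler(svc, space, vpage):
    # Copy the shared page into a fresh frame and remap it writable.
    old_frame, _ = svc.mappings[(space, vpage)]
    new_frame = svc.allocate(svc.frames[old_frame])
    svc.map(space, vpage, new_frame, writable=True)


svc = TranslationService()
svc.on_protection_fault = cow_handler

shared = svc.allocate(b"hello")
svc.map("parent", 0, shared, writable=False)  # read-only in both spaces
svc.map("child", 0, shared, writable=False)

svc.write("child", 0, b"HELLO")  # faults, copies, then writes
```

After the write, the child has its own frame and the parent still
reads the original contents.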

Thread Mgmt
-----------
2 parts: system scheduler + thread package
Scheduler invokes thread package, which provides implementation of
"strand" interface
- strand abstraction, can block + unblock (LIKE HYDRA!!!)
- block/unblock go to scheduler
- scheduler calls checkpoint / resume in thread package

- NOTE: doesn't save state by default. WHY?
- what state is saved on switch?

Thread package provides checkpoint + resume
- decides what state to save/ restore, how to store thread ID
- can do thread switch outside kernel

Distinction between checkpoint / resume + block/unblock:
- block/unblock affect thread state - is it on runqueue or blocked
queue
- checkpoint/resume affect whether it is actually running: resume
makes a thread run, checkpoint makes it stop running

HOW TO USE:
- when global scheduler calls resume on app thread, it can choose how
to handle it; which thread state to resume.
- when checkpoint called, app. scheduler must give up CPU to global
sched or OS. Can update local structures to indicate this.
EXAMPLE: user-level threading
- on checkpoint, store thread state to thread
- on resume, jump right to scheduler to allow it to make a choice of
what thread to run.
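
A minimal sketch of the user-level threading example. The
UserThreadPackage class and its fields are assumptions for
illustration; only the checkpoint / resume split comes from the notes.
The loop at the bottom stands in for the global scheduler: it never
sees individual user threads, it only calls resume and checkpoint on
the package.

```python
from collections import deque


class UserThreadPackage:
    """Hypothetical thread package implementing checkpoint + resume."""

    def __init__(self, threads):
        self.ready = deque(threads)  # app-level run queue
        self.current = None
        self.saved = {}              # thread -> saved state

    def resume(self):
        # Global scheduler gave us the CPU: the app's own scheduler
        # chooses which user thread actually runs.
        self.current = self.ready.popleft()
        return self.current

    def checkpoint(self, state):
        # Must give up the CPU: the package decides what state to
        # save and updates its local structures.
        self.saved[self.current] = state
        self.ready.append(self.current)
        self.current = None


pkg = UserThreadPackage(["t1", "t2"])

# Stand-in for the global scheduler granting three time slices.
ran = []
for tick in range(3):
    ran.append(pkg.resume())
    pkg.checkpoint({"tick": tick})
```

The package picks round-robin here, but any policy could sit inside
resume without the global scheduler knowing.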

TRUST ISSUES
-------------
- services that mediate access to an external resource (e.g. HW) must
be trusted to do it correctly - outside language model
- extensions only impact applications they are associated with;
failures only impact that application (in absence of global
synchronization)


EVALUATION ISSUES:
-------------
- What do you evaluate?
- What do you compare it against?
- Micro. vs. Macro benchmarks?
- interpreting results: what do they mean?

Comments from reviews:
----------------------
- semantics for event handlers? How do you make them do the right
thing?
- DoS attacks with lots of handlers?
- can you trust the signature of a compiler? Can it be forged?
- unclear if needed
- will language security be bypassed? needs root access...
- what is the security risk of a malicious extension?
- performance bottleneck - can you get a backlog? scheduler can make
sure it applies only to a single process