The working set model for program behavior
Peter J. Denning. The working set model for program behavior. Communications of the ACM 11(5), May 1968, pages 323-333.
Review for this or Mach memory paper due Tuesday, 2/20
Comments
Summary:
The working set is presented as a better basis for page replacement than LRU, FIFO, or random policies. The mathematical model behind the technique simplifies estimating a process's memory demand.
Problem:
Existing approaches to resource allocation in a multi-user environment are either oversimplified or too complicated. Better allocation algorithms could improve system performance.
Contribution:
The "working set" technique is a newly proposed one of the time. The mathematical model simplifies the problem of determining the accurate working set to the number of pages for the working set. This simplifies the complexity of the memory manager while using the working set algorithm which the authors claim to be better than LRU, FIFO and random page replacement policies. The model is fairly simple so it can be empirically verified whether most of the processes fit into it or not.
FLAWS:
No actual validation of the model is shown, so we cannot tell whether most processes follow it (intuitively it makes sense). No proof of superiority over the LRU, FIFO, or random page replacement policies is presented; a comparison against LRU would be particularly interesting.
RELEVANCE:
Working set memory management of buffers is currently used by the database community.
Posted by: Archit Gupta | February 20, 2007 12:47 PM
Summary
This paper presents the idea of a "working set", which is the set of pages that have been recently used by a process. The author views a "computation as being the fundamental activity in a computer system" and ties together the demand for processing time with the demand for memory, defining processor demand and memory demand as the resources that need to be balanced in a multiplexing system.
Problem
The problem that this paper is attempting to address is the "absence of a general treatment of resource allocation" because of "the lack of an adequate model for program behavior." Prior to this paper, memory management was performed independently of information about processor behavior, using schemes such as FIFO and LRU.
Contributions
The primary contributions of this paper are the definition of the working set model and of memory demand, the notion of combining memory demand with processor demand into system demand, and the use of this information to make better scheduling decisions that avoid thrashing.
Working set: This refers to the memory resources referenced recently by a process and can be used to estimate future behavior.
Memory demand: The portion of main memory that a process's recently accessed pages take up.
Processor demand: An estimate of a process's demand for processor time in the near future.
System demand: The pair of memory demand and processor demand. (These definitions are stated compactly in the sketch below.)
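In roughly the paper's notation (my restatement, so treat the exact symbols as a sketch rather than a quotation), the working set and the memory demand at process time t are

    W(t, \tau) = \{\, p \mid \text{page } p \text{ was referenced during } (t - \tau,\; t\,] \,\}, \qquad
    \text{memory demand} = |W(t, \tau)|

where \tau is the tunable window length.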
Flaws
It would have been really nice to see hard numbers as to how much this model improves actual performance. I believe that the concepts are solid but we can't know how this incarnation would perform until it is tested.
Relevance
I don't know how relevant the particular techniques that they use are today, but the perspective that they brought to management of memory resources is certainly relevant.
Posted by: Aaron Bryden | February 20, 2007 12:40 PM
SUMMARY
In "The Working Set Model for Program Behavior" Peter Denning presents a program
behavior model, based on the set of memory pages that a program uses. This model
is then used to describe a new resource allocation mechanism.
PROBLEM
Until this work, issues of process and memory management were developed
independently because there was no unified model for them. Denning argues that
the two areas are in fact intimately related and should be considered together,
since their efficiencies are interdependent.
CONTRIBUTIONS
A model of program behavior, based on which information (pages) a program is using,
that allows a general treatment of resource allocation.
Observation that a memory manager's goal should be minimization of "page traffic".
Observation that pre-fetching is a poor strategy with slow secondary memory,
because there is no reliable way of predicting system behavior as a whole, and
hence a high risk of performing unnecessary fetches. From that, the problem
of memory management becomes deciding which page to evict.
A way of tuning the model for the underlying system (setting tau based on "memory
traverse" time)
Hardware and software-based memory management mechanisms
Recasting resource allocation problem into balancing processor and memory demands
against available resources.
FLAWS
This paper really would have benefited from an evaluation. The flaws below
just elaborate on this.
Assumptions made in derivation of the model need to be tested to justify
the trade-off between model elegance and reality.
The final model didn't seem transparent enough to be intuitive (at least to me)
More examples of application of the model would be nice.
RELEVANCE
The paper presents a mathematical description of a problem, which is always
interesting. The general issue of resource management the paper discusses
will remain relevant for the foreseeable future.
Posted by: Vladimir Brik | February 20, 2007 10:26 AM
Summary
The working set, estimated from a process's run-time characteristics, is the amount of memory the process needs for good execution times. Both working set size and processor demand should be considered in scheduling decisions.
Problem
Scheduling did not balance the processor and memory demands of threads with the resources available.
Contributions
The paper starts out with a good summary of why predetermining memory needs is a difficult task. The key realization of the paper is that only a small set of pages is required for quick execution of a thread; as the number of assigned pages increases beyond the working set, the returns diminish sharply. A tunable parameter tau is the time a page is kept in core memory without being accessed: any page unreferenced for tau is available to be replaced (see the sketch below). An impressive insight of the paper is that a blocked process's working set will likely be very different once the process resumes. The author recommends that processor demand and working set size be balanced across scheduling, and that a process should not be run if its working set's worth of memory is not available.
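To make the tau rule concrete, here is a minimal simulation sketch of the policy as I understand it; the function name, the fault counting, and the example reference string are my own illustration, not code from the paper:

    def working_set_sim(refs, tau):
        """Simulate working-set memory management over a page reference
        string: refs[t] is the page touched at (process) time t. A page
        whose last reference is older than tau leaves the working set
        and becomes evictable; touching a page outside the set faults."""
        last_ref = {}                      # page -> time of last reference
        faults = 0
        max_ws = 0
        for t, page in enumerate(refs):
            # Expire pages not referenced within the window (t - tau, t].
            for p in [q for q, lr in last_ref.items() if lr <= t - tau]:
                del last_ref[p]
            if page not in last_ref:
                faults += 1                # outside the working set: a fault
            last_ref[page] = t
            max_ws = max(max_ws, len(last_ref))
        return faults, max_ws

    # A loop over 3 pages fits in a window of 4, so only cold faults occur.
    print(working_set_sim([1, 2, 3, 1, 2, 3, 1, 2, 3], tau=4))   # (3, 3)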
Flaws
The assumption of independently, identically distributed (iid) page references could have benefited from more justification. At a very abstract level I can see how memory references might be viewed as iid, but at a low level a program seems to be anything but: each action taken is a direct consequence of the previous steps. Memory requirements could change drastically during different parts of the execution, so it is likely that many programs display non-iid behavior.
I felt there was not enough on balancing demands. What is to be done when a WS grows enough that there is no longer enough space in core memory? What value is used for the WS of a process that hasn’t begun execution yet, or is expected to change its WS?
Relevance
In today’s world of increasingly fast processors and multiprocessors, memory is becoming an ever greater bottleneck. L1 cache is still very limited, and good algorithms are needed to ensure that memory is used efficiently.
Posted by: Kevin Springborn | February 20, 2007 06:18 AM
Summary:
In this paper, the author proposes a “working set model” that enables the operating system to observe and balance both memory and processor allocation without getting advice from programs or compilers. The working set is the minimum collection of pages that a program must have in memory over a given time interval to execute efficiently.
Problem:
Since computers were being used for multiple tasks and programs at the same time, an efficient resource allocation algorithm was important.
Implementing such an algorithm requires a model that captures the state of both processor and memory demand.
To be a true general purpose system, an operating system should not rely on advice from the program or compiler, and should be able to make resource allocation decisions by itself.
Contributions:
The working set model enables efficient (low page traffic) dynamic resource allocation based on “system demand”, which is tracked per process, in contrast to previous methods that managed each page separately and statically.
Although resource allocation is managed as system demand, the paper describes a vision covering both memory allocation and processor scheduling. It suggests the priority “balance memory first, then balance the processor”, since the main performance bottleneck can be “thrashing”, which happens when there is not enough memory for even the minimum working sets of the running processes. (A sketch of the resulting admission check follows.)
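A minimal sketch of that admission check, assuming we already have per-process working set size estimates (the function name and numbers are hypothetical, not the paper's interface):

    def can_activate(ws_estimate, resident_ws_sizes, mem_pages):
        """Admit a process into the running set only if its estimated
        working set fits alongside those already resident; activating
        it otherwise would overcommit memory and invite thrashing."""
        return sum(resident_ws_sizes) + ws_estimate <= mem_pages

    # 1024-page memory with residents using 400 + 500 pages:
    print(can_activate(100, [400, 500], 1024))   # True  (1000 <= 1024)
    print(can_activate(200, [400, 500], 1024))   # False (would overcommit)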
Flaws:
Since the working set model requires keeping per-process timing information, adding it to a real system or algorithm is expensive and very difficult.
Tracking the state of a process and choosing a correct timer value is very important, but the process’s behavior changes over time, making an efficient balance really difficult to find.
A hardware implementation of the timer is introduced, but since it requires additional, unusual hardware on the system, it was unrealistic.
Relevance:
Even though the working set algorithm is difficult to implement perfectly in a real system, the model’s perspective and purpose were stimulating ideas at the time. These days we have algorithms such as “aging” that provide acceptable performance at very low cost, but the working set remains an important perspective for page replacement algorithms.
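For reference, a minimal sketch of the aging approximation mentioned above (8-bit counters; the dictionary-based bookkeeping is my simplification of what real kernels keep per page frame):

    def age_tick(counters, ref_bits):
        """One clock tick of aging: shift each page's counter right and
        bring the reference bit in at the high end, then clear the bit.
        Pages with the smallest counters have gone longest unreferenced,
        approximating 'not in the working set'."""
        for page in counters:
            counters[page] = (counters[page] >> 1) | (ref_bits[page] << 7)
            ref_bits[page] = 0

    counters = {"A": 0, "B": 0}
    ref_bits = {"A": 1, "B": 0}    # A was referenced this tick, B was not
    age_tick(counters, ref_bits)
    print(counters)                # {'A': 128, 'B': 0} -> evict B first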
Posted by: Hidetoshi Tokuda | February 20, 2007 12:04 AM
Summary:
Denning describes what he calls the working set model, a unified view of memory and CPU allocation with the goal of minimizing paging traffic to and from secondary storage.
Problem:
The problem was how to effectively allocate resources to processes while keeping the allocation algorithm itself from growing unwieldy.
Contributions:
* A detailed, convincing model hypothesizing how programs behave with regard to memory demands, including some clever observations, such as that a program blocking on I/O is a possible tell that the working set is about to change
* A description of how you could use past program behavior to predict future behavior in an effort to effectively allocate resources, and how different choices of parameters such as time intervals would affect the outcome
* A somewhat high-level description of a "scheduler" that would make decisions about memory and CPU together that runs reasonably fast
* The idea that a reasonable measure of the effectiveness of these algorithms is the paging traffic between core and secondary storage (illustrated in the sketch below)
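As a toy illustration of using paging traffic as the yardstick, here is a fault counter for a fixed allocation managed by LRU; the reference string and frame counts are made up, and this compares allocations rather than reproducing Denning's scheduler:

    from collections import OrderedDict

    def lru_faults(refs, frames):
        """Count page faults for a fixed allocation of `frames` page
        frames managed by LRU; each fault is one page moved between
        core and secondary storage, i.e. one unit of paging traffic."""
        mem = OrderedDict()
        faults = 0
        for p in refs:
            if p in mem:
                mem.move_to_end(p)           # refresh recency
            else:
                faults += 1
                if len(mem) == frames:
                    mem.popitem(last=False)  # evict least recently used
                mem[p] = True
        return faults

    refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5] * 10
    for frames in (3, 4, 5):
        print(frames, lru_faults(refs, frames))
    # Traffic drops sharply once the allocation covers the working set.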
Flaws:
We're back to early papers without much evaluation. Denning develops all these equations that are *supposed to* model how the system would behave, and it all sounds reasonable, but there's no validation. This paper seems to me to want an evaluation even more than the other papers we've read.
Also, his ideas for implementing the paging mechanisms in hardware seem like they could lead to less flexible designs if someone came up with a better scheduling algorithm. (At best, if you ignore the mechanism, the hardware is unnecessarily complicated; at worst, it would do things you don't want or fail to do what you do want.)
Relevance:
My understanding is that current OSes still try to detect programs' working sets and optimize their memory allocation accordingly. This shouldn't be too surprising, since there doesn't seem to be any compelling reason why the typical program's memory behavior would differ from what Denning describes.
Posted by: Evan Driscoll | February 19, 2007 11:47 PM
Paper Review: The Working Set Model for Program Behavior [Denning]
Summary:
This paper presents a model for program behavior based on the notion of
the "working set", i.e. the set of pages recently used by a process.
The author then defines the memory demand of a process and, combining it
with the process's processor demand, proposes how to achieve a balanced
system by integrating scheduling and dynamic management of virtual memory.
Problem:
The problem addressed is to formally define a memory management
strategy (basically, deciding which pages should occupy main memory)
that outperforms both the overly simple and the overly complicated
prior strategies.
Contributions:
A virtual memory management strategy is presented that is feasible to
implement within the operating system, such that it discovers and acts
upon process behavior, without requiring process behavior cues in
advance from the compiler, user, or system administrator.
The working set strategy seems an intelligent advance over random,
FIFO, LRU, and exotic code-aware methods. The work also proposes a
simple, efficient implementation that keeps the record keeping involved
from being too much work: just the working set size, rather than further
knowledge of the page set itself, was found sufficient (see the sampling
sketch below).
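A minimal sketch of that size-only bookkeeping, assuming periodic sampling of per-page use bits along the lines of the paper's software scheme (the function and data shapes here are illustrative, not the paper's interface):

    def ws_size_from_samples(samples, k):
        """Estimate the working-set size from periodic use-bit samples:
        samples[i] is the set of pages whose hardware use bit was found
        set (and then cleared) during sampling interval i, and k is the
        number of recent intervals that together cover the window tau.
        A page counts if it was used in any of the last k intervals."""
        recent = set().union(*samples[-k:]) if samples else set()
        return len(recent)

    # Three intervals sampled; with k=2 only the last two count.
    samples = [{1, 2, 3}, {2, 3}, {3, 4}]
    print(ws_size_from_samples(samples, k=2))   # 3 pages: {2, 3, 4}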
Working set memory management avoids thrashing by trying to get, and
retain, the correct set of pages in memory, so that a process can run
efficiently without blocking unnecessarily on page faults, and so that
the system overall does not saturate the communication channel between
secondary page/swap storage and main memory.
Flaws:
I found it interesting that the memory management "checker" process was
scheduled amongst normal processes in the run queue. I don't see why it
should run less frequently as the run queue lengthens. It seems that
the frequency (to consider paging operations) should be a function of
memory demand, which is not necessarily the same as run queue length.
Relevance:
This work is very relevant. The notion of working set size and/or
resident set size is a useful metric for system performance evaluation
and sizing, for instance with the Unix top command. As the author
proposes, the working set technique works well when physical memory
is sized to accommodate the process load's total working sets.
Posted by: Dave Plonka | February 19, 2007 11:39 PM