Scale and Performance in a Distributed File System.
John H. Howard, Michael L. Kazar, Sherri G. Menees, David A. Nichols, M. Satyanarayanan, Robert N. Sidebotham, and Michael J. West. Scale and Performance in a Distributed File System. ACM Trans. on Computer Systems 6(1), February 1988, pp. 51-81.
Reviews for this or NFS due Thursday, 3/29.
Comments
Summary:
This paper describes and analyzes a redesign of AFS, focusing on the scalability limits observed in the prototype. The new version improved scalability through a design that generates less client/server traffic and manages communication, processes, and data more robustly.
Problem Addressed:
The prototype worked well at laboratory/department scale (100 workstations with 6 servers). But AFS was intended for much larger-scale usage, and observation of the prototype revealed several issues, including cache/communication inefficiency, poor load balancing, and costly process management.
Contributions:
First of all, the way the authors improved AFS, based on observation of a real-world deployment and well-organized analysis, seemed powerful and efficient. Real-world experiments require a lot of work and equipment, but they became very useful given the well-focused analytic perspective.
Caching whole files, along with directories and symbolic links instead of only files, together with the callback mechanism improved file-access speed with less traffic (simpler communication and state management).
Name resolution was largely moved into the clients, which present fids to reduce server workload as much as possible and improve scalability. "Volumes" also made it easier to organize data distribution within and between servers, which should address the workload and load-balancing issues.
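This split can be sketched as follows; the data-structure names and fid layout below are illustrative assumptions, not AFS's actual code. The client walks cached directory contents to turn a pathname into a fid, so the server only ever sees fid-keyed requests and never runs a CPU-intensive namei traversal:

```python
# Illustrative sketch: client-side path resolution to a fid
# (volume id, vnode number). All names here are hypothetical,
# not AFS's real data structures.

# Cached directory contents, as the client would fetch them:
# maps (parent_fid, name component) -> child fid.
ROOT_FID = ("vol.root", 1)
dir_cache = {
    (ROOT_FID, "usr"): ("vol.usr", 7),
    (("vol.usr", 7), "alice"): ("vol.alice", 2),
    (("vol.alice", 2), "notes.txt"): ("vol.alice", 15),
}

def resolve(path):
    """Resolve an absolute path to a fid using only cached directories."""
    fid = ROOT_FID
    for component in path.strip("/").split("/"):
        fid = dir_cache[(fid, component)]   # each step is a local lookup
    return fid

# The server then serves requests keyed by fid -- a cheap table
# lookup instead of a per-request namei() path traversal.
```

The point of the indirection is that every pathname-walking cost lands on the client that issued the request, not on the shared server.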
Also, by using Lightweight Processes (threads) within one process, they reduced context switching and improved the efficiency and flexibility of allocating work.
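A minimal sketch of that idea, using Python's standard thread pool in place of the paper's LWP package (the function names are hypothetical):

```python
# Illustrative sketch of a pool of lightweight workers in one process,
# standing in for the paper's LWP package (names are hypothetical).
from concurrent.futures import ThreadPoolExecutor

def handle_request(client_id, op):
    # A worker picks up any client's request, then returns to the pool;
    # workers are not bound to a client for a whole session.
    return f"served {op} for client {client_id}"

with ThreadPoolExecutor(max_workers=5) as pool:   # fixed pool, one process
    futures = [pool.submit(handle_request, c, "fetch") for c in range(20)]
    results = [f.result() for f in futures]

# 20 client requests were served by only 5 threads sharing one address
# space, avoiding the per-client process context switches of the prototype.
```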
Possible Improvements:
Moving much of the workload and state management from server to client improved performance. On the other hand, it increased what is required of the client in terms of storage/memory size and, perhaps, reliability. Caching data on local disks also complicates managing copies in several locations. If many users share one workstation, handling the caches on local storage could be difficult; cache sharing across users might also be interesting.
Performance:
The scalability of AFS was improved by redesigning the architecture to shift work from servers to clients. The improvements were clearly demonstrated through well-chosen key metrics and a comparison with NFS. The goal is a campus-scale file system, and the requirements might shift slightly given the variety of applications, the distribution of load over time, and other future work.
Posted by: Hidetoshi Tokuda | March 29, 2007 09:12 AM
Summary
This paper describes the early AFS implementation, and then goes on to discuss the scalability of AFS and the ways in which the authors developed benchmarks to evaluate the performance of the system. The benchmarks are then used to test improvements that would allow AFS to avoid significant server loads as the number of users increases.
Problem
The overarching problem of the existing AFS implementation was that it didn't scale. The scalability problems appear to have mostly originated from unneeded client/server communication. The client/server communication mechanism was also problematic in that it relied upon RPC which apparently clogged the networking portion of the kernel, and caused unneeded context switching.
Contributions
Using FIDs to map to inodes and avoid server namei lookups was a great idea. The namei lookups were apparently a large cause of the heavy CPU load on servers.
Callbacks reduced the network load by eliminating many unneeded cache validation requests. The callback idea is rather simple: when a client caches something, the server promises to notify the client if another client modifies the cached object.
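A toy sketch of the mechanism (hypothetical names, not the AFS implementation): the server records which clients hold cached copies and breaks those callbacks only when the object actually changes:

```python
# Illustrative sketch of callback-based cache validation. The server
# remembers who holds a cached copy; clients never poll for validity.
class Server:
    def __init__(self):
        self.callbacks = {}   # fid -> set of clients holding a callback

    def fetch(self, client, fid):
        # Register a callback promise along with the file contents.
        self.callbacks.setdefault(fid, set()).add(client)

    def store(self, writer, fid):
        # Break callbacks: every other client's cached copy is now stale.
        for client in self.callbacks.pop(fid, set()) - {writer}:
            client.invalidate(fid)

class Client:
    def __init__(self):
        self.stale = set()
    def invalidate(self, fid):
        self.stale.add(fid)

server, a, b = Server(), Client(), Client()
server.fetch(a, "fid-1")
server.fetch(b, "fid-1")
server.store(a, "fid-1")   # a writes; only b is told its copy is stale
```

Note the trade-off the reviewers point out elsewhere: the server must now keep per-client state (the `callbacks` table) in exchange for eliminating validation traffic.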
Volumes were another great idea. Simply organize groups of files into volumes, and you can load balance by moving volumes around. Volumes also allow for an easy quota implementation.
Flaws
The performance of a DBMS would likely suffer under AFS, since whole-file transfer is a poor fit for random access within a large database file.
As pointed out in the paper, servers must now keep more state. For large numbers of clients, performance will potentially degrade.
There are no optimizations presented in the paper that would help mitigate network congestion.
Performance/Relevance
The improvements in the benchmarks speak for themselves. By examining the source of the original implementation's bottlenecks, the authors have been able to implement a system that scales decently well in real-world use.
Posted by: Jon Beavers | March 29, 2007 08:53 AM
Summary
The prototype Andrew File System was analyzed and improved by reworking what is cached, improving communication efficiency, and introducing volumes.
Problem
The prototype worked well, but it was CPU bound and did not scale well enough to meet the project goals.
Contributions
The Andrew File System is a distributed file system that relies on caching files locally on clients for good performance. Files are synchronized with the server at the granularity of open and close. This article lists a number of performance improvements added to the AFS prototype server.
The first major improvement introduced the idea of callbacks: the server notifies clients when files change, so clients do not need to validate their local copies every time they use them; they simply wait for a notification when something changes.
Communication efficiency was improved on the client side by using unique ids that remove the need for namei lookups. On the server side, lightweight processes replaced the one-process-per-client model, eliminating much of the context-switching overhead.
Finally, volumes allow easy mount-point-to-server resolution and help with creating unique ids for files. Volumes also enable quotas, read-only replication for faster access, and nearly on-line backups.
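The indirection behind this can be sketched as a simple table (the names are illustrative assumptions, not AFS's actual volume database): a file's id names its volume, a volume maps to a server, and migration just updates that mapping:

```python
# Illustrative sketch of volume-based placement. A volume is the unit
# of location, quota, and migration; clients find a file's server via
# its volume, so moving a volume rebalances load without renaming files.
volume_location = {"user.alice": "server1", "user.bob": "server1"}
quota_mb = {"user.alice": 500, "user.bob": 500}

def server_for(fid):
    volume, _vnode = fid
    return volume_location[volume]   # one indirection, easy to update

def migrate(volume, new_server):
    # Load balancing: repoint the volume's location entry; fids of the
    # form (volume, vnode) stay valid, so clients need no changes.
    volume_location[volume] = new_server

migrate("user.bob", "server2")       # move bob's volume off server1
```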
Possible Improvements
It seemed many of the fine details of the system were not covered. For example, callbacks must be a more complicated mechanism than they sound in the article; the consistency problems are complex. When a server has to "drop" a callback, there must be some mechanism to inform the clients they are being dropped. When a callback is lost for some reason (network failure, packet corruption), does the server continue to resend the callback until it is ACKed? What happens if the client modifies the file just before a callback is received? How are the two separate changes merged?
With today's large multimedia files it would be interesting to see how the system would respond. A large file might take some time to transfer, even over a fairly quick network. Is a user blocked until the entire file arrives? Can AFS handle bursty access to large files?
Performance
Scale-out is the performance factor targeted by the Andrew File System. Their stated goal is to scale server performance to 5000 workstations.
Posted by: Kevin Springborn | March 29, 2007 06:49 AM
Summary
In this paper, AFS is discussed from a scalability point of view. Initially, various studies based on the prototype implementation of AFS are presented, and based on these studies, improvements to AFS cache management, name resolution, communication and server process structure, and low-level storage representation are analyzed.
Problem Description
AFS is mainly motivated by the fact that the performance of contemporary file systems does not scale well with the increase in the number of clients. While designing AFS, the authors have mainly focused on improving the scalability aspect of the file system.
Contributions
The authors present a file system that scales well with an increase in clients/load. This feature is quite important in a distributed environment, where a file system is required to handle significantly larger load than in a single-user environment.
The authors followed a software-engineering-influenced approach to analysis, which is quite nice. They first implemented a prototype of the file system and conducted various studies on it. Based on these studies, they proposed improvements such as caching the contents of directories and symbolic links in addition to files, eliminating the implicit namei operation on the server to locate a file, and reducing communication between systems by using callbacks. These improvements were possible only because the authors analyzed and studied the prototype before implementing the entire system.
The call-back mechanism presented in the paper is also quite interesting. This feature reduces the communication required to validate a cached copy of a remote file: with call-backs, validation is not required each time the file is opened. Whenever a file with a call-back is modified or deleted, the systems using that file are notified. This feature improves the scalability of the system.
Flaws
Overall, the system proposed and the methodology used to validate and improve it are quite impressive. The only flaw that might cause some problem is the extensive use of the cache. Remote files are cached by clients in order to improve performance and scalability, so a major portion of the client's local storage is used to hold files. This can result in performance degradation of other applications due to shortage of space.
Performance
The main goal of this paper is to present a file system with improved scalability as the number of clients is increased. In order to achieve this goal, the authors initially studied the prototype of their file system and, based on those studies, made design decisions to improve the scalability of their system.
Posted by: Atif Hashmi | March 29, 2007 04:58 AM
Summary:
This paper details work done on the development of the Andrew File System (AFS), which supports a large number of clients transparently, without the need to modify applications. Problems with an initial prototype are explored and addressed in the actual file system implementation.
Problems Addressed:
Several issues were identified in the prototype system that limited its performance and usefulness. These include frequent server checks to maintain cache consistency across multiple clients, many calls to the processor-intensive routine namei, and overhead associated with context switching between the server processes servicing client requests. All these issues reduced scale-out and the number of clients that could be concurrently connected to an AFS server.
Contributions:
To reduce context switching, processes were replaced by threads all running in a single process. Instead of allocating a single process for each client, a thread services a request and then returns to a pool of free threads ready to service another request; threads are not tied to a client for the duration of a session. Extending the caching structure of the system proved to be the key to enabling more clients to use a single server. Workstations cache entire files, so network traffic is not spent servicing many small requests for parts of files. Directory contents and symbolic links are cached in addition to files, further reducing network traffic. Path-name translation is done by the clients, which present servers with fids that uniquely identify files, reducing the workload on the servers.
Flaw:
Since the system caches entire files, client disks need to be large enough to hold the largest file being accessed. It's not clear how a database application could run against a database file mounted over the network, since these files can be extremely large. It also seems that more network traffic than necessary could be induced if an application accesses small portions of many files.
Performance:
The primary component of performance addressed in this work is that of scale out. Most of the work centered around reducing the amount of work the server had to do to service any one client so that many clients could use the same server.
Posted by: Nuri Eady | March 28, 2007 10:40 PM
Paper Review: Scale and Performance in a Distributed File System [Howard, et al.]
Summary:
This paper presents the Andrew File System (AFS), a distributed file
system built with a set of servers and intended to scale to thousands of
clients. The technique used was to intercept open and close file
operations and use them to populate and flush a local cache of whole
files (on a local disk) so that remote file access had approximately the
performance of local files for reads and writes.
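That interception can be sketched roughly as follows (a toy model with hypothetical names, not the real AFS client): open() fetches the whole file into the local cache once, reads and writes then run at local speed, and close() stores the file back:

```python
# Illustrative sketch of whole-file caching at open/close granularity.
# The server is touched only on a cache miss at open() and on close()
# after a modification; reads and writes in between are purely local.
server_files = {"/doc": "v1"}        # stand-in for the remote server
local_cache = {}                     # stand-in for the client's disk cache

def afs_open(path):
    if path not in local_cache:      # miss: fetch the entire file once
        local_cache[path] = server_files[path]
    return path                      # subsequent reads/writes are local

def afs_close(path, new_contents=None):
    if new_contents is not None:     # store back on close, not per write
        local_cache[path] = new_contents
        server_files[path] = local_cache[path]

afs_open("/doc")
afs_close("/doc", "v2")              # server sees one store, not N writes
```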
Problem:
The problem was that there was not a robust distributed file system that
could scale to thousands of users/client machines, to perform the sort
of work that was typical of university shell users of a Unix system.
Contributions:
* The design of AFS was guided by performance measurements of a prototype.
Significant contributions were:
(1) avoiding unnecessary namei() (pathname to inode translations)
by introducing a file ID (Fid) on both the client and server side;
(2) avoiding many client operations to determine cache consistency
by introducing "callbacks" in which the server keeps client
state and notifies the client of events.
* The notion of "volumes" aids configuration flexibility by allowing
user home directories (and other dirs) to be migrated amongst servers
for maintenance and load balancing. It also helps enable many other
useful features for the target environment such as piecewise backups
and quotas.
* The ability to retain cached files even across reboots of clients seems particularly useful to reduce load when users have their own
home dirs and have an affinity for using particular workstations.
Flaws:
* There is a common use case with NFS when applications can do most of
the file manipulation on the NFS server, and clients do light or
read-only operations. It wasn't clear whether AFS also intercepts calls when files are modified on the server itself. This would be
necessary to cause notifications to be generated to the clients when
the files changed.
* The authors state that no callbacks are necessary for read-only
volumes. Wouldn't callbacks be required so that servers can actively
revoke read-only volumes from clients that have them mounted, to periodically update their content?
Performance Relevance:
The performance provided by AFS is particularly relevant for scaling-out
to large numbers of servers and clients. Although this depends on the size of files and the sort of access locality of the users, at its best, AFS provides performance on par with that of local files.
Posted by: Dave Plonka | March 28, 2007 03:50 PM