
Dryad: Distributed Data-Parallel Programs from Sequential Building Blocks

Dryad: Distributed Data-Parallel Programs from Sequential Building Blocks. 
Michael Isard, Mihai Budiu, Yuan Yu, Andrew Birrell, and Dennis Fetterly 
European Conference on Computer Systems (EuroSys), Lisbon, Portugal, 
March 21-23, 2007

Reviews due Tuesday, April 12th.

Comments

Summary: Dryad is a runtime for data-intensive parallel programs that models computations as graphs where vertices are sequential programs and edges represent data flows between programs. Developers use a high-level language to construct graphs representing their computations, decreasing the need to directly use low-level synchronization primitives.

Problem: The rise in multi-core systems has increased the need for software to compute in parallel on separate processors where possible, but developing such software is often difficult when working directly with low-level parallelization primitives such as threads and methods to communicate between them. In response to the demand for easier programming models, systems such as MapReduce were developed to handle certain common paradigms. But systems such as MapReduce and parallel database servers are still very domain-specific and leave limited room for customization. There is a conceptual level of parallel communication that is more general than these specific systems, but above the level of communication primitives that operating systems provide. Without an implementation at this higher level of abstraction, each developer of a framework like MapReduce must re-implement the same error-prone communication logic.

Contributions: Dryad provides a general platform that developers of parallel computation frameworks such as MapReduce or parallel databases could use to implement communication between processes without worrying about low-level synchronization primitives. Instead of expressing a task in a single program that uses such low-level primitives, developers write several sequential programs and provide them to Dryad. Dryad then allows developers to relate these programs to each other using a high-level language. This language also allows jobs to scale from single multi-core machines to large clusters transparently to the developer. A key aspect of this approach is that developers can easily experiment with different configurations of this communication graph to find one that most efficiently executes their programs. Dryad has been demonstrated to handle both computations in the parallel database and map-reduce paradigms, and to scale up to large clusters.
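The core idea described above, wiring independently written sequential programs into a graph whose edges carry data, can be illustrated with a minimal Python sketch. This is a hypothetical toy, not Dryad's actual API; all names here are invented for illustration:

```python
# Toy sketch (not the real Dryad API): sequential programs become
# vertices, and edges carry data between them. The runtime, not the
# programmer, decides the order of execution.

class Vertex:
    def __init__(self, name, fn):
        self.name, self.fn = name, fn
        self.inputs = []          # upstream vertices feeding this one

def connect(src, dst):
    """Add a data-flow edge from src to dst."""
    dst.inputs.append(src)

def run(outputs):
    """Execute vertices in dependency order, memoizing results."""
    results = {}
    def evaluate(v):
        if v not in results:
            args = [evaluate(u) for u in v.inputs]
            results[v] = v.fn(*args)
        return results[v]
    return [evaluate(v) for v in outputs]

# Two sequential "programs" composed without threads or locks.
source = Vertex("source", lambda: [3, 1, 2])
sort_v = Vertex("sort", sorted)
connect(source, sort_v)
print(run([sort_v])[0])   # [1, 2, 3]
```

The point of the sketch is that neither `source` nor `sort` contains any synchronization code; all coordination lives in the graph.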

Flaws: The generalizability of Dryad is not clear from the examples provided, as both the SQL example and the data processing example could have been handled under existing paradigms. If Dryad is supposed to handle a wider variety of paradigms than existing systems, it would be desirable to see a more concrete example of Dryad's flexibility. For example, the authors could have presented a Dryad graph that could not be expressed in any other framework including MapReduce or SQL Server. Furthermore, for the data processing example that was amenable to map-reduce, no performance comparison was provided against an existing map-reduce implementation.

Applicability: Dryad is not directly applicable to individual developers writing specific parallel programs, but rather to creators of still higher-level frameworks that would actually be used by application developers. Developers writing one-off applications would probably not use Dryad directly because of the effort of configuring the communication graph for a job; fine-tuning the communication graph can involve a high degree of manual experimentation. One aspect of Dryad that is not clear is whether higher-level frameworks built on Dryad would also require manual fine-tuning of the communication graph. Frameworks would have to make a trade-off between exposing too much configuration complexity to the user and limiting possible performance.

Summary

This paper presents Dryad from Microsoft, a general purpose distributed system
for coarse-grain data parallel applications, based on simple graph
abstractions. A dryad application has a set of vertices for computation, and a
set of channels connecting vertices for communication. High parallelism comes from
the independence of vertices.

Problem

It is difficult to develop and deploy distributed and parallel programs, because
application developers should write verbose and error-prone code to deal with
issues of resource scheduling, hardware/software partial failure, crash
recovery, and so forth.

A uniform, simple, and expressive abstraction is desirable for this application domain, as well as an efficient runtime system.

Contribution

No single piece of this work is substantially novel. However, as with other systems (e.g., Dynamo), the combination of existing techniques does build something interesting.

- Abstraction: Dryad adopts a clean abstraction to expose data parallelism to application
developers. The idea of forcing application developers to consider data
parallelism is borrowed from GPU shader languages, MapReduce, and parallel
databases (parallel execution of SQL); however, Dryad stands at a different
level of abstraction from these three. Dryad has the
lowest level of abstraction, and could be used to implement GPU shader
languages (e.g., process each pixel in a vertex, and feed the output of one
vertex to another via a channel for further processing). MapReduce can be
implemented using GPU shader languages (search for "gpu mapreduce" for more info),
while SQL can be implemented using MapReduce (e.g., table scan in map, and
aggregation in reduce). In short, from the lowest level of abstraction to the
highest: Dryad, GPU shader languages, MapReduce, SQL.

- Graph description language: This looks like the Data Definition Language in SQL,
used for creating and modifying table schemas before doing the "real computation"
of query processing. They propose several graph operators, of which the merge
operator appears to already exist in parallel databases.
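The graph operators in question can be sketched in a few lines of Python. The operator names follow the paper's C++ GraphBuilder conventions (`^` clones a graph, `>=` makes pointwise connections, `>>` makes a complete bipartite connection), but this is a simplified illustration, not the real implementation:

```python
# Toy sketch of Dryad-style graph composition operators.
class Graph:
    def __init__(self, vertices, inputs, outputs, edges=()):
        self.vertices = list(vertices)
        self.inputs, self.outputs = list(inputs), list(outputs)
        self.edges = list(edges)

def clone(g, n):
    """A ^ n: n independent renamed copies of g, side by side."""
    vs, ins, outs, es = [], [], [], []
    for i in range(n):
        ren = {v: f"{v}.{i}" for v in g.vertices}
        vs += ren.values()
        ins += [ren[v] for v in g.inputs]
        outs += [ren[v] for v in g.outputs]
        es += [(ren[a], ren[b]) for a, b in g.edges]
    return Graph(vs, ins, outs, es)

def pointwise(a, b):
    """A >= B: connect a's i-th output to b's (i mod k)-th input."""
    k = len(b.inputs)
    es = a.edges + b.edges + [
        (out, b.inputs[i % k]) for i, out in enumerate(a.outputs)]
    return Graph(a.vertices + b.vertices, a.inputs, b.outputs, es)

def bipartite(a, b):
    """A >> B: connect every output of a to every input of b."""
    es = a.edges + b.edges + [(o, i) for o in a.outputs for i in b.inputs]
    return Graph(a.vertices + b.vertices, a.inputs, b.outputs, es)

m = clone(Graph(["M"], ["M"], ["M"]), 3)   # three mappers
r = clone(Graph(["R"], ["R"], ["R"]), 2)   # two reducers
job = bipartite(m, r)                      # MapReduce-style shuffle
print(len(job.edges))                      # 3 mappers x 2 reducers = 6
```

Composing `bipartite` between cloned stages reproduces the MapReduce shuffle pattern, which is how a MapReduce front end could be layered on Dryad.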

- Runtime system: Dryad adopts a master-slave design for the runtime system (as
in GFS and MapReduce), does pipelined execution (as in parallel databases), and
does run-time graph refinement (akin to query optimization in databases).

Flaw

As stated in the Problem section of this review, programming distributed and
parallel applications is difficult. From a programmer's point of view, debugging
and performance profiling on distributed systems are important, yet little of the
literature covers these topics, and this paper is no exception.

Applicability

We can use Dryad to build distributed GPU shader languages, Microsoft MapReduce,
and Parallel Microsoft SQL Server 2012 ...

Furthermore, Microsoft could market Dryad-as-a-service cloud computing, in
addition to its Azure offering.

Summary:
This paper gives an overview of the Dryad distributed computing framework, focusing principally on its ability to use any user-specified DAG to structure its computation, while also discussing its programming model and some of its run-time characteristics.

Problem:
Distributed and parallel applications can be difficult to write; in order to expedite this process, frameworks have been developed that handle much of the difficult background processes that a distributed system requires (load balancing, replication, scheduling, etc.). Dryad seeks to provide one of these frameworks, while allowing for a greater degree of flexibility than many other alternatives (e.g., MapReduce, Hadoop) provide.

Contributions:
Dryad’s primary contribution in this paper is its ability to work with any user-specified DAG, rather than a predefined, MapReduce-like architecture, allowing Dryad to handle a variety of problems within its single framework. To enable this, Dryad provides a robust graph specification language and is capable of making run-time optimizations to graphs to boost overall performance and conserve bandwidth. In addition, Dryad also allows the user to specify how nodes communicate, allowing for TCP pipes and shared-memory FIFO queues in addition to sharing data through the file system.
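The user-selectable channel types mentioned above are worth a small sketch: the key property is that vertex code reads and writes items the same way no matter which transport backs the channel. The class and method names below are invented for illustration and do not reflect Dryad's real interfaces:

```python
# Toy sketch: one vertex function, two interchangeable channel transports.
import os
import pickle
import queue
import tempfile

class FifoChannel:                # shared-memory FIFO (same process)
    def __init__(self):
        self.q = queue.Queue()
    def write(self, item):
        self.q.put(item)
    def close(self):
        self.q.put(StopIteration)             # end-of-stream sentinel
    def read(self):
        while (item := self.q.get()) is not StopIteration:
            yield item

class FileChannel:                # temporary file (cross-machine in Dryad)
    def __init__(self):
        fd, self.path = tempfile.mkstemp()
        os.close(fd)
        self.items = []
    def write(self, item):
        self.items.append(item)
    def close(self):
        with open(self.path, "wb") as f:
            pickle.dump(self.items, f)
    def read(self):
        with open(self.path, "rb") as f:
            yield from pickle.load(f)

def run_vertex(fn, in_ch, out_ch):
    """Vertex code is identical regardless of channel transport."""
    for item in in_ch.read():
        out_ch.write(fn(item))
    out_ch.close()

src = FifoChannel()
for x in (1, 2, 3):
    src.write(x)
src.close()
out = FileChannel()
run_vertex(lambda x: x * x, src, out)   # same code, different transports
print(list(out.read()))                 # [1, 4, 9]
```

Because the vertex only sees the `read`/`write` interface, the runtime is free to pick the cheapest transport available for each edge.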

Flaws:
While the paper goes into significant detail about how the user specifies the DAG that the program will use, it provides little information about how the job manager actually handles vertex scheduling; it is unclear whether the job manager takes data locality into account on its own, or if the only way to do so is to manually specify the machines on which vertices operate. On a broader note, as others have noted, the paper does not really give a strong idea of how useful the framework’s flexibility is. The additional overhead of having to define a DAG for many tasks adds more room for bugs and may dissuade users from taking advantage of it (much of this is mitigated, however, by the premade classes mentioned in the paper, though it appears that the user would still have to configure them).

In addition, while I appreciated the comparison to SQL Server’s performance, I would have liked to see a comparison of the data mining task to some other framework, like Hadoop. Currently, there is really no sense of how the optimizations described in the paper benefit this sort of task; though it may have been prohibitively difficult to set up a Hadoop framework for testing, some comparison to existing data, at least, would have helped provide a better sense of Dryad’s efficacy.

Applicability:
As the paper mentions, the entire system seems very similar to the way Condor operates, although it works in a cluster environment. Dryad would be useful in environments that frequently deal with distributed problems that do not fit easily into a standard MapReduce framework, either because they require a different architecture or they work better with fewer nodes, as well as executing batch jobs on multicore machines (MapReduce focuses on large clusters and would not be appropriate here). The system seems promising as a whole, though I would like to see further studies that compare it with the operation of more specialized frameworks.

Summary

This paper describes Dryad, a framework for creating distributed systems based on a stream-processing programming model with processing nodes as vertexes and communication channels as edges in a directed acyclic graph. Because edges can be temporary files, TCP channels, or in-memory FIFOs, the Dryad model is scalable from multi-core CPUs on a single machine to compute clusters consisting of thousands of servers.

Problem

There are several problems which are common to many different distributed systems, including scheduling computation, managing communication, dealing with failures, and safely handling synchronization issues. A large class of distributed systems achieve good performance scaling by exploiting data parallelism, and some existing frameworks (particularly MapReduce) are designed to ease the development of such data-parallel applications. The MapReduce programming model is restricted to 2-phase computation (the Map and Reduce steps, possibly with intermediate sorting and/or merging), however, and uses temporary files in a distributed filesystem as the only means of communication between processes. Opportunities exist for allowing a more general model of programming and communication, and for other methods of communication which better exploit temporal and spatial locality.

Contribution

Dryad allows the programmer to specify an arbitrary DAG with C++ programs as the processing vertexes and files, TCP channels, or in-memory FIFOs as the edges. The Dryad framework itself takes care of scheduling execution of vertex programs, optimizing the level of concurrency within a machine, delivering data to nodes that need it, and detecting and recovering from node or network failures. Dryad can also dynamically refine the DAG during execution to extract more parallelism and make the best use of limited network bandwidth.

The DAG stream-processing model is more general than MapReduce, and although it is somewhat more complicated to program for, it is also more flexible.

Dryad also achieves some performance gains by exploiting pipelined parallelism. Two jobs connected by an edge and running on the same machine can actually run in the same process and use a FIFO queue for communication, which is vastly faster than using a temporary file accessed over the network. Connected jobs running simultaneously on different machines can use a TCP stream to communicate.
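The in-process pipelining described above can be sketched with two threads sharing a bounded queue. This is a hypothetical illustration of the idea, not Dryad's actual mechanism:

```python
# Toy sketch: two connected "vertices" run as threads in one process,
# streaming items through a bounded FIFO instead of a temporary file.
import queue
import threading

DONE = object()                    # end-of-stream sentinel

def producer(out_q):
    for i in range(5):
        out_q.put(i * 10)          # emit items as they are produced
    out_q.put(DONE)

def consumer(in_q, results):
    while (item := in_q.get()) is not DONE:
        results.append(item + 1)   # process each item as it arrives

fifo = queue.Queue(maxsize=2)      # small buffer: stages overlap in time
results = []
t1 = threading.Thread(target=producer, args=(fifo,))
t2 = threading.Thread(target=consumer, args=(fifo, results))
t1.start(); t2.start(); t1.join(); t2.join()
print(results)                     # [1, 11, 21, 31, 41]
```

With the small bounded buffer, the downstream stage starts consuming before the upstream stage finishes producing, which is the essence of pipelined parallelism; a file-based channel would force the stages to run strictly one after the other.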

Flaws

Dryad requires the user to explicitly specify the type of each edge. It would be useful for Dryad to automatically detect opportunities for pipelined parallelism and schedule tasks on the same machine or at the same time opportunistically to take advantage of faster communication channels, falling back on temporary files if necessary.

There are even faster communication channels than a main memory FIFO available on certain types of hardware, such as on-CPU addressable SRAM in the Cell processor, or GPU-attached memory. If two successive vertexes are both GPU programs, they could take advantage of the locality of data in the GPU's memory without having to waste time moving the data in and out of main memory.

Finally, it is unclear how well Dryad deals with heterogeneity of computing resources. The authors note that vertexes and channels can specify "hard constraint" and "preference" listings for computers on which to run, but such mechanisms are useful only for static requirements (e.g., a vertex must be run on a machine with a GPU), not for dynamically balancing load (it is mentioned that the Dryad scheduler does not cooperate well with other applications sharing a cluster).

Applicability

Dryad provides a more general model of computation and some more optimized communications channels than conventional MapReduce, at the cost of some increased complexity for programmers. While the faster communications channels are a welcome improvement (somewhat hampered by the need to manually specify their use), it is not clear whether there are many use cases where the DAG model of stream processing fits, but neither MapReduce nor an existing well-optimized distributed system (like a parallel DBMS) does.

Summary
Paper presents a general purpose distributed execution engine, named Dryad, to exploit coarse-grain data parallelism in applications. Dryad handles job scheduling and provides fault tolerance in communication and computation in a distributed environment. Jobs in Dryad are represented as communication DAGs where vertices are programs and edges represent data flow. Programmers specify the data-flow graph to describe the application's communication pattern to achieve enhanced performance.

Problem
In distributed execution engines like MapReduce and Condor, the system dictates the communication graph that the execution follows, thereby restricting communication flow and hence performance as well. In Dryad, the programmer can use knowledge of system characteristics and program structure to tune the data-flow graph used for the execution, achieving higher performance than a generalized and restrictive execution engine would provide.

Contributions
The distributed execution platform provided by Dryad allows a programmer to achieve high performance by controlling the data-flow graph used for the execution, while still keeping the system fairly programmable. Dryad programmers use a Graph Description Language and various graph composition operators to specify the data-flow for the application. Dryad provides a very flexible channel abstraction for communication among the vertices in the graph. The communication protocol, used by a channel, can be specified in the graph itself. Unlike MapReduce, Dryad allows arbitrary communication DAG with multiple inputs and multiple outputs. Job manager in Dryad maintains the data-flow graph and schedules running vertices as computers become available. Dryad uses the DAG provided by programmer as skeleton and dynamically generates data-flow graph for the application based on size of input/intermediate/output data, run-time network topology and other constraints specified by programmer. This makes the system highly scalable since sections of data-flow graph get replicated automatically as the data being operated on scales up.
When multiple vertices run on same computer, Dryad opportunistically encapsulates them in a single process as threads and these threads then communicate through shared memory FIFO. The generic channel abstraction exposed by Dryad allows it to employ this optimization as necessary.
Fault tolerance in Dryad is achieved by re-execution of deterministic computation performed by a dead vertex.
Dryad maintains a stage topology for a job to track its progress and to employ further run-time optimizations. Each vertex belongs to a stage, and each stage has a stage manager, which receives callbacks from the daemon running on each server. The stage manager can start duplicate vertices when it spots stragglers, and it also helps to achieve run-time graph refinements.
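The straggler mitigation described above, launching a duplicate of a slow vertex and taking whichever copy finishes first, can be sketched with a thread pool. The function names and timeout value are invented for illustration; this is not Dryad's scheduler:

```python
# Toy sketch of stage-manager straggler mitigation: if a vertex runs
# past a deadline, launch a duplicate and accept the first finisher.
import concurrent.futures
import time

def run_with_backup(task, straggler_timeout):
    with concurrent.futures.ThreadPoolExecutor(max_workers=2) as pool:
        first = pool.submit(task)
        try:
            # Fast path: the original copy finishes within the deadline.
            return first.result(timeout=straggler_timeout)
        except concurrent.futures.TimeoutError:
            backup = pool.submit(task)        # duplicate the slow vertex
            done, _ = concurrent.futures.wait(
                [first, backup],
                return_when=concurrent.futures.FIRST_COMPLETED)
            return done.pop().result()        # first finisher wins

def slow_vertex():
    time.sleep(0.2)        # stands in for a straggling computation
    return "result"

print(run_with_backup(slow_vertex, straggler_timeout=0.05))  # result
```

This only works because vertices are deterministic and side-effect free from the runtime's point of view: running two copies and discarding one changes nothing about the job's output.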

Flaws
Instead of cooperating with jobs already running on machines, Dryad's greedy job manager assumes that it is the only job running in the entire cluster. Dryad requires dedicated resource allocation to achieve its high performance guarantees, and it cannot be used to harvest idle CPU cycles in a grid of computers.

Applications
Dryad allows fine control over the data-flow graph of the execution while providing an easy-to-program distributed execution platform. Any data-parallel application that wants to achieve high performance can use Dryad instead of other distributed execution engines, whose generic interfaces severely restrict system-specific optimizations.

Summary: With Dryad, Microsoft presents a dataflow graph-based distributed batch computation specification language. They aim to abstract away the ugly details of both concurrency in general and many of the difficulties we have observed in classical distributed systems.

Problem: The primary problem being addressed in this paper is the high level of difficulty that typically comes with any implementation of multi-threaded applications and asynchronous coordination of nodes in a system. Wherever there is a well-known pattern for solving a difficult problem, there is the potential for abstraction. The Dryad team attempts to provide that ease of use while balancing the conflicting goal of maintaining maximum flexibility.

Contributions: The most appealing aspect of the Dryad system is the very intuitive model of computation that such a framework intrinsically forces a developer to consider. While the Dryad designers place an emphasis on not unnecessarily locking in a particular programming paradigm (as compared with MapReduce), the use of a graph-based methodology naturally invokes a different mental image of the program or system being created.

Flaws: This may sound like ... whatever the inverse of a backhanded compliment is ... but it seems like they may have put too much effort into making Dryad simple and easy to use. I should clarify. The simplicity and low barrier to entry that they describe implies that there's a huge audience just waiting in the wings with application ideas, but the reality may be more that anyone who's in a position to use a system like Dryad probably already carries a good deal of knowledge about the problem and how it needs to be addressed. Of course, my argument neglects the audience that may exist further into the future, the internal benefit to Microsoft, and the fact that it's undeniably a good thing that I'm dissing.

Applicability: To pick up on the futurist angle, one can imagine a Star Trek-style omnipresent computer with which everyone casually interacts on a regular basis; whenever a person needs something done, they say "computer, do A and B and C for me," (with A, B and C being complex and inter-operating tasks) and the computer reports back the results to the person wherever he or she might be when the task is complete. Dryad may or may not be the precursor to such a system, but it certainly provokes the imagination into such excesses more than most of the systems we've discussed previously.

Summary
This paper presents a new system, Dryad, which attempts to improve parallel programming in distributed systems by providing a new framework with which programmers can structure programs. It evaluates the performance of the system, and the ease with which developers new to the system can use it.

Problem
The main problem addressed by the paper is the problem of running distributed parallel applications that can fully take advantage of a distributed system. Various other systems, such as Condor and Google's MapReduce, have attempted to make this easier through abstraction and providing a new way of structuring problems. Dryad attempts to do the same.

Contributions
The paper presents several contributions. The main one is a distributed execution engine which can take full advantage of a distributed system and hides many of the difficulties of programming parallelism from the developer. It proposes a new way of thinking about parallel programs, which involves directed acyclic graphs, where vertices consist of code to be run and edges represent data flow between executions. Since it can be difficult for programmers to move to a new programming paradigm, the authors also developed a new “scripting” language that provides another layer of abstraction between the system and the programmer.

Flaws
I thought the authors were a bit misleading about how much a developer has to understand about the underlying workings of Dryad. They originally claimed that an outside programmer could write programs to run on the system with very little knowledge of Dryad, but I got the impression that a new user would need to make significant changes in order to program in this paradigm. It also seemed like Dryad didn't do a great job of handling partial failure. It has several nodes that, if they fail, will halt the entire system (such as the job manager and the name server). It also has the property that if a node doing a job dies, all nodes related to that node will also crash.

Application
I think Dryad has limited applicability to problems today. First of all, there is the programmer usability problem mentioned above. Second, there are several aspects of the system that limit its applicability. For example, a developer needs to know the hardware of the nodes in the system in order to correctly write an application. While this might be feasible in a single company's network, it would be difficult to use this in a heterogeneous environment.

Summary
Dryad is a scalable framework for running data-parallel applications. Dryad handles communication, scheduling, and failures.

Problem
Writing distributed programs is difficult. Some tools exist which handle some of the harder problems of failure handling and work distribution, but they are either not general enough or not optimized for high-speed LANs.

Contributions
Dryad jobs are specified using a directed acyclic graph of data dependencies. Other systems, such as Condor, have similar capabilities, but Dryad is optimized for better performance on a high-speed LAN. In addition to shared files, Dryad provides TCP pipes and shared memory for communication channels. These channels provide better performance if nodes are on the same computer or on the same high-speed network. Dryad’s graph description format is powerful enough to support many different workflows, and it is conducive to performance tweaking. The graph description can also be changed at runtime in order to make performance optimizations on the fly as more information is known about input size and cluster size. Dryad is also designed to scale from a single multicore computer up to a cluster of thousands of nodes.

Flaws
I felt that there were some flaws in the presentation of the abstraction that Dryad provides. There doesn’t seem to be a clear separation between the low-level hardware details and the high-level flow of the job. The authors adjusted the number of nodes and the amount of aggregation in their example Dryad jobs based on the size of the cluster and the number of cores each computer had. The failure of a node may or may not be dependent on the failure of other nodes, depending on which type of communication channel is used. Nodes may or may not be deterministic, depending on the layout of the graph.

It would have been nice to see some more examples of applications that are difficult to express using MapReduce.

Discussion
Dryad is used successfully by Microsoft for data-parallel applications. Despite some of the tuning needed for the Dryad graphs, Dryad does make it much easier to create distributed applications. It also appears that it would be very easy to build a MapReduce API similar to Google’s on top of Dryad. The paper also mentions the Nebula scripting language, which is built on top of Dryad and hides many of the implementation details. This is probably a much more programmer-friendly interface. It seems likely that most of the time a programmer would use one of the higher-level APIs built on top of Dryad, such as MapReduce or Nebula, and only use Dryad directly if an application needs to be optimized.

Summary:

This paper is a description of Dryad, a distributed computation system for applications that have coarse-grained data parallelism.

Problem:

The paper aims to provide a programming model for writing efficient parallel applications. The authors aim to make this model simple to understand, and focus on the scalability and reliability of applications built using it. Their aim is to let the developer think of the inherent data parallelism present in the application and to design the application accordingly.

Contributions:

One of the key aspects of the model is that it provides a certain level of abstraction for writing scalable applications. This matters because the amount of system resources available is not known when the code is written. The developer need not worry about concurrency issues like threads and fine-grained concurrency control; the system handles these issues and takes care of job scheduling and resource allocation as well.

The other interesting aspect of this model is that the developer has a good amount of freedom in designing the graph and can specify any directed acyclic graph with varied communication patterns between its vertices. The model also allows any number of inputs and outputs for each vertex, which is a distinctive feature of this model.

The paper also talks about channel abstraction, whereby communication between vertices is represented by channels which could be TCP pipes, shared memory or temporary files. This abstraction is very convenient for the programmer as each program in a vertex reads and writes its data in the same way regardless of the method by which a channel serializes its data.

Finally the system has a runtime graph refinement mechanism which lets the system optimize the graph by modifying the inputs and outputs of each node. This refinement minimizes data transfer thereby saving network bandwidth.
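One concrete refinement the paper describes is replacing a single aggregation vertex with a tree of partial aggregators, so no single vertex or network link sees all the data. The sketch below illustrates the idea for a sum; the function name and fan-in are invented for illustration:

```python
# Toy sketch of run-time graph refinement: regroup many inputs into a
# tree of partial aggregators, each with at most fan_in inputs.
def refine_aggregation(inputs, fan_in):
    """Return (aggregate, number of tree levels inserted)."""
    layer, levels = list(inputs), 0
    while len(layer) > fan_in:
        # One refinement pass: each new vertex sums fan_in inputs.
        layer = [sum(layer[i:i + fan_in])
                 for i in range(0, len(layer), fan_in)]
        levels += 1
    return sum(layer), levels + 1   # final aggregator is the last level

total, depth = refine_aggregation(range(100), fan_in=10)
print(total, depth)    # 4950 2
```

The refined graph computes exactly the same result as a single 100-input aggregator, which is why the runtime can apply this rewrite automatically without the programmer's involvement.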

Applications:

1. This programming model can be used for any data intensive parallel task with dependencies between various stages. The user can specify the DAG in a configuration file and then submit the job to the job manager.

2. This model could be incorporated in a parallel database to evaluate the various stages of a SQL Query with a large number of join operations.

3. This programming model could be evolved into a language for designing distributed applications of this nature.

Flaws:

In their failure policy they mention that if a vertex fails, it closes its pipes, which causes errors to propagate to the vertices that serve as its inputs as well as its outputs. This causes the entire connected component of the graph to re-execute. However, if the user had a relaxed failure requirement, perhaps the system could have made progress even though the vertex failed. Perhaps the programming model should provide the user the option of specifying a custom failure policy.
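The cost difference between the two policies can be made concrete with a small sketch that counts vertex executions after one transient failure. This is an invented model for illustration, not how Dryad or any proposed extension actually behaves:

```python
# Toy model: count vertex executions in a linear pipeline when one
# vertex fails once. Strict policy restarts the whole connected
# component; a relaxed policy retries only the failed vertex.
def run_component(vertices, fail_first_time, relaxed):
    executions = 0
    failed_once = set(fail_first_time)
    i = 0
    while i < len(vertices):
        v = vertices[i]
        executions += 1
        if v in failed_once:
            failed_once.discard(v)     # failure is transient
            if not relaxed:
                i = 0                  # pipes closed: restart component
            continue                   # relaxed: retry just this vertex
        i += 1
    return executions

# Pipeline a -> b -> c, where "b" fails once:
print(run_component("abc", {"b"}, relaxed=False))  # 5 executions
print(run_component("abc", {"b"}, relaxed=True))   # 4 executions
```

Even in this three-vertex toy the strict policy redoes work that a relaxed policy would keep; in a large connected component the gap would grow accordingly, which is the reviewer's point about custom failure policies.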

This paper describes ‘Dryad’ - a distributed execution engine for coarse-grained parallel applications, with a focus on providing a simple programming model, reliability, efficiency, and scalability to applications.

Previous research in this area tried to solve the problem either in software or in hardware, often via stream programming, a programming model with limited parallelism. Approaches on the software side include: 1) Click, which is similar to Dryad but executes everything in one process, so it gains no benefit from multi-processor designs; 2) dataflow systems, which are similar to Dryad but are not designed to run on large clusters, lack programming models for large graphs, and cannot tolerate machine failures; 3) parallel databases, which already employ schemes such as data partitioning, pipelined and partitioned parallelism, and hash-based distribution, but compared to Dryad offer less encapsulation and cannot express irregular computations; 4) continuous query systems, which do not provide the high throughput for distributed programs that Dryad does. Dryad borrows most of its concepts from these existing approaches, but is designed to provide a generic programming model so that developers can write efficient parallel and distributed applications, scale up to large clusters, and avoid the limitations present in most of the approaches above.

Contributions of the paper

One good point is that, unlike other programming models such as MapReduce, which force the programmer to write programs according to specified rules, Dryad gives the programmer the freedom to explicitly specify the flow of the application and its required inputs and outputs.

It provides generic low-level programming, since no single programming model can suit the needs of all distributed programs. This is also the feature that makes it possible to build high-level language interfaces on top of it.

Future research on Dryad is trying to provide dynamic configuration changes to the graph depending on the application's control flow and the data on which the application is acting.

The job manager intelligently tries to schedule jobs on nodes that are near the data, reducing network bandwidth usage, which is a good optimization.

Unification of graphs is a good feature of Dryad. This way it can compose different distributed computations on the cluster and provide optimizations such as merging different vertices together to reduce data transfer between them.

As the paper suggests, machine learning could be applied to learn data access patterns and resource usage patterns so that the job manager can schedule jobs such that resources are efficiently utilised.


Negatives

In this programming model the programmer has to explicitly specify the data dependency graph. Although the authors note this limitation in the paper, it would be good if the graph could be generated automatically.

At present Dryad can execute programs in a single data center, but I would like to see this work extended onto WAN infrastructure.

At present the granularity of a Dryad application is very coarse (program level). It would be an advantage if the granularity were made finer, for example block level.


Applications:

The same scenario happened in the database world, where relational databases used to be the norm; because of huge amounts of data and differing application demands, people are now concentrating on NoSQL systems, which give programmers the flexibility to represent their data in the form they want. Dryad is good for any distributed and parallel application that has a good amount of parallelism built into it.
