- Notifications
You must be signed in to change notification settings - Fork2
[GSoC] Distributed Data Structures - Collections Framework for Chapel language
License
LouisJenkinsCS/Distributed-Data-Structures
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
This repository hosts the first framework for distributed data structures for theChapel programming language. Here we introduce a 'Collections' module, based on Java'sCollections framework, as well as two scalable data structures (one ordered, one unordered).Documentation can be seenhere.
Changelog for Chapel version 1.16 can be foundhere, slide 13 - 17 contains the highlights of the project.
This project was made possible through the Google Summer of Code program, who provided me an amplestipend to live on, gave me the once-in-a-lifetime chance to design and develop new solutions in thearea of distributed computing (PGAS in particular), provided the more-than-necessary resources (Cray-XC40cluster), and a way to learn more exciting and useful knowledge. As well, I would like to thank both of mymentors,@e-kayrakli and@mppf, who Ihave had the honor to server under. Finally, I would like to thank the Chapel project itself.
For documentation purposes, I will list the most important information that should be taken into account for GSoCfinal evaluations.
While I have had many issues with Chapel's compiler so far, I will only list the ones that are relatively significant.
- LICM (Loop-Invariant-Code-Motion), a compiler optimization, causes tuples to cause undefined behavior due to improperhoisting. This issue has caused me a large amount of grief and countless hours (estimated to cause me to lose a weekof development time), and it deserves to be mentioned first. All-in-all I am glad it was caught while developing andnot, say, for a unsuspecting user.Link.
- Code generation phase inaccurately performed comparison directly on references. In the compiler, references are actuallypointers, and so during comparison it would compare it by pointer to something else. Now if you were to perform thiscomparison, it could lead to code inaccurately taking branches of code it was not meant to. This only caused me to losea day.Link.
- Returning a tuple from an overloaded method, which would result in dynamically dispatching the method at runtime, wouldcause the compiler to crash if not captured at the callsite. This is no normal internal error, where you get a decenterror message and a line number, this resulted in an actual segmentation fault during compilation. Currently, this bugis still not patched and a careless user can trigger it by not capturing the return value of
remove
. This caused meto lose a weekend of development time.Link.
A spin-off project that was created from the desire to solve a core problem to creating scalable data structures:performing atomic operations on class instances. It has played an important role in experimentation, and it wouldbe invaluable as a tool to create more scalable distributed data structures, and is useful even for any arbitraryapplication. This sub-project could not be completed as it would require some Chapel runtime and compiler changesto make it work more efficiently, but even now it proves to be a scalable solution, and as such deserves to bementioned here. There will be more improvements made on this, as it will become my next (unfortunately non-funded)project, as it will be another first and novel solution.
Discussion - Is currently on hold as the GSoC project is unfinishedand is over larger priority.
Pull Request - Currently is closed as this requires a lot more work.
Repository - It has been moved to its own repository and strippedfrom this project.
The 'Collections' module is an attempt to bring Java-like data structures to distributed computing. Prior to this module, theonly available data structure was theList
, a singly-linked list that offered no parallel-safety nor took advantage of distributingmemory across nodes. The pattern before Collection was to create your own distributed data structures, specific to your own needs.While this makes sense for raw performance, the core focus of Chapel is to bring abstraction and make it easier to make distributeddata structures; tie on the fact that, while I'm not trying to exaggerate my achievements but, making distributed data structures ishard. With abstraction comes a lack of control, as many fine-grained communications are made implicitly betwen nodes without realizing,easily degrading performance when not needed. Developing these data structures is also time consuming, and developing them each time willdetract from the time needed to create appplication software.
Not only have I created two excellent data structures, they both make use of low-level techniques that the user should avoid, such asprivatization (allocating a clone on each node that is used to prevent excess implicit communications caused by accessing class fields).Furthermore, not only do my data structures perform well, they are also very useful, as they offer abstractions people may commonlyrequire, such as iteration, reductions, utility methods, etc. Lastly, the Collections module offers something that was also lacking: aninterface. Currently Chapel does not support interfaces (yet...), and this will be the first. We provide a contract between the user and developer that a certainobject will support insertion, removal, and iteration, with lookup, bulk insertion, and bulk removal being added for 'free'. This allowsthe user to reliably use any Collection, and it allows developers who do not to implement specialized versions of the 'for free' methods.
Discussion - The original discussions for the Collections module.
Pull Request - The initial PR that was merged upstream. Needed to perform nightlytesting.
Pull Request (Improvements) - The next PR that contain multiple improvements to the data structures.
All benchmarks performed on a Cray-XC40 cluster, at 64-nodes, with Intel Broadwell processors.
Provides a strict ordering without sacrificing too much performance. Supports insertionand removal from both ends, allowing FIFO, LIFO, and a Total ordering, which ispreserved across all nodes in a cluster, and employs a wait-free round-robin approachto load distribution that ensures fairness in memory, bandwidth, and computation.
Disclaimer: The deque provided, while scalable, rely heavily on network atomics.The benchmark results are produced using said network atomic operations.
With performance that scales both in the number of nodes in a cluster and thenumber of cores per node, we offer a multiset implementation, called a 'Bag',which is a medium that allows storing and retrieving data in any arbitrary order.This type of data structure is ideal for work queues as it employs it's own loadbalancing, and offers unparalleled performance.
We compare our data structures to a naive synchronized list implementationas that is all that is available in the Chapel standard library.In all cases, the data structures scale and outperform the naive implementation.
Implementation | Performance over Naive (at 64 nodes) |
---|---|
SynchronizedList | 1x |
DistributedDeque | 63x |
DistributedBag | 403x |
Implementation | Performance over Naive (at 64 nodes) |
---|---|
SynchronizedList | 1x |
DistributedDeque | 123x |
DistributedBag | 651x |
About
[GSoC] Distributed Data Structures - Collections Framework for Chapel language
Topics
Resources
License
Uh oh!
There was an error while loading.Please reload this page.