Apache Hama announces v0.7 Release!
The Apache Hama team is pleased to announce the release of Hama v0.7 with new features and improvements.
Hama is a high-performance BSP (Bulk Synchronous Parallel) computing engine that can be used for compute-intensive general scientific BSP applications, Google’s Pregel-like graph applications, and machine learning algorithms.
What are the major changes from the last release?
The most important new feature of this release is support for Mesos and YARN (Yet Another Resource Negotiator), so you can submit your BSP applications to existing open source and enterprise clusters, e.g., CDH, HDP, and Mesosphere, without any installation. In addition, we reinforced the machine learning package by adding algorithms such as Max-Flow, K-Core, and ANN.
There are also big improvements in the queue and messaging systems. We now use our own outgoing/incoming message manager instead of Java's built-in queues. It stores messages in serialized form in a set of bundles (or a single bundle) to reduce memory usage and RPC overhead. Unsafe serialization is used to serialize Vertex objects and their messages more quickly.
Another important improvement is the enhanced graph package. Instead of sending each message individually, we package the messages per vertex and send each packaged message to its assigned destination node. With this we achieved a significant improvement in the performance of graph applications.
The attached benchmarks test the scalability and performance of the PageRank algorithm on a randomly generated graph with 1 billion edges, using Apache Hama and Giraph on a 30-node Amazon EMR cluster. Note that for Apache Hama, aggregators were used to detect the convergence condition.
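The bundling idea described above can be illustrated with a minimal sketch. This is not Hama's actual message manager: the class MessageBundler, the Message type, the modulo-based partitioning, and the byte layout are all hypothetical and chosen only to show how grouping messages by destination and serializing each group once replaces many small sends with one payload per destination.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: hypothetical names, not part of the Hama API. */
final class MessageBundler {

    /** A hypothetical message: a target vertex id plus a double payload. */
    public static final class Message {
        public final long targetVertex;
        public final double value;

        public Message(long targetVertex, double value) {
            this.targetVertex = targetVertex;
            this.value = value;
        }
    }

    /** Groups messages by destination peer (assumes simple modulo partitioning). */
    public static Map<Integer, List<Message>> groupByPeer(List<Message> messages, int numPeers) {
        Map<Integer, List<Message>> perPeer = new LinkedHashMap<>();
        for (Message m : messages) {
            int peer = (int) Math.floorMod(m.targetVertex, (long) numPeers);
            perPeer.computeIfAbsent(peer, k -> new ArrayList<>()).add(m);
        }
        return perPeer;
    }

    /** Serializes one bundle: a 4-byte count, then (vertex id, value) pairs of 16 bytes each. */
    public static byte[] serializeBundle(List<Message> bundle) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(bundle.size());
            for (Message m : bundle) {
                out.writeLong(m.targetVertex);
                out.writeDouble(m.value);
            }
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```

With five messages addressed to vertices 0 through 4 and two peers, groupByPeer yields two bundles, so the sender would make two transfers instead of five. Keeping each bundle as a byte array also means the queue holds one compact buffer per destination rather than many message objects.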
What’s Next?
After a month of testing and benchmarking, this version brings substantial performance improvements together with important bug fixes that significantly improve platform stability. We look forward to adding more features and seeing our community grow. The primary objectives of the technical plan are:
- Add a streaming input format for listening to messages from third-party applications, and add incremental learning algorithms.
- Improve system reliability, e.g., fault tolerance and high availability (HA).
- Add more machine learning algorithms, such as ensemble classifiers, SVM, and DNN.
Where can I download it?
The release artifacts are published and ready for you to download from either the Apache mirrors or the Maven repository. We welcome your help, feedback, and suggestions. For more information on how to report problems and get involved, visit the Hama project website[1] and wiki[2].
[1]. Apache Hama Website: https://hama.apache.org/
[2]. Apache Hama Wiki: https://wiki.apache.org/hama/