I am very excited to announce that Apache Incubator Samza 0.7.0 has been released. Samza is a distributed stream processing framework. It uses Apache Kafka for messaging, and Apache Hadoop YARN to provide fault tolerance, processor isolation, security, and resource management. The project entered Apache Incubator in 2013 and was originally created at LinkedIn, where it's in production use. The project is currently under active development by a diverse group of committers.
In all, 156 JIRAs were resolved in this release. Notable work done includes:
- Initial import of code into Apache. (SAMZA-1)
- Upgraded to YARN 2.2 from YARN 2.0.5-alpha. (SAMZA-9)
- Numerous state management bug fixes.
- Java 7 support. (SAMZA-16)
- A ton of work on documentation, tutorials, hello-samza, and Javadocs.
- Scala 2.10 support, and removal of support for Scala 2.8. (SAMZA-128, SAMZA-160)
- One-off resets for input stream offsets. (SAMZA-180)
- Upgrade to support Apache Kafka 0.8.1, which includes log compaction. (SAMZA-180)
- A consensus-based shutdown API. (SAMZA-253)
- A pluggable message sorting API. (SAMZA-2)
We've also made a lot of community progress during this release:
- Added 4 new committers (Garry Turkington, Martin Kleppmann, Zhijie Shen, and Yan Fang).
- Accepted patches from 14 distinct contributors.
- Presented on Samza's architecture and usage.
- Saw over 1,000 emails sent to the developer mailing list.
Even after all this work, there's still a lot to be done. In our next release (0.8.0), we're planning to focus on performance. This work includes:
- Switching Samza's state feature to use RocksDB instead of LevelDB. (SAMZA-236)
- Supporting pluggable partition-container assignment strategies. (SAMZA-71)
- Improving consumer performance. (SAMZA-245)
- Upgrading Samza's YARN UI. (SAMZA-32, SAMZA-237, SAMZA-290)
I'd like to close by thanking everyone who's been involved in the project. It's been a great experience to be part of this community, and I look forward to its continued growth.