Big Data machine-learning library used for scalable in-database analytics

Forest Hill, MD –22 August 2017– The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today that Apache® MADlib™ has graduated from the Apache Incubator to become a Top-Level Project (TLP), signifying that the project's community and products have been well-governed under the ASF's meritocratic process and principles.

Apache MADlib is a comprehensive library for scalable in-database analytics. It provides parallel implementations of machine learning, graph, mathematical and statistical methods for structured and unstructured data.

"Graduating as a Top-Level Project is a very important milestone for Apache MADlib," said Aaron Feng, Vice President of Apache MADlib. "During the incubation process, the MADlib community worked very hard to develop high quality software for in-database analytics, in an open and inclusive manner in accordance with the Apache Way."

MADlib grew out of discussions between database engine developers, data scientists, IT architects and academics interested in new approaches to scalable, sophisticated in-database analytics. These discussions were written up in a paper from VLDB 2009 [1] that coined the term "MAD Skills" for data analysis. The MADlib software project began the following year as a collaboration between researchers at UC Berkeley and engineers and computer scientists at Pivotal (formerly EMC/Greenplum). In September 2015, MADlib joined the ASF community as an incubating project.

MADlib is deployed on a wide variety of industry and academic projects across many different verticals, including automotive, consumer, finance, government, healthcare, and telecommunications.

"MADlib was conceived from the outset as an open-source meeting ground for software developers, computing researchers and data scientists to collaborate on scalable, in-database machine learning and statistics," said Joe Hellerstein, Professor of Computer Science at UC Berkeley, Co-Founder and Chief Strategy Officer at Trifacta, and one of the original authors of MADlib. "It has been great to witness the growth of the MADlib community and codebase as an ASF incubating project, and I look forward to this continuing as a Top-Level Project."

"At Pivotal, we have seen our customers successfully deploy MADlib on large scale data science projects across a wide variety of industry verticals," said Elisabeth Hendrickson, Vice President, R&D for Data at Pivotal. "As MADlib graduates to a Top-Level Project at the ASF, we anticipate increased adoption in the enterprise given the mature level of the codebase and the active developer community."

"The potential of the Apache MADlib project is unbounded," said Jim Jagielski, Vice Chairman of the ASF. "The ability to perform in-depth and detailed analytics, on both structured and unstructured data, using SQL enables MADlib to be applicable in scenarios where others simply can't compete. As not only interest in, but real-world usage of, machine learning becomes common place, MADlib joins the growing roster of Apache projects that define innovation."

"Apache MADlib is a great example of the diversity at Apache," said Ted Dunning, Apache MADlib Incubator Mentor and Member of the ASF Board of Directors. "MADlib does state-of-the-art machine learning, but does as an inherent part of a database. This is a radical approach that can provide important design flexibility. I am excited to see MADlib become a fully fledged project at Apache."

"New participants are more than welcome to join the project," added Feng. "We enthusiastically look forward to working together with all contributors to Apache MADlib in order to advance the state-of-the-art of scale-out data science tools."


Availability and Oversight
Apache MADlib software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache MADlib, visit and

About the Apache Incubator
The Apache Incubator is the entry path for projects and codebases wishing to become part of the efforts at The Apache Software Foundation. All code donations from external organizations and existing external projects wishing to join the ASF enter through the Incubator to: 1) ensure all donations are in accordance with the ASF legal standards; and 2) develop new communities that adhere to our guiding principles. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. For more information, visit

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server --the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 650 individual Members and 6,200 Committers across six continents successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Alibaba Cloud Computing, ARM, Bloomberg, Budget Direct, Capital One, Cash Store, Cerner, Cloudera, Comcast, Facebook, Google, Hortonworks, HP, Huawei, IBM, Inspur, iSigma, LeaseWeb, Microsoft, ODPi, PhoenixNAP, Pivotal, Private Internet Access, Red Hat, Serenata Flowers, Target, WANdisco, and Yahoo. For more information, visit and

© The Apache Software Foundation. "Apache", "MADlib", "Apache MADlib", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #