Announcing the Apache Drill Beta Release: Self-Service Data Exploration in Action

It is our pleasure to announce the 0.5.0 release of Apache
Drill. This is Drill’s first beta
release and the second in our iterative monthly release cycle. It addresses more than 100 issues
since last month’s release, bringing the total to more than 1,000 since Drill’s
inception. This is a great release to start exploring your data, wherever and
whatever it is.

For more background on what Drill is about, check out the Drill overview or Drill
in 10 minutes. The 0.5.0 release
builds upon the huge 0.4.0 release, so refer to last
month’s release for information on all the functionality available. Notable features in 0.5.0 include
the following:

- Drill now uses the Hadoop 2.4.1 APIs. This includes upgrading Parquet to use direct
  memory and the ability to write larger Parquet files when using CREATE TABLE AS.
- Improved JOIN planning when using HBase tables, based on row
  count approximations from region-level statistics.
- Improved handling of large sorts and out-of-memory conditions.
- JSON projection pushdown, an all-text JSON mode, and boolean
  short circuit. Each of these features
  allows more flexibility when interacting with complicated JSON files.
- Substantial improvements in SELECT * handling when interacting
  with schemaless data sources.
- Creation of a self-contained JDBC JAR file to ease access to
  Drill from JDBC tools.
- Fully distributed execution of all basic aggregates, including
  standard deviation and average.

Drill will continue its march towards GA, with upcoming
monthly releases hardening and expanding Drill’s capabilities and
performance. Check out the release
notes, download
the release, or better yet, make
your own fork and contribute back to the community. Together, we can make data available to everyone, anywhere.

-The Apache Drill Team