Hadoop Community
Meetup @ Beijing

Authors: Junping Du (Tencent) & Wangda Tan (Cloudera)

Picture01.jpg

Picture02.jpg

Overview

On Aug 11th, 2019, Hadoop developers and users gathered at Tencent’s
Sigma Center office in Beijing to share their latest work, with 12
presentations by engineers from Tencent, Cloudera, Alibaba, Didi, Xiaomi,
Meituan, ByteDance (parent company of TikTok, Toutiao, etc.), JD.com, and
Huawei. This was also the first Hadoop community meetup hosted by Apache
Hadoop PMC members.

Attendees

Interest in the meetup was tremendous. There were 200 spots available for
in-person registration, and they were fully booked within 10 minutes. We
had 150+ attendees in person, and 3000+ attendees joined the live online
sessions.

In-person participants came from dozens of different companies and
universities, and many of them flew in from Shanghai, Hangzhou, Shenzhen,
and even the San Francisco Bay Area!

Sessions

1. Hadoop Community Update And Roadmaps

Junping Du @ Tencent and Wangda Tan @ Cloudera talked about
Hadoop community updates and roadmaps.

Picture03.jpg

Junping Du @ Tencent

 

Picture04.jpg

Wangda Tan @ Cloudera

Junping introduced recent trends in the storage field, such as better
scalability and moving to the cloud. He talked about features like RBF
(Router Based Federation), improvements to NameNode scalability,
improvements to cloud connectors, and Ozone.

Wangda talked about recent trends in the compute field, such as better
scalability, moving to cloud-native environments, containerization work,
and support for machine learning use cases. He talked about the global
scheduling framework, which improves scheduling throughput and placement
quality; recent containerization work in YARN, such as runc and the
interactive Docker shell; and YARN-on-cloud initiatives from the community,
such as autoscaling and graceful decommissioning. Wangda also talked about
Submarine and its release plans.

Finally, Wangda looked back at the releases in 2018/2019 and shared the
tentative Hadoop release plan for 2019, including 3.1.3, 3.2.1, and what’s
coming in 3.3.0.

2. Ozone: Hadoop native object store

Sammi (Yi) Chen @ Tencent talked about the native object store project
from the Hadoop community.

Picture05.jpg

Ozone is a strongly consistent distributed object store service. It offers
the same level of reliability, consistency, and usability as HDFS. Because
it supports the S3 interface, it is useful not only for on-prem big data
workloads but also as a good option for moving big data to the cloud.

Sammi talked about the architecture of Ozone and what’s new in the
Ozone 0.5 release.

3. YARN 3.x in Alibaba

Tao Yang from Alibaba talked about Hadoop use cases at Alibaba and how new
features in YARN 3.x are being used to address them. He covered features
such as preemption, scheduling, resource over-commitment, scheduling
diagnostics, and mixed deployment of online/offline workloads. Tao also
talked about how new features in YARN help Apache Flink run better on YARN.

Picture06.jpg

Tao talked about many interesting features such as MultiNodeLookupPolicy,
which lets the scheduler choose nodes for a job using a pluggable node
sorter.

4. HDFS Best Practices learned from Didi’s production environment

Hui Fei from Didi talked about HDFS best practices learned
from Didi’s large scale (hundreds of PBs) production environment.

Hui first talked about storage use cases and scale in Didi’s environment.
Then he talked about the functionality and improvements Didi’s Hadoop team
has built on top of Hadoop HDFS 2.7.2, such as security, NameNode
Federation, and the Balancer.

Picture07.jpg

Hui also talked about the status of upgrading their production clusters
from Hadoop 2.7.2 to Hadoop 3.2.0. The primary driver of the upgrade is to
save storage space: Didi wants to use features like Erasure Coding in
Hadoop 3.x.
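As a rough illustration of why Erasure Coding saves space, the sketch below
compares the raw capacity needed under 3x replication with a Reed-Solomon
layout. The RS(6,3) layout (6 data units plus 3 parity units) and the 100 PB
figure are assumptions chosen for illustration; the talk summary does not
say which policy or data volume Didi uses, and the function names are
hypothetical.

```python
def raw_storage_replication(logical_pb: float, replicas: int = 3) -> float:
    """Raw capacity needed when every block is stored `replicas` times."""
    return logical_pb * replicas


def raw_storage_erasure_coding(
    logical_pb: float, data_units: int = 6, parity_units: int = 3
) -> float:
    """Raw capacity needed when each group of `data_units` data units
    is stored together with `parity_units` parity units."""
    return logical_pb * (data_units + parity_units) / data_units


# Assumed example: 100 PB of logical data (illustrative, not from the talk).
logical_pb = 100.0
replicated = raw_storage_replication(logical_pb)
erasure_coded = raw_storage_erasure_coding(logical_pb)
saving = 1 - erasure_coded / replicated

print(f"3x replication: {replicated:.0f} PB raw")
print(f"RS(6,3) coding: {erasure_coded:.0f} PB raw")
print(f"Raw capacity saved: {saving:.0%}")
```

Under these assumptions the erasure-coded layout needs half the raw capacity
of 3x replication, at the cost of extra CPU and network work when data is
encoded or reconstructed.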

Didi has upgraded a test cluster (100+ nodes) from 2.7.2 to 3.2.0. A backup
cluster with 2k+ nodes runs Hadoop 3.1.1 and will be rolling-upgraded to
3.2.0. The primary cluster, with 10K+ nodes and 5 namespaces, will start
upgrading to 3.2.0 in October.

5. Submarine: A one-stop, cross-platform machine learning platform

Xun Liu @ NetEase and Zhankun Tang @ Cloudera talked about the background,
current status, and future of the Submarine project.

Picture08.jpg

Zhankun Tang @ Cloudera


Picture09.jpg

Xun Liu @ NetEase

Machine learning involves many components, such as data preprocessing,
feature extraction, model training/serving/management, and distributed
workload management. The Submarine project, started by the Hadoop
community, aims to cover these needs with a focus on the notebook
experience. With Submarine, data scientists and machine learning engineers
don’t need to understand lower-level platforms such as YARN, K8s, or Docker
containers.

Zhankun showed a new feature called mini-submarine, which allows developers
to try Submarine locally without installing a YARN cluster.

Xun did demos for:

●     Integration of Submarine with the Zeppelin notebook.

●     A new Submarine web UI that lets data scientists run jobs, manage
models, etc. in a unified user experience.

Xun also talked about companies that are reported to be using Submarine in
production, such as NetEase, LinkedIn, Dahua, Ke.com, and JD.com.

6. Hadoop Improvements in Xiaomi

Chen Zhang and Kang Zhou from Xiaomi talked about how Hadoop is being used
at Xiaomi. They covered improvements to HDFS performance and scalability,
and the problems and solutions encountered when turning YARN into a
platform.

On the HDFS side, Chen talked about their improvements to HDFS federation,
such as lowering the business impact of moving from a single NameNode to
federated NameNodes. They have also improved NameNode performance, which
now allows a single NameNode to support 600 million objects (files +
blocks).

Picture10.jpg

On the YARN side, Kang talked about usability improvements, such as to the
RMStateStore and History Server. He also talked about multi-cluster
management tools, such as a unified client and RM UI for multiple clusters,
and about scheduling optimizations such as caching resource usage and
improvements to utilization and preemption.

7. Key Customizations of YARN @ ByteDance

Yakun Li from ByteDance talked about customizations of their YARN clusters
to handle an extra-large-scale, multi-cluster environment, including
utilization improvements, stabilization, optimizations for
streaming/model-training environments, and multi-datacenter issues.

Picture11.jpg

For scheduling, Yakun also talked about how they implemented Gang
Scheduling in YARN, which schedules per application instead of per node and
can achieve low latency and support hard/soft constraints. He also talked
about their implementation of a multi-threaded FairScheduler, which can
push the number of container allocations per second up to 3k.
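The core idea of gang scheduling, placing all containers of an application
together or none at all, can be sketched with a toy model. This is only an
illustration under simplifying assumptions, not ByteDance’s implementation:
the `gang_schedule` function, the node names, and the "free slots" resource
model are all hypothetical.

```python
from typing import Dict, List, Optional


def gang_schedule(
    free_slots: Dict[str, int], containers_needed: int
) -> Optional[List[str]]:
    """Place every container of one application, or none of them.

    `free_slots` maps a node name to its number of free container slots
    (a simplification: real schedulers track memory, vcores, and more).
    Returns one node name per container, or None if the gang does not fit.
    """
    if sum(free_slots.values()) < containers_needed:
        return None  # all-or-nothing: leave the cluster untouched

    placement: List[str] = []
    # Visit nodes with the most free slots first (an arbitrary choice here).
    for node in sorted(free_slots, key=lambda n: free_slots[n], reverse=True):
        while free_slots[node] > 0 and len(placement) < containers_needed:
            free_slots[node] -= 1
            placement.append(node)
    return placement


cluster = {"node-a": 2, "node-b": 1}
too_big = gang_schedule(cluster, 4)  # does not fit, nothing is allocated
fits = gang_schedule(cluster, 3)     # fits, all three containers are placed
print(too_big, fits, cluster)
```

The decision is made once per application over the whole cluster view,
rather than once per node as each node reports in, which is what lets the
scheduler refuse a partial placement.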

For mixed-workload (batch, streaming, ML) deployment, Yakun said they have
adopted Docker on YARN to isolate dependencies, added CPUSET/NUMA support,
and temporarily skip nodes whose physical utilization is too high. All
these efforts help mixed workloads run well in the same cluster.

8. YuniKorn: A New Unified Scheduler for Both YARN and K8s

Weiwei Yang and Wangda Tan from Cloudera talked about their work on a new
scheduler named YuniKorn (https://github.com/cloudera/yunikorn-core)
and how it can benefit both the YARN and K8s communities.

Picture12.jpg

Weiwei Yang (Right) and Wangda Tan (Left) from Cloudera

The scheduler of a container orchestration system, such as YARN or
Kubernetes, is a critical component that users rely on to plan resources
and manage applications. The two systems’ schedulers have different
characteristics because they support different workloads.

YARN schedulers are optimized for high-throughput, multi-tenant batch
workloads. They can scale up to 50k nodes per cluster and schedule 20k
containers per second. Kubernetes schedulers, on the other hand, are
optimized for long-running services, but many features such as hierarchical
queues, fair resource sharing, and preemption are either missing or not yet
mature.

Underneath, however, both do the same job: deciding how resources are
allocated. The speakers noted the need to run services on YARN as well as
jobs on Kubernetes, which motivated them to create a universal scheduler
that works for both YARN and Kubernetes and is configured in the same way.

In this talk, Weiwei and Wangda talked about their efforts to design and
implement the universal scheduler. They have already integrated it with
Kubernetes, and YARN integration is a work in progress. This scheduler
brings long-wanted features such as hierarchical queues, fairness between
users/jobs/queues, and preemption to Kubernetes, and it brings service
scheduling enhancements to YARN. Most importantly, it gives YARN and
Kubernetes the opportunity to share the same user experience for scheduling
big data workloads, and any improvement to this universal scheduler can
benefit both the Kubernetes and YARN communities.

9. HDFS cluster improvements and optimization practices in Meituan Dianping

Xiaoqiao He from Meituan Dianping talked about the current scale of their
Hadoop clusters, which have kept growing since 2015. So far, there are more
than 30k nodes in the Hadoop clusters.

Picture13.jpg

He shared many details and practices about the physical deployment
infrastructure, especially solutions for clusters that span multiple
regions. In the last part, Xiaoqiao showed some practices for optimizing
HDFS clusters, such as improving the NameNode restart process and
rebalancing NameNode workload.

10. Evolution of YARN in JD.com

Wanqiang Ji from JD.com talked about how YARN has evolved to support
JD.com’s business needs.

Picture14.jpg

Over the last 3 years, the maximum number of nodes in a single YARN cluster
has grown from 3k to 5k, 10k, and now 16k. Internally, there is work to
balance resources between YARN and K8s clusters. There are also
improvements to container eviction policies to make sure nodes won’t crash
or restart when a machine’s physical utilization grows above a certain
level.

11. Lessons learned from large scale YARN cluster operation @ Tencent

Jun Gong and Dongdong Chen from Tencent talked about their work to support
large-scale YARN clusters inside Tencent.

Picture15.jpg

Jun Gong @ Tencent

Picture16.jpg

Dongdong Chen @ Tencent

Jun and Dongdong shared that inside Tencent they make wide use of SLS (the
YARN Scheduler Load Simulator) to find scheduler bottlenecks, and many of
their scheduler improvements have been contributed back to the community.
After optimization, their production cluster has 2k+ queues, 8k+ nodes, and
5k+ concurrent jobs. It achieves 3k+ container allocations per second and
more than 100 million container allocations per day.

Jun and Dongdong also shared how they use YARN CGroups parameters to
fine-tune CPU/memory/network shares for launched YARN containers in a
multi-tenant cluster.

12. Run Spark and Hadoop on ARM

Rui Chen and Sheng Liu from Huawei shared their work on running Spark and
Hadoop on ARM.

Picture17.jpg

Rui Chen

 

Picture18.jpg

Sheng Liu

 

Rui and Sheng shared their motivation for running Hadoop and Spark on the
ARM platform: high performance and power efficiency. They then shared the
status of ARM support in Hadoop and Spark and the details of building a
release tarball on the ARM platform, including parameters and issues. In
the last part, they introduced how the Hadoop/Spark release process can
ensure proper testing on the ARM platform, and said they are building a
community called OpenLab to make that process smoother.

 

Acknowledgements

Thanks to everyone who contributed to this successful event in one way or
another, including the following speakers:

Sammi Chen, Jun Gong and Dongdong Chen from Tencent,

Weiwei Yang, Zhankun Tang from Cloudera,

Wanqiang Ji from JD.com,

Tao Yang from Alibaba,

Chen Zhang and Kang Zhou from Xiaomi,

Hui Fei from Didi,

Rui Chen and Sheng Liu from Huawei,

Xiaoqiao He from Meituan Dianping,

Yakun Li from ByteDance,

and Xun Liu from NetEase.

Special thanks to Chunyu Wang, Summer Xia, and Katty Ma for organizing the
meetup!