The third "Inside Infra" interview is with ASF Infrastructure Administrator Greg Stein, who shares his experience with Sally Khudairi, ASF VP Marketing & Publicity.




"We've got about 200 different machines and each one runs something different"



PART ONE.


What is your name --how is it pronounced?

Greg Stein. "Gregg St-eye-n"


When people need to find you, are you at gstein@? Has that always been your handle for everything?

Ever since high school, actually. I was gjs@ for a bit in college, but went back to gstein@. I started at Google early April 2004, and Gmail launched on April 1, so I was able to get my work email ID, gstein@gmail. So it’s great, but also rather annoying, because there are a lot of Gary Steins and Gertrude Steins and George Steins, and I get all of their email ... I get plane tickets, hotel reservations ... I got a proposal from the Gates Foundation once. I had some crazy bitter angry lady yelling at her husband as they were getting divorced, and she could rant. I mean, wow: that lady had a pirate's mouth.


But she didn't have his email address.

Apparently not.


When and how did you get involved with the ASF?

I left Microsoft in 1998, and the product group I was working in was building WebDAV into various Microsoft products. I thought the concept of WebDAV was very cool, and wanted the Open Source world to have it. That meant writing a module for the Apache Web Server. I think it was September 1998 when I started posting to the Apache mailing list and looking at how to plug in a WebDAV module. That was Apache 1.3 at the time. I developed a module called mod_dav for Apache 1.3, And when we started Apache 2.0 in 2000, I donated the module to Apache, and it became a standard module in Apache 2.0.


I remember that: I did the press release for that way back when. I knew you were connected with mod_dav, but didn't realize the path as to how you got there. It's very interesting.

That's what brought me to Apache, when they started putting together the foundation: it was in the Spring of '99. I remember asking Roy if I could be one of the first members of the foundation, and Roy's answer was basically like, "We already had the set of people locked in. You'll probably get nominated and voted in at our first member meeting," which occurred in September 1999. So yes, I was in that first batch of new members rather than the original membership.

You've been a member of the ASF much longer than you've been involved with ASF Infra. What were the previous hats you were wearing at the ASF? You've been here for a while, and have had a lot of different configurations.


This is true. So I'm a committer on HTTPd (Apache HTTP Server) and then a PMC Member, an ASF Member. I helped start the APR (Apache Portable Runtime) project with some of the other Web server committers, we pulled that out of HTTPd and created APR, and we used that for 2.0. We used APR, whereas Apache 1.3 was essentially the combination of the two, one big code base. Then Justin Erenkrantz and I started Apache Serf, and that was a high performance C-based client library for HTTP. But we didn't have three people in the community, so it couldn't really be an Apache project. So we took it out of Apache and started working on it on our own, and then eventually Subversion started to use Serf, and so we got more committers on Serf, and the community kind of built up around it because of Subversion. So we ran Serf externally, but just like it was an Apache community, it was Apache licensed and so on. Eventually we wanted to move it back into Apache, and I don't recall off hand, but we went straight to a TLP from our external project back to Apache Serf.

Early 2000, it was January or February, (ASF co-Founder) Brian Behlendorf approached me about helping with the network protocol for this new version control system they were starting at CollabNet, because he knew my background in HTTP and WebDAV. That “V” stands for versioning. I got involved with the Subversion project that Spring. That was also run as a very egalitarian Open Source project, very similar to how we run stuff at Apache. I was really the only Apache person, but Karl Fogel just knows how to run a great community, and so all those values that we cherish in communities at Apache were part of Subversion from day one, but was run by CollabNet. I was hired in 2001 to manage their development team. Eventually, CollabNet wanted to turn it into a vendor-neutral thing that wasn't only CollabNet, so they started a small LLC called the Subversion Corporation. Once the IP was transferred to the Subversion Corporation, people said, "Okay, let's move to Apache," because nobody wanted to deal with the overhead of the Subversion Corporation. We approached Apache at the end of 2009, and Subversion became Apache Subversion. I was the first VP for that. I think that's the only VP hat I've worn.

In 2001, I was elected to the Board at the Members meeting, and in 2002, Roy decided to step down as Chair and said, "Oh, Greg should be Chairman." He just kind of threw me under the bus that way, but I agreed, and that's when I became ASF Chairman. I was chairman until 2007, which is the longest-running chairman. I think Brett Porter did four years.

I think it was 2009 when you hosted us at the Harvard Club and Doug Cutting was appointed Chairman, but he said he didn't really want to travel, do much press stuff, or be a face of Apache. Roy came to the rescue, threw me under the bus again, and said, "Greg can be Vice Chairman, and we'll have the Vice Chairman do all that stuff”. So I held the Vice Chairman role until September 2016, when I gave up my director position, the Vice Chairman position, and VP of Subversion, because that's when I became Infrastructure Administrator.

Over the years, I did a bunch of volunteer work for ASF Infrastructure. I helped out with what we call AP mail: adjusting moderators, changing aliases, things like that. So I've had AP mail access for quite a while when I was doing that. Upayavira wrote id.apache.org for people to review their Member records, change their passwords, etc. I helped him with some of that stuff. That was all written in Python, so I was able to help out.

Python before Python was popular.


I've been using Python since 1995, and I've contributed to Python itself. We set up the Python Software Foundation in 2001. When I say “we”, I mean myself and Dick Hardt from ActiveState. We took the Apache bylaws, and added a different class of membership to it so that companies would become... I forget what we called them, like corporate members or something. The normal people were called nominated members, as they were nominated by somebody else and voted in. But this gave corporations a vote at the table on the board and anything else that members would get a vote on. So the core of the Python Software Foundation came from Apache.

Back to ASF Infrastructure. In 2016 we had four people on staff in Infrastructure, and our volunteer VP of Infrastructure didn't have enough volunteer time to be able to provide support and management for those four people, plus we wanted to hire two more people. With six people, he was right out. So we spent a lot of 2016 trying to figure out how to create a “manager” for the Infra team. At the time the idea of an “executive director” type position was also thrown around, but a full-time position to manage four or six staff is completely overkill, and we certainly didn't have the budget for a full-time position. Somewhere around late August, I realized that there was an email that Ross (former ASF President Ross Gardler) sent and I thought, "I can do that. That's a half-time job. I'm certainly happy to do it. I've managed engineering teams before.” Now, infra's not an engineering team, they don't really develop products, but it's pretty close to engineering management. At a minimum, it's personnel management, which I've been doing since the '90s.

So I threw my hat in the ring. Ross ran it by the Board and the team, and nobody raised a strong concern, so in his authority as President, he went ahead and hired me half-time. It was the day of our Board meeting --I resigned all my positions, and we appointed my replacement for Vice Chairman and my Director position that day, both of which I believe were Sam. He filled my role as Director, and I started as Infrastructure Administrator.


What does “Infrastructure Administrator” mean? What does it entail: are you hands-on coding solutions like the rest of the team? Are you solving problems? What do you do?

I chose the title because I didn't want to be called “manager”: I didn't want to feel like I'm the boss. I wanted to help with the administrative side, make sure the guys get paid, deal with the invoices, handle what you might call back office kind of stuff, and let the team focus on what they do best, which is the system administration. (ASF Infrastructure Team Member) Daniel Gruno does some development work in addition. I do a little bit of development work. For me, it's more like where in my hobby time I might work on Subversion, but now my hobby time is coding Infrastructure type stuff, so it's not really part of my work duties. I deal with salaries, raises, bonuses, getting the payroll done, and for our contractors, getting them paid. I also deal with third party contracts for things like Travis CI, for lists.apache,org ... that's with PonyMail. I make sure that our vendors get paid, and our contractors and employees get paid.

How was the Infra team structured, and how many are in the team?


We have five full-time people that work on Infrastructure: all five are system administrators. Daniel Gruno does maybe 30% system administration and 70% tool development. We don't develop any products, because we're not an Apache community. We write tools, but don't actually develop any products. This is why PonyMail is in the Incubator: it was originally written by one of the people on Infra, but we didn't want to run that as an Infra community. With only five people, we don't really want to be a community lead or anything like that.

The joke is if somebody wants to move into my position, they lose half their salary, because my position is part-time. It's not really a promotion: it would be a loss to do anything. So unlike a corporation with 10,000 people on staff, career development is a little more difficult. It's really a job for people that enjoy Apache and enjoy our mission, and also enjoy working with the other people on the team.

Who does ASF infra serve?


Our primary users are all the communities at Apache. We've got over 200 communities, and those are the primary users. I don't like calling them “customers”, but in a corporate world, they would probably be considered our customers, and we serve those users. There's 8,000 people with accounts that are working on different projects, but the user base is way, way larger than that, because people can file JIRA tickets and work on the wiki and do things like that without actually being an Apache Committer. So the user base is even larger. Then you start looking at all the people subscribed to all of our mailing lists, and that number goes even higher. There's probably 10% of our work which is also supporting the administrative side of the Foundation itself.

For the Board, your role in PR, and Trademarks, and Legal, and the office of the President and various other operational type stuff, we spend 5-10% of our time. A lot of what we do applies naturally across all of the user base, because the foundation uses the same tool set as our communities. Subversion, mailing lists, JIRA, Confluence, etc. We help with account creation, the LDAP management, what sort of permissions people have to access different things...

One of the neat things that we've done, and I've actually had a couple of communities ask us about it, is our GitBox setup where our projects can use GitHub. But then we also mirror all that source code back to Apache so that we have a copy of it for provenance tracking. And in case GitHub does something dumb, we have our own copy of the code. Any changes made on GitHub get sent to our mailing list or get mirrored into JIRA. Our projects can see all the activity on GitHub, and it gets mirrored into our mailing lists where we prefer that our community work is performed.

That's actually a pretty cool feature that we've done at Apache.


It's interesting to see communities outside of Apache that emulate structures and processes and solutions that the ASF has created. It's cool to see it even happening on an infrastructure level. How does ASF infra differ from other organizations or other open source foundations?

Most of them don't really have teams. Most projects out there do their work on GitHub, and don't have their own source control. They don't have account management, they don't run mailing lists. We do all this stuff that most Open Source projects just don't deal with.

They also don't have the scale that we have.


Yes. Because they're one project, and we have over 200 projects. Most projects have some repositories hanging out on GitHub or on GitLab, or wherever else that they might host: if somebody wants to run a demonstration of that project, they buy their own virtual machine and AWS, and pay that out of pocket. At Apache, all of our projects can have virtual machines hosted by Infra, where they install their software for demonstration purposes. They can point people at that VM, so they can check out the product in live motion. So that ability to run VMs is also pretty unique to the Foundation. When you look at the Linux Foundation or the Eclipse Foundation, those are a little bit different. They're not a charitable organization like us. They're a 501(c)(6), which is really like a trade association.


Like a consortium.


Yes, a consortium. I believe that they do have infra teams, but their business model is quite different from ours. If you look at Mozilla, they have the Mozilla Foundation, but that's kind of a shell; Mozilla Corporation essentially runs everything, and the foundation is like a legal shell wrapped around the corporation.

You mentioned earlier that we have 200 projects: you're referring to 200 Top-Level Projects (TLPs), but we also have sub projects and initiatives. At Apache, we have more than 350 different activities going on --you guys touch all of those. It's not like there's any aspect of ASF that you're not involved with or you're not supporting.


That's correct. And I say 200 because I'm thinking mostly from a TLP thing.


Irrespective of the existence of sub projects, you're still dealing with other communities and projects: there's more than just the 200. Hats off to you guys. It's quite a lot of work.

We've got about 200 different machines and each one runs something different. Some companies have 50 copies of a machine that they'll start up in the cloud, running some container --we never do that. Each individual machine is configured one by one and they're all different. And so 200 machines to support the 350 initiatives. It's a lot of heterogeneous work and that can be kind of distracting, but it's also very interesting because we do support such a wide variety of stuff for our projects.

There's what, five Infra team members, and we have 350 projects and initiatives going on. That's a lot of stuff happening: is it non-stop?


Yeah, it's nonstop. That's why we went from four to six people, we were sort of treading water, but we weren't really able to move forward on a number of our longer term initiatives. So when we went to six people in November 2016, that made us a lot more hands-on, if you will. That meant that we could actually make some progress on this longer term work that we wanted to accomplish. Some of that is like
https://selfserve.apache.org/ , where people can get things done instead of filing a JIRA ticket and having us do the work for them.

Is that popular? Do people use it?


Oh, absolutely. When somebody opens a JIRA ticket to say, "Can I have this Git repository?” or “Can you create a JIRA Space for me?" we close the ticket and say, "Go to selfserve.apache.org". Before, where everybody would file a ticket for a Git repository or file a ticket for JIRA, file a ticket for Confluence, or whatever, we just close them all down now, and they use selfserve.apache.org instead. We simply won't do those things anymore. So selfserve.apache.org is actually quite handy. And then about four months ago we've added a feature called asf.yaml: it allows communities to control a lot of the finer grained aspects about how their repositories are used, like how do they publish Web pages from a repository, or if you make a change, where does the commit email go? Which mailing lists? Does it go to their development list? Or do they have a commit list? If somebody opens a PR on GitHub, where does notification of that go? Those used to all be tickets also, but people can deal with those just by editing a file in their repository now. So again, it reduces tickets and that's our goal where these routine tasks that all the different communities want to perform, we want to move those into a self-served mechanism so that we don't need hands-on all the time. And thus, we can support 350 different initiatives.

That's great to help empower the communities to take care of their own needs, whether they're minor or major, but that also encourages autonomy. So that's really helpful for you guys: you don't need to have a team of 40 people to support the day-to-day.


We do stay busy. You're talking about the influx and we get requests from people through email, through our Slack channel and through JIRA. Of course our monitoring system will tell us when something goes down, so our monitoring systems also give us more work to do, so it is kind of an endless string of queries. Depending on what the task is, each of those different channels is appropriate. For a quick task, hitting us up on Slack is totally fine, but if there's going to be several days of work, we like JIRA tickets so that we can track the work as it progresses.

How do you encourage the team? How do you keep them motivated? What were your challenges with such a huge load to carry: how do you keep everyone going?


One of the big benefits that we have for our team is actually that we're all remote, so we all sit on a Slack channel. We have a team-only channel that we use for communicating, "What's going on? What beer are you drinking today? What are you having for dinner?" I think about my days when I worked at Microsoft or at Google where I sat in the office by myself and it's a very individual experience that I used to have, but now, our team is there all the time on our channel. It's a very social experience: I think that makes for a much tighter team. And it provides a very different experience than what you get at a more “normal” company. That sort of team experience really helps keep people motivated.

People enjoy their jobs more. From a management standpoint, I can certainly say, "If people are sitting there talking about what they're going to make for lunch, there's a drag on the team and maybe we're not seeing the highest productivity possible," but I think that would actually run the counter. Our team is actually more productive as a result of this great team bonding. We have a conference call once a week for 30-60 minutes. And we don't really have to: the team knows what everybody's doing because we're all doing it right in front of everybody. We all get the commit messages. We have our Slack channel. We see the changes to JIRA. We know what each person is doing, but having the call actually gives us a chance to speak to another human so you're not working in your basement all day without any human contact.


We actually have that once a week, if you will, forced human voice contact.

Did that evolve organically? Or was that something planned?


The team was already doing weekly status calls. When I started, I said, "We're going to keep doing that. We're not going to switch out for just, you know, a status email or anything." Before I started, I think they were doing a group edit on a status Web page or something. I don't know if they had calls, but today I mandate the call because I want the team to get together. We've also been doing the group get-togethers at ApacheCon. We got together at ApacheCon Miami, and then the next year in Montreal. Last year we skipped the whole conference format and just got together as a team in New Orleans for four nights.

It was great because it was just us without the distractions of the conference. The conference is good because the INFRA team gets to meet the people that are their users, their customers, the people that we're actually trying to support, all those communities. And the people in the communities get to meet the team. You know, the people that asked, "Can you help me with X?" They get to put a face to those names.

There are times where one of the guys on the team will work with somebody in the community for a couple of weeks to track down some problem, get a virtual machine configured, whatever. All you see is a user ID and the kind of tone of their messages, but at the conference, you can actually put a face to that name, to that ID. That’s really good from a team standpoint. With the team bonding, we spent eight hours a day in this giant penthouse suite in New Orleans on the 30th floor looking out over the Mississippi River. It was very cool, it had space and a big dining table where we could all come in and work. And then I would go around the corner to Mothers and pick up—


Oh my gawd: the po boys ...the debris po boys.

Exactly, you know what I'm talking about.


I lived there. So, yes, I know.

It was literally a block away. So that was our lunch. Every day I was going down to the Mothers, getting a big brown shopping bag full of food and bringing it to the room. We did go there and eat once so the guys could get out of the room for lunch, and each evening we would go as a team out for dinner. After dinner, it's like, “OK, do whatever you want. It's New Orleans.” That was a really good team experience. We were set to go to Nashville this year and then, you know, pandemic ensued. So we called it off.


It's funny: I stumbled across your channel on Slack and, if I remember this correctly, someone was talking about grilling a whole steer or something along those lines. You guys deal with a lot of beef, there’s a lot of meat in this group. So ...

In the team channel, there's a lot of stories about food and beer and other forms of alcohol. We eventually created a cooking channel on Slack because there's other people like Ruth (ASF Executive Vice President Ruth Suehle) and Shane (ASF Vice Chair Shane Curcuru) and others who also like talking about making food. We still have a lot of that discussion on the team channel, but we’ve now got a dedicated channel with a larger set of people talking foodie type of stuff, so that’s very cool.

You were also talking about motivation: I work with each of the guys to find out what they're interested in exploring. Whether it's a new tool or a new product or to write a new tool to improve our workflow, it's like, "What are you interested in? Okay, take point on that, do the research, go do the experimenting." So each of the guys has gotten generally one or two long-term projects that interest them that they want to work on.


[END OF PART ONE]