The "Inside Infra" series with members of the ASF Infrastructure team continues with Part II of the interview with Daniel Gruno, who shares his experience with Sally Khudairi, ASF VP Marketing & Publicity.
It's a mix. It's a mix of, first of all, the Infrastructure team is paid by The Apache Software Foundation and it's paid by The Apache Software Foundation to help the projects. So what we do must first and foremost be something that helps the projects and not something that just helps Infra.
I mean, of course, we can make tools and have services that will assist us in our work, but the ultimate goal must be supporting the projects. First and foremost, we listen for projects that come and tell us, "We would really like this or we would really like that." Having said that, we do not always say yes. We have costs to consider. We have maintainability to consider. So as a general rule of thumb we will say, "Okay, project A wants to use service foo. Does anyone else want to use service foo right now?"
On occasion, you get, "Nope. No one else wants to use service foo." And then we go back to project A and say, "It doesn't seem like this is feasible for us economically to maintain if it's just you." But you can also have a situation where 10 projects suddenly say, "Yep, we really, really want to use this."
Once you have a trend for something, we are usually not proactive, but reactive to these trends. So a project will come and tell us, "We really want you to use this." We will go out and see if anyone else wants to use this, and they will say, "Yes, please." That's when we'll add that feature or service.
We also have ideas of our own that are, by and large, a result of either existing services not doing what they're supposed to, or they're being... Let's say you have... For example, there is Google and there are mail archives that we had in the olden days. At some point we wondered, "Why don't we combine it so you can search for emails in the archive?" That's how lists.apache.org came to be.
So we have both things that projects come and say, "We really want this," and we also have this crystal ball where we look at problems we're having with existing services, where we look at possible combinations between existing services and other existing services or new services that are emerging in the Web. Or we just have someone say, "Hey, wouldn't it be wonderful if something like this existed?" So it's really a mix of projects asking us and trends emerging and just blue skying, "Wouldn't it be cool if...?"
Have you guys been in the situation where you found yourselves caught where there was this magical trend that everyone wanted, and it just didn't serve the Foundation, it failed? Were you guys in that situation where you had to back pedal? Or is that not part of your experience?
I would say the most prominent or obvious feature or service would probably be GitHub where we started in 2010 with mirrors of our local Subversion and Git repositories. They would be mirrored to GitHub. That was actually a bit later, but around that time, they started mirroring stuff to GitHub, but you couldn't write to GitHub.
We were adamantly against it. Because provenance, provenance, provenance: that is that thing that if you know Apache, you know that provenance is one of our key features. We like to be able to say, "Oh this came from that. This came from this. This came from that."
We had concerns at Infra that we were not able to have the exact --emphasis on exact-- same provenance as we had on our own servers, and we got a lot of pushback for that. In the end, we figured that maybe we don't need this kind of provenance that we had. Because we had very verbose logging going on for our own service that we couldn't get from GitHub because GitHub is a third party provider. They're not going to fork over sensitive data about their customers to us.
So a) we were willing, at some point, to compromise, because it turned out that the data that we had been collecting was maybe not so important after all, and b) we came up with this linking utility that would actually allow us to go in and see who that person committing was on the ASF side. That is, if someone commits with a GitHub account, we can go in and see, "Oh, this is actually this specific ASF Committer," because we have this internal mapping going on with GitBox. And so with that, and then the realization that we didn't need all of this verbose logging, we finally decided that we're going to allow write access, but that was probably... It could have happened a year earlier. A year sooner.
But I wouldn't say that it's a failing, of us, as Infrastructure. I think it's more it speaks of how tenaciously the Foundation guards its core values, one of which really is provenance, because it's the Apache seal of approval, means this has been thoroughly vetted. We know where every single piece of code comes from. And we know that it works.
If you're suddenly letting go, even if it's not really the case, but if you're seemingly letting go of some of those core values, you are going to get pushback because we all, I want to say, love and cherish the Foundation. We all believe so powerfully in its mission that for a moment, we forget reason sometimes and we just push these core values without interpreting them, which is sometimes the right thing to do.
If we have a core value that says, "We need to be able to see where the code comes from." That doesn't necessarily mean we need these five specific points of data from every single user. It just means we need to know where the code comes from. And if that means these four we know, plus this one new one, then that's just as good. That was a bit grandiose, sorry.
No, no, no. There's a lot to it. And I love the angle that you're providing with your answers. That's very different from the other guys' perspectives and that's super helpful. It's important, because that's demonstrative of the diversity of the Foundation. We're people, we're not just machines. And so it's very cool to hear this.
Moving on specifically with our growth, like how do you close your skills gap? Do you do that? Do you rely on the team? How do you cope with stuff that way?
Oh that's a good question. I rely on mentors that I have. I'm not a bookworm, for example. I can't sit to read a book. I can barely watch a movie because I have a very low attention span. So what I'll do is I'll make some mistakes and I'll have some mentors that I have come in and tell me, "You're doing it wrong."
And then I'll fix it or try to fix it and they'll say, "You're doing it wrong again." Eventually, well in a loving way, eventually they'll say that this is right. I love to learn by example. I have a lot of mirror neurons in my brain. I like to copy styles. I like to mishmash styles together. And I love to fall in love with new ways of --this is going to sound very nerdy-- I love to fall in love with new styles of programming.
I recently discovered something called mypy, which is a static type checker for Python. At first I was like, "Oh this is boring," and then I realized, "Oh, I can actually use this for checking whether what I write is going to always work."
Then it changes into, "This is awesome. I love it." Which then changes into, "Everything I've ever written must now be written using this typing hint." And then suddenly I have Greg Stein yelling at me saying, "Yes, this is technically valid, but I really don't need this."
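(An illustrative aside, not code from the interview: the kind of check described here can be sketched with Python type hints. The function names and the login-to-ID mapping below are hypothetical, invented only for this example. The idea is that a type checker such as mypy reads the annotations and can warn, before the program runs, if a value that may be None is used as though it were always a string.)

```python
from typing import Dict, Optional


def find_asf_id(mapping: Dict[str, str], github_login: str) -> Optional[str]:
    """Return the ID mapped to a login, or None if the login is unknown.

    The mapping and both names are hypothetical, used only for illustration.
    """
    return mapping.get(github_login)


def describe_committer(mapping: Dict[str, str], github_login: str) -> str:
    asf_id = find_asf_id(mapping, github_login)
    # asf_id is Optional[str]. Without this None check, a type checker
    # would be expected to flag the string formatting below as unsafe.
    if asf_id is None:
        return "unknown contributor"
    return "ASF committer " + asf_id


if __name__ == "__main__":
    sample = {"octocat": "jdoe"}  # made-up sample data
    print(describe_committer(sample, "octocat"))
    print(describe_committer(sample, "nobody"))
```

(The annotations do not change how the code runs; they only give the checker enough information to catch a missing None check ahead of time.)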
Another mentor I have introduced me to this typing hint. And so I progress by observing other people doing their things and where they and I differ, there are basically two scenarios. Either they're worse than you are or they're better than you are.
If they're better, or perceived better, I will usually try to study, "What are they doing that's different from what I am doing?" and if I like it, it tends to stick like a rash. Then suddenly, it's in everything I do, because I'm not a trained programmer. I never studied programming. I never studied computer science. I studied social economics and then human resource management which is very far from it.
It was always just a hobby thing so I never really learned about unit testing. I never really did learn about proper documentation outlines. And I never really learned this is the correct way to program in this language. This is how you style it. It was always just looking at some examples and then picking the parts that I thought were interesting. So what I initially wrote as a program would work, but it would be very ugly and it would be very error prone. So people would say, this is a cool piece of software, but it's very not pretty. This is what you should do to change it.
So I've relied on people not telling me that I am good or bad, but telling me, this is the difference between what you do and what I do, and then having it be up to me to figure out is this something I want to adopt.
Greg, for instance, has been a tremendous help in that Python department, not necessarily by saying, “you need to do this, you need to do that,” but by writing some examples. Commenting on code, saying, “this could be” --emphasis on “could be”-- “could be this. Or you could use this instead.” Because he's got decades of experience in Python programming, for him to say there's a different, smarter way of doing this, it's not by using words, but by just showing the examples. Because he knows that I pick up on the why pretty easily. I just need to know that the difference exists, then I'll know the “why” eventually, because I'll be very interested in why that difference is. So he just kind of feeds me these little nuggets of this smarter way of doing it. I learned from that and I'm very grateful for that.
Tell me how has ASF Infra changed over the years: is it proactive? Reactive? How and why did this come about? Obviously it's changed, but was it an organic thing? What's your take on that?
It's changed in a lot of different ways and also it hasn't changed. And also: I don't know. It's changed in that it has become more of an obvious hierarchy now, which is not a bad thing. We have a place where the buck stops now: we have a place where decisions are being made. We have, most importantly for someone like me as a staffer, I have someone that I can defer to that I know will take care of it or will be the one with the final responsibility. That can shield us lowly peons when someone is being a bit too grumpy.
That has changed which also means that we, the staffers, are not as abrasive as we used to be. I remember when I joined, the tone was a lot different. This is of course my perception as being this little timid newbie back then, but it was more, every single person had to kind of fend for themselves. Now we've got more of a cohesion. We have yearly gatherings, face-to-face gatherings. We talk about a lot of non-work related items. We have weekly calls that didn't happen before. I guess you can say it's become more of a family now than before because we interact with each other on so many different levels that are not specifically work-related now. It's also made us more friendly.
The change was largely planned. Or it wasn't “planned”, but was planned as a reaction to events that happened --sometimes you come across some things when you're in any given company. We were like, “we need a change”. And this was one of the changes that happened a few years back. Well quite a few years back. Actually, I think this was in Cambridge, not Cambridge, Massachusetts, but Cambridge in England. We had a meetup with our new, at that point, our new Vice President of Infrastructure, David Nalley, and the existing infrastructure team. This was the first event in my lifetime, if you will, of the team. The first face-to-face meeting we had, that was all about “what are we going to do in the future as a team”, where we worked out a lot of policies and work methods that we still use to this day. I'll not go into too much detail about why, but it was planned as a reaction to us being perceived as not the most welcoming group of people.
If you go back 10 years, it was in my personal experience, a lot more daunting asking Infra for something.
Do you think that's because people were just rude? Or was it a matter of them being overwhelmed? Or there was no process? What do you think was behind that?
I think there was not a sense of structure in the team that we have today. People were self-led. We are, let me emphasize that, we are still very much self managing in the team, but we also have a boss and a boss's boss that let us know what they would like us to focus on for the long term processes. We didn't really have that before. It was more fend-for-yourself, figure-out-something-to-do. And if you can't, then that's just “why not”?
We have a lot more structure after the Cambridge meeting. And after David started as VP Infra. Because we had gone from being --I don't know if you know this saying in the US, but there is some difference between a United States and American NATO Secretary General, and a Dutch NATO Secretary General. And that is that one is a secretary and one is a leader. One is a boss and one is a leader. We had a change in the style of management at that point.
It's not that (former ASF President and VP Infrastructure) Sam (Ruby) wasn't doing his job. It's that David added something to the job that wasn't there, hadn't been there before. Sam was doing what every single VP before him had been doing. Which was fine. David came in and saw that there were things that he wanted to improve upon and he improved upon it. One of the outcomes was that, in my view, that the team also became more friendly towards people coming in with issues. But it's also a different environment that we are in now as a team. Apache in the old days, it was strictly volunteers spending their hobby time, doing what they love.
It has slowly pivoted into being people that are paid. They still contribute as individuals, but they are being paid to make those contributions. They are also part of larger teams, often at big companies that have a lot of resources. The expectations and demands on the Apache infrastructure have also increased exponentially as we have become a large organization. So what we are tasked with today is also more demanding. I don't think that the Infrastructure staff of 10 years ago would have had the same interactions, or been on the same good terms with the contributors, as we are today. So in that sense, I think David was gearing us up for what was to come.
David also has a unique perspective because he had come on board in 2012 as part of the Apache CloudStack project. So he came in as an incoming project that also needed support from infrastructure. So he has experience on both sides of the fence, so to speak. You know, Sam has a much “older” experience in terms of him being with the Foundation from a much earlier time period. So it's very interesting to see how the evolution has come about. A lot of us who've been here from the beginning see things a certain way, and don't realize that from an outsider's perspective, that experience might be completely different. It's very interesting to be able to have that balance and have someone come in and kind of make the team more cohesive based on what their perceived needs were and being able to project what projects will be needing in future.
It really is. Yeah. Also, he has a very special way of --let's say he's very “godfather”-like. I don't mean that he kills people! He has a very persuasive non-intrusive way of asking you for a favor that a) I find very endearing, and b) I know why he's using it: because it's very effective. That I don't think a lot of people would get away with. So what that means is we do a lot of things that David asks us because it's David, because he's built up goodwill. It's easier for him to shape the team into what he wants it to be, as opposed to someone that was just there as a secretary and didn't really do anything. If you're not engaging with the team as a boss, and then you suddenly come in and say, do this, there will be pushback. But if you're engaging, if you're there, if you have a presence in the daily routines and the daily water cooler chat, and then you say, "Hey, by the way, what do you think about this idea?" Then you're much more likely to get a positive response back. I think that's one of the things that David brought is a more relatable and more ... let's say he's brought in a closer bond between boss and workers. Leader and workers. And now we have Greg as well. So now we have two of them.
That’s progress in the right direction. What areas are you, meaning Infra, experiencing your biggest growth?
At the moment, that would be continuous integration, which is building software basically. Testing that something builds. Testing that something compiles properly, that it passes these tests. We have six or seven different platforms for that at the moment, and it is using hundreds of machines. And it's never enough! We know we have a demand and we know what the trends are, and we're also kind of blue-skying a bit on how do we solve what's ahead of us. A lot of this is throwing more money at it because that always helps. A lot of it is, again, going back to developing smarter tools that enable us to utilize the resources that we have, because we are not like a big whale. We don't have a cash whale: we don't have that much money.
So we’ve got to make sure that the resources that we buy or lease or rent or whatever, are being utilized to their maximum potential. So that, again, comes back to figuring out how do we go in and monitor. Is it being utilized? What can we do if it's not? What do we do if it's overutilized? Can we figure out where it is bottlenecking? And a lot of other things.
Builds, continuous integration, continuous delivery, I think it's called. That's the place where it's the most growth at the moment.
With regard to CI, what is the most popular platform that you guys are using or what service has the biggest demand?
The most used one is still Jenkins. I think we have 30-40 Travis machines building there, and that's practically nonstop. With Jenkins, we have, I think it's 150-200 machines or something that are building practically nonstop. That's by far the largest platform we have. We are using a lot of Travis and Buildbot. We can always use more of that.
You’ll be talking to Gavin (ASF Infrastructure team member Gavin McDonald), who has been working a lot on splitting Jenkins into smaller components. So that major software categories, for example, get their own platform and bigger projects can get their own platform. This is because we don't want a monolith. We're splitting that up to actually save us some costs and not have so much downtime. He can tell you much more about that.
One of the things we did was graph out how much are we actually using and how much have we been using. Which projects are using the most of these resources? And if there's a specific project that sticks out like a sore thumb with, I don't know, 50% of all the computers or the machines are going towards that project, then we'll reach out to them and say, are you maybe doing something that's a bit too intensive? Can we scale this back a bit? Or do we need to look for a specific targeted sponsor for you or what? We're not constantly, but on more occasions than not are looking at these resource usages and seeing where can we optimize things so as to not use too much money and also not use too little money. Just the right amount.
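(An illustrative aside, not the Infra team's actual tooling: the "which project sticks out" check can be sketched as a simple share-of-total calculation. The project names, the build-minute figures, and the function names are made up for this example; the 50% threshold echoes the figure mentioned above.)

```python
from typing import Dict, List


def usage_shares(build_minutes: Dict[str, float]) -> Dict[str, float]:
    """Return each project's fraction of the total build time."""
    total = sum(build_minutes.values())
    if total == 0:
        # No usage recorded at all: every share is zero.
        return {project: 0.0 for project in build_minutes}
    return {project: minutes / total for project, minutes in build_minutes.items()}


def heavy_users(build_minutes: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """List projects whose share of the total meets or exceeds the threshold."""
    shares = usage_shares(build_minutes)
    return sorted(project for project, share in shares.items() if share >= threshold)


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to show the calculation.
    sample = {"project-a": 600.0, "project-b": 250.0, "project-c": 150.0}
    for project, share in usage_shares(sample).items():
        print(f"{project}: {share:.0%}")
    print("worth a conversation:", heavy_users(sample))
```

(A list like this would only be a starting point for the conversation described above, whether that ends in scaling a build back or in looking for a targeted sponsor.)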
So many companies, as you know, are really struggling with their teams working from home in response to COVID lockdowns and stay at home orders. From day one, ASF has been virtual. I understand that you were stuck in another country when the pandemic lockdowns happened. How did you cope with that? Did anything change with your operations, your work? How were you impacted by that?
Work-wise I was not impacted at all, which is wonderful. We are able to work from pretty much wherever we are. And this was not my first trip abroad, believe it or not. This was in Canada, by the way, I was stuck for 105 days. In the few places that I go to more than just once, I have it set up so that I can work from there in a reliable and comfortable way. By that I mean, I don't like laptops: they're a wonderful invention, but I don't like them. I don't like sitting hunched over a tiny, tiny keyboard without a mouse and looking down instead of straight ahead at the screen.
Luckily I have a laptop and I travel with it all the time, but I plug it into a KVM switch which is a keyboard, video, and mouse adaptor. I have a monitor and a good old sturdy keyboard set up in the places that I frequent. So I was able to work from there as I would with my stationary machine back home, just using my travel laptop. The only difference was the time zone difference. But we do most of our work asynchronously. And whatever firefighting there is, that always just happens at random hours. So it doesn't really matter what time zone you're in. You're going to be screwed one way or the other.
What do you think people would be surprised to learn about ASF infra?
Surprised? I mean, probably that it's only six people. I'm sure, I remember Drew saying this and Chris and so on, but people are often surprised that it's only those five-slash-six people that are doing all the work.
I know you all, and I'm astonished by it. I'm perpetually amazed by that. It is a seriously huge feat.
You want to know what surprises me from the inside? That we actually manage it. It surprises us as well. It's not that “oh yeah we're just that great”. There is something about the coalition and the project that we can't really explain, and it's not explained by the individual parts. It really is the sum of the whole, that somehow…
Huge, huge, huge undertaking. It's massive. And the fact that you guys do it is incredible. And yeah, you know it would take five, six times the number of people to do it elsewhere. So it's very special.
I think we also have a lot more flexibility from up above. From both our boss and his boss and his bosses. They trust that we know what we're doing in a sense that you might not find at a typical company. And I think that is part of the reason why we're able to do the things that we do so efficiently. Because we've been given this trust and we've been given the benefit of the doubt if you will, when we choose to... They trust that we know how to manage our hours and get things done. Like it's not a strict requirement that you be here, nine to five, Monday through Friday. You can be here, I don't know, three hours one day, and then 12 hours next day. Or maybe you want to work on Sunday instead. As long as the job gets done and nothing falls through the cracks, they basically let us get our job done. And I, again, I think this is a win-win situation because it allows for us to kind of cool down when we've burned out a bit, but it also gives them the added benefit that, when I feel like it, I will put in the extra hours, kind of as a “thank you for letting me decide my hours”. So I'm going to put in some more time here and then I'll relax when it's quiet. Because we do get quiet days.
So you all have to carry the load, which is good. There's no favoritism. Everyone has the same shared responsibility --you all have to be on call, for example.
Yeah. It's still quite a flat structure. I don't consider myself senior in a managerial way to any of my coworkers. And so if I were to, or if someone else, if Gavin or Chris or Chris or Drew. If anyone were to say, "I'm not going to be on call," that would create a rift between us.
I mean, there are staff members who wish they didn't have to participate in it, but we all are on call on a rotating basis. And so I think we're fortunate that we're all in a position where we're okay with it.
We were able to manage it because there are legit situations where someone is not able to be on call. I think we all have them from time to time or someone else has had to cover our shift so to speak. All five of us are fortunate that we don't have things going on in our lives where you can't be on call, because those things, they can happen.
Sure, that makes sense. So what's your favorite part of the job?
This is going to sound weird for a lot of people, but my favorite part is the weekly meetings.
Why does it sound weird? People aren't social?
I don't know. It might sound weird to normal people who don't like meetings, that I like meetings. There's something about meetings... Even though they are very informal meetings, I like the little shred of formality that there is about them. And so that's, I think, my favorite part. And also I get to prepare for them.
All right. So you must like preparing for the Board meetings too.
Yes. You should read my Apache Web Server reports.
What was your biggest challenge when you came into the role at Infra?
There were two major challenges. The first one was learning the ropes. This is, as both I have said and a lot of people before me, it's such a complex system at ASF. There are so many things to know and it doesn't just take a year: it takes years to learn enough to get by without someone else's help. So that was, by far, the biggest ... Well, no, that was the second biggest challenge. The biggest challenge was believing in myself and not being scared of doing things unsupervised. Because again, what I can do and what my other colleagues can do with their keyboards is very, very ... We wield a lot of power over a Foundation that is responsible for so much in the world. Not being afraid of typing a command takes a long time when you know what a typo can do on a machine.
You know, “did you just delete this file or did you delete the entire hard drive”? And especially at the very beginning of getting into the job, I would double, triple, quadruple check everything I typed. I would wait for sometimes minutes before I hit enter just to be sure. I would look up whether I was doing the right thing. Just to be sure that I'm not messing things up now.
And as you do it once, twice, three times, 10 times, 100 times, you become more confident and you also relax more. Your first thought isn't “what if this goes wrong?” Your first thought is “let's see what happens next”. Or you're thinking ahead to the next debugging step or the next problem solving step instead of being stuck on what if this goes wrong? Which also means unless something goes wrong, you work a lot more efficiently. Because you're not fearing the Enter button.
What are you most proud of in your Infra career to date?
Let's see. I'm well, probably most proud of ... That's a very difficult question.
That's why I ask it. If you don't want to answer, that’s okay.
Oh, no, no, no. It's just that I've made so gosh, darn many things. What I'm most proud of is probably ... I would say that lists.apache.org is a thing that sprang out of an Infra job I was doing. Yeah. I'd say that's probably the thing I'm most proud of.
lists.apache.org is very powerful. We all use it every day. So that's yours?
With the help of a few friends, yes. It was a brainchild of mine during some tests we had at Infra. And again, it's one of those situations where you have something that's not working and you're like, "Maybe I'll just rewrite it completely and it'll work." And then you start writing and then you find a good idea, a better way of doing some things and some things don't work. And then sometimes eventually you end up with a product that sticks.
It's one of the most long lived projects I've had and that's still used today.
Well, it's super useful. There's no doubting its efficacy and necessity. I mean, how many mailing lists do we have now? 1,700? It's some crazy number.
I think we're nearing 2,500 if you count the private lists.
And that's like 25 million emails, so ...
That's insane. That's very cool. Very cool. All right, next question. This is the one that everyone kind of laughs at. How would your coworkers describe you?
I'll have to think about that.
They would probably describe me as the one that suddenly says something completely out of context.
I thought I was supposed to be laughing, not you.
That is funny. What happens is when I asked the question, Chris, Drew, and Greg, all laughed. Then they give me their answer and I always laugh. So it's classic.
Tell me what you think are the biggest threats that infrastructure teams need to watch out for?
If you stick with what might be good and try it and test it and it's stable, you might end up losing everyone because while it might be that, it might also not be what people are using tomorrow. If it's not what people are using today and tomorrow, then no matter how good it is, people are going to move away from it.
Not necessarily outdated in the sense of technology, but more in the sense of trends.
What is trendy. Yeah. I mean, it used to be Vine. Now it's TikTok and tomorrow it's going to be something else.
If you don't keep up with the fashion of IT, then you're going to find yourself not wanted.
That timing out happens more quickly these days, it seems. Okay, what would be advice to aspiring sysadmins or Infrastructure team members?
My greatest piece of advice is basically don't be afraid because this ties back into the daunting task of having to push the Enter button after you type something in the command line. Don't be afraid because you'll lose so much time just being afraid that you could have spent fixing things or learning new things or making yourself more at ease. Just jump in with both feet and you'll be fine: you're awesome. Yeah, that's good advice.
If you had a magic wand, what would you see happen with ASF Infra?
Oh, interesting. I would like to see us having some magic, unified CI system that could be used across the different repositories and types we have, that didn't require any machines, and that would just build instantaneously. And yeah, that would free us from needing to manage it and pay for it. And also, if GitBox version two was suddenly a thing tomorrow and I didn't have to actually write it, which I still have to do.
Okay. What else do we need to know that I have not asked yet?
Gosh, I don't know. I don't know. One thing I'm really good at or one thing I'm really bad at is when you ask me an open question like that, because I don't know where to go with that. I am very good at analyzing a question and coming up with a specific response, which is why when people say, "How are you doing?" I get confused or I say, "I'm okay." And get nervous and forget to ask them how they are doing, because I don't get the dynamics that are happening there.
The reason why I ask this question is sometimes people come in thinking, "Okay, this is my area of focus." They might want to talk about the “blue switch” or something specific like that, then we talk about all sorts of other things. We may derail. I may be driving the interview in a certain direction, and they have this pain in their gut because they never got to talk about the blue switch that they wanted to.
The only thing I could think of would be something called pip-service, which is a new thing we're making, which is kind of like a package manager for all of our infrastructure services. Again, it's us working smarter instead of harder. And we were defining a way to easily install or run a service on any given machines and set them up without actually having to install and run and then set them up. It would require a lot more time to explain in detail, but it's really nifty.
Is it coming soon or is it available now?
It's in production. And it's really helped us a lot.
I love the efficiency of Infra, how you guys are having these new directions ... Like when you and I were dealing with the selfserve.apache.org the other day for the CMS (content management system), when I was getting the Annual Report up. For 21 years, I haven't been able to deal with the ASF CMS and then you walked me through it in literally three minutes on Slack and boom: it was done. I was amazed and shocked --because I'm not a technologist. To me that was phenomenal. You guys are really helping so many different kinds of people with different profiles and different skill sets. It's very cool.
I think some of that ties into, again, the CMS was cool 10 years or 15 years ago, but it's not really been able to keep up with what's going on at the moment. No one knows how to use it because it's not very intuitive… Or it's not what we do today.
Right. As we’re halfway through the Infra team, who do you think I should be interviewing next?
I think you should be interviewing Gavin because he knows a lot about the CI platforms that I have been on-and-off raving about here.
Gavin's not planning to talk to me until October...
Oh, well then you should talk to Chris Lambertus, because he doesn't want to talk to anyone. (laughing) Chris can talk a lot about the upgrade of our email infrastructure. We have a lot of very tough work ahead of us in that we're upgrading an infrastructure that again, it works, but it's kind of like upgrading from an IBM mainframe to a modern computer: not that much of an upgrade, but we are having to modernize heavily on our Infra email infrastructure.
I understand that's a huge, huge project.
It's a very big project, yeah. That's a little advice for sysadmins there.
= = =