Visit our site to listen to past episodes, support the show, join our community, and sign up for our mailing list.
Summary
As technology professionals, we need to make sure that the software we write is reliably bug free and the best way to do that is with a continuous integration and continuous deployment pipeline. This week we spoke with Pierre Tardy about Buildbot, which is a Python framework for building and maintaining CI/CD workflows to keep our software projects on track.
Brief Introduction
- Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
- I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show, subscribe, join our newsletter, check out the show notes, and get in touch you can visit our site at pythonpodcast.com
- Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project
- We are also sponsored by Rollbar this week. Rollbar is a service for tracking and aggregating your application errors so that you can find and fix the bugs in your application before your users notice they exist. Use the link rollbar.com/podcastinit to get 90 days and 300,000 errors for free on their bootstrap plan.
- Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas.
- Your hosts as usual are Tobias Macey and Chris Patti
- Today we are interviewing Pierre Tardy about the Buildbot continuous integration system.
Interview with Pierre Tardy
- Introductions
- How did you get introduced to Python? – Chris
- For anyone who isn’t familiar with it can you explain what Buildbot is? – Tobias
- What was the original inspiration for creating the project? – Tobias
- How did you get involved in the project? – Tobias
- Can you describe the internal architecture of Buildbot and outline how a typical workflow would look? – Tobias
- There are a number of packages out on PyPI for doing subprocess invocation and control, in addition to the functions in the standard library. Which does buildbot use and why? – Chris
- What makes Buildbot stand out from other CI/CD options that are available today? – Tobias
- Scaling a large CI/CD system can become a challenge. What are some of the limiting factors in the Buildbot architecture and in what ways have you seen people work to overcome them? – Tobias
- Are there any design or architecture choices that you would change in the project if you were to start it over? – Tobias
- If you were starting from scratch on implementing buildbot today, would you still use Python? Why? – Chris
- What are some of the most difficult challenges that have been faced in the creation and evolution of the project? – Tobias
- What are some of the most notable uses of Buildbot and how do they uniquely leverage the capabilities of the framework? – Tobias
- What are some of the biggest challenges that people face when beginning to implement Buildbot in their architecture? – Tobias
- Does buildbot support the use of docker or public clouds as a part of the build process? – Chris
- I know that the execution engine for Buildbot is written in Twisted. What benefits does that provide and how has that influenced any efforts for providing Python 3 support? – Tobias
- Does buildbot support build parallelization at all? For instance splitting one very long test run up into 3 instances each running a section of tests to cut build time? – Chris
- What are some of the most requested features for the project and are there any that would be unreasonably difficult to implement due to the current design of the project? – Tobias
- Does buildbot offer a plugin system like Jenkins does, or is there some other approach it uses for custom extensions to the base buildbot functionality? – Chris
- Managing a reliable build pipeline can be operationally challenging. What are some of the thorniest problems for Buildbot in this regard and what are some of the mechanisms that are built in to simplify the operational characteristics? – Tobias
- What were some of the challenges around supporting slaves running on platforms with very different environmental characteristics like Microsoft Windows? – Chris
- What is on the roadmap for Buildbot? – Tobias
Keep In Touch
Picks
- Tobias
- Chris
Links
The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA
Hello, and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. For details on how to support the show, subscribe, join our newsletter, check out the show notes, and get in touch, you can visit our site at pythonpodcast.com. Linode is sponsoring us this week. Check them out at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for your next project. We are also sponsored by Rollbar this week. Rollbar is a service for tracking and aggregating your application errors so that you can find and fix the bugs in your application before your users notice they exist.
Use the link rollbar.com/podcastinit to get 90 days and 300,000 errors tracked for free on their Bootstrap plan. You can also join our community. Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas. Your hosts, as usual, are Tobias Macey and Chris Patti. Today, we are interviewing Pierre Tardy about the Buildbot continuous integration system.
[00:01:12] Unknown:
Yes. Hello, everybody. I'm Pierre Tardy. I'm the current maintainer of the Buildbot project. I've been involved in this project since 2009. The former maintainer was Dustin Mitchell, whom I'm replacing, and I would like to thank him for forwarding this invitation to me. My area of expertise is mainly embedded. At the beginning, I worked for Freescale for 3 years, doing hardware/software integration. Then I was hired by Intel, and now I'm working on the continuous integration team at Intel. And I moved more into tooling teams,
so working on Python tools for the CI teams here.
[00:02:24] Unknown:
So how did you get into Python?
[00:02:27] Unknown:
So, I got into Python during my work at Intel. We were working on a hardware project with an FPGA, and I was leading the firmware on that project and also the testing. We had a colleague who was doing the system work, mainly in MATLAB. At that time MATLAB was pretty much a proprietary system. It was a little bit hard for him to find a community around it, and the language was not so good for what he wanted. So he looked at Python, and then we started to look at this language, and we basically chose it for all the test suites that we had to write for this hardware/software project.
[00:03:22] Unknown:
So for anybody who isn't familiar with it, can you explain what Buildbot is?
[00:03:29] Unknown:
So basically, Buildbot is a CI framework. It's not a UI where you can just install it, point it at your Git tree, and everything is building. You have to do a little bit more work before starting, but then you've got a very important framework. This framework allows you to build something that you really need, something that you really want to do, compared to other CI tools, which have their own workflow. With those, when you go outside of what is already prepared inside the plugins and all that stuff, it gets difficult. So Buildbot is really a framework for that.
It can do continuous integration if you need, it can do continuous deployment, release management, anything that you need to implement. Like Python, it comes with batteries included, so it has a lot of ready-made steps to do a Git checkout, or to integrate with Gerrit or GitHub or a lot of other tools that are used by our community. But again, you can also easily write a new integration for any proprietary tools that you have in your team. It's based on Twisted, so you can use all the Twisted infrastructure to connect to really anything.
[00:05:23] Unknown:
And what was the original inspiration for creating the project?
[00:05:27] Unknown:
So this is actually a good question. Buildbot is quite an old project. It started, I think, somewhere in 2005 or something like this, a long time before I even knew about Python and continuous integration. It was originally written by Brian Warner, and Brian left the project before I was ever involved in it, so I never actually had any contact with the original author of Buildbot. So I cannot really answer the question of what the inspiration was. I think Brian just needed at some point to do some continuous integration, and at that time no tools were really there. So he started to write his own, he saw this fine Twisted framework that existed, and he started to connect build slaves to a build master and went on from there.
Then he published the project on the Internet, the community started growing, and eventually we built a framework together based on that, because Python is so good at building frameworks and building communities.
[00:06:58] Unknown:
And can you give a bit of history as to how you got involved in the project and what inspired you to keep working with it?
[00:07:06] Unknown:
So that was back in 2009, and I had just been hired at Intel. I wanted to work on Android at that time, so we started this project to put Android on our platforms. And, you know, with Android you need to work with Gerrit. There are tons of projects, Git repositories, that are integrated by the Gerrit build system. We were integrating everything manually via Gerrit, and it was quite difficult to actually test the patches that were contributed by other people. So I started to look at how I could automate my job a little bit more. I looked at the CI systems and I saw Jenkins, I saw Buildbot, but neither of them had any support for Gerrit and Repo and the Android ecosystem.
The thing is, I was more confident with Python, so I just chose Buildbot because, if I ever had to write support for Gerrit, I preferred to write it in Python. Eventually this was a good choice, because Buildbot was actually designed so that it's easy to add support for new tools. After that I was working on another project. I left Android for a time and I just contributed my Repo and Gerrit support to the upstream project. About 1 year, 8 months later, we went back to that Android project, and I found out that some other people had taken my work on Gerrit, and the support for Gerrit had improved a lot in Buildbot, so I picked this up and started my CI tools again. I showed that to the management and they liked it. So we started to build a CI team inside our organization, and this grew more and more. And so I'm now managing my CI team here, based on Buildbot.
[00:09:43] Unknown:
Can you explain for our readers who might not know what Gerrit is?
[00:09:48] Unknown:
So Gerrit is, that's a good question, it's a review system. It's actually a review system and a Git hosting system. Like I said, when Google started its Android project they had to integrate maybe 200 different software components in an embedded context. So they had to change those projects, they had to make some patches on them, and they had to be very quick. So they chose an integration-first design, where all the projects are available in the work environment. When you work on Android, you have all the Git projects that will eventually be put in your phone or tablet.
They also needed a review system to host all those contributions. That's why they started to work on Gerrit, which hosts all those Git projects and provides a workflow where developers upload their patch to Gerrit, then there are reviews, and people can say okay, I'm good with it, so I put Code-Review +1. It can also be integrated with a CI system, which then puts Verified +1. When everything is okay, the patch can be merged and is then available in the mainline. So basically Gerrit has similar functionality to GitHub or GitLab or any other hosting and review tool. But the workflow is quite different. It's a per-patch workflow.
When you upload a patch and you get a review, the workflow is to amend your patch and then upload another version of it. It's not like GitHub or GitLab, where you upload a branch, and then when you have a review, you add new commits on your branch and upload that new branch. So you really have 1 patch and then several versions of the same patch inside the review system. And Gerrit has a lot of good UI to manage those different versions, the patch sets.
[00:12:37] Unknown:
Thank you.
[00:12:39] Unknown:
So can you describe the internal architecture of Buildbot and outline how a typical workflow would look?
[00:12:46] Unknown:
Yeah. So Buildbot is basically a job scheduling system, very generic. It consists of a master and several build workers. Basically, the job of the master is to send commands to workers and gather the log outputs, and it puts the logs into a database and puts the results of the commands in a database also. Then, for triggering the jobs, the Buildbot master needs to figure out when to trigger them. So there is this notion of a change source: components that live inside the master and watch repositories (Subversion, Git, even CVS is supported) for changes.
There are basically several options. Either you have a poller that polls the CVS or the Git repository all the time and sends an event if it figures out there is a change, or you can have hooks inside your Subversion repository that call the master, which then takes the actions as needed. And then, when builds are finished, you need to notify your users. There are several possibilities: you can send email, you can have a web page where users can look at the status, and you also have various other status clients.
So that's basically how it works, and eventually there are other complications. For scalability, you can also have several masters, which is something that is possible, and you can also have on-demand workers. There are lots of features beyond this simple description of the architecture.
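The pieces described in this answer (a change source that polls a repository, a scheduler that decides when to trigger, a builder made of steps, and workers that run them) map onto a master configuration file. What follows is a minimal sketch in the style of Buildbot's sample master.cfg, assuming Buildbot 0.9 or later is installed. The repository URL, worker name, password, and test command are placeholders for illustration and were not mentioned in the interview.

```python
# master.cfg (sketch): requires Buildbot 0.9+; every name and URL here is a placeholder.
from buildbot.plugins import changes, schedulers, steps, util, worker

c = BuildmasterConfig = {}

# Workers connect to the master and run the commands it sends them.
c['workers'] = [worker.Worker("example-worker", "example-password")]
c['protocols'] = {'pb': {'port': 9989}}

# Change source: poll a Git repository every 5 minutes for new commits.
c['change_source'] = [
    changes.GitPoller(
        'https://example.org/project.git',
        branch='master',
        pollInterval=300,
    )
]

# Builder: an ordered list of steps executed on a worker.
factory = util.BuildFactory()
factory.addStep(steps.Git(repourl='https://example.org/project.git', mode='incremental'))
factory.addStep(steps.ShellCommand(command=["python", "-m", "unittest", "discover"]))

c['builders'] = [
    util.BuilderConfig(name="runtests", workernames=["example-worker"], factory=factory)
]

# Scheduler: trigger the builder whenever the poller reports a change on master.
c['schedulers'] = [
    schedulers.SingleBranchScheduler(
        name="on-commit",
        change_filter=util.ChangeFilter(branch='master'),
        treeStableTimer=None,
        builderNames=["runtests"],
    )
]

# Results and logs are stored in a database; the web UI shows build status.
c['db'] = {'db_url': "sqlite:///state.sqlite"}
c['www'] = {'port': 8010}
```

A repository hook that notifies the master directly could stand in for the poller; the scheduler and builder sections would stay the same.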
[00:15:13] Unknown:
So you mentioned that it's possible to have multiple masters, I'm assuming, for a high availability configuration?
[00:15:21] Unknown:
Yes. The masters need to talk to each other, so there is a message queue capability, and we have an internal message queue abstraction. For a single master, we have an internal, very simple message queue that is just intra-process. We can also rely on external message queues. For the moment the only external message queue that is supported is Crossbar.io, and we also want to support AMQP, RabbitMQ, that kind of message queue.
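As a rough illustration of what this looks like in a master configuration, here is a sketch of the settings that, as far as I recall from the Buildbot 0.9-era documentation, switch a master from the in-process queue to a Crossbar.io (WAMP) router. The host names, credentials, and realm are hypothetical placeholders, and the exact keys should be checked against the documentation for the version in use.

```python
# master.cfg fragment (sketch). Every master in the cluster would load the same settings.
c = BuildmasterConfig = {}

# Enable multi-master mode (assumed key name, as recalled from the Buildbot docs).
c['multiMaster'] = True

# All masters must point at the same external database; a local SQLite file cannot be shared.
c['db'] = {'db_url': 'postgresql://buildbot:example-password@db.example.org/buildbot'}

# Replace the default in-process queue with an external WAMP router such as Crossbar.io.
c['mq'] = {
    'type': 'wamp',
    'router_url': 'ws://crossbar.example.org:8080/ws',
    'realm': 'buildbot',
}
```

With a single master none of this is needed, since the default queue is the simple in-process one mentioned in the answer.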
[00:16:07] Unknown:
I've never heard of Crossbar.io. Interesting.
[00:16:11] Unknown:
So yes, Crossbar.io is a message queue designed for real-time RPC and that kind of stuff. It's written in pure Python. Crossbar itself is based on Twisted, while the clients can run with either Twisted or asyncio from Python 3. We chose Crossbar in the first place because it's really async, while if you want to support AMQP and RabbitMQ, you need to do that in a thread, because the client libraries are threaded and not asynchronous. Maybe we can talk about that later.
[00:17:09] Unknown:
So I just wanna say one more thing. It's really interesting to me. You mentioned that it still has CVS support. I was kinda shocked to see an announcement from OpenBSD the other day saying that they still use CVS. I can't imagine, with the advent of Git or even Subversion, why a project would still continue to use CVS. But I guess for some people, it's still meeting their needs. So if it ain't broke, don't fix it. Right?
[00:17:33] Unknown:
Yeah. It really depends on the people and the projects, and on how many tools they have. If you have a lot of tools that integrate your release process with CVS, then it's really such a pain to move to something else. We saw that with Python, where, you know, the Python project was in Mercurial and then they switched to Git, and this was such a big question for the community: how do we do that move, and so on. I believe it took a lot of time to make that hard decision: okay, let's change our process and try to gather more contributions by doing so. Because I think now there are few people who actually want to contribute to an open source project which is hosted in CVS.
Right. You know, it's so painful.
[00:18:43] Unknown:
Yeah. It definitely requires a paradigm shift, if you're used to Git, to deal with CVS's way of doing things.
[00:18:52] Unknown:
So as far as we are concerned, we are letting our community support what they need. If there is a CVS community that wants to use Buildbot with CVS, then that's cool, and we will let them make patches and improve the situation. The thing is, we have some unit tests that are testing CVS, but as far as I know, nobody is actually using it, or at least nobody I'm in contact with uses CVS with Buildbot, so I don't know if it actually works if you try to use it. But there are unit tests and it's supported. There is also some other stuff that is fairly old, where we have code and we know that the unit tests are passing, but we know of nobody who is using it. For example, we have a mail poller, which is actually for CVS and other RCS.
CVS has the possibility to send emails whenever somebody changes something in CVS, and we have a change source that watches a maildir in order to generate a Buildbot event when that happens. These unit tests are passing, but we don't know if anybody is using it, and we have no integration tests verifying that this actually works.
[00:20:34] Unknown:
So there are a number of packages out on PyPI for doing subprocess invocation and control, in addition to the functions in the standard library. Which does Buildbot use and why?
[00:20:46] Unknown:
So Buildbot uses Twisted as a framework library to do all its stuff, and it also uses Twisted for process control. The reason for that is the same as for the MQ. Twisted is asynchronous, and we are looking for libraries that are asynchronous to do our stuff efficiently. Okay? If you are using libraries that are synchronous, then you need to execute those APIs inside a thread, and you start to lose all the advantages of Twisted. So that's why we use it. It works great for all the POSIX systems that we support, but on Windows we have issues from time to time. In particular, process termination on Windows is not very stable when we are using Twisted.
And I believe some other process invocation and control libraries have their own hacks to do retries or to have several levels of process termination policy. First you try with one method, and then you try with another method. This is something that we need, and the Windows community can help with that. So we need to improve on that kind of stuff.
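The "several levels of process termination policy" idea can be sketched with the standard library alone. This is not how Buildbot or Twisted implement it; it is a hypothetical helper (`stop_process`) that assumes a plain `subprocess.Popen` child, asks politely first, and escalates only if the child does not exit within a grace period.

```python
import subprocess
import sys


def stop_process(proc, grace_seconds=2.0):
    """Stop a child process, escalating from terminate() to kill().

    Returns the name of the method that ended the process.
    """
    if proc.poll() is not None:
        return "already-exited"

    # Level 1: polite request (SIGTERM on POSIX, TerminateProcess on Windows).
    proc.terminate()
    try:
        proc.wait(timeout=grace_seconds)
        return "terminate"
    except subprocess.TimeoutExpired:
        # Level 2: the child ignored the request, so force it.
        proc.kill()
        proc.wait()
        return "kill"


def spawn_sleeper():
    """Start a child that would otherwise run for a minute (for demonstration)."""
    return subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
```

Calling `stop_process(spawn_sleeper())` returns the method that worked. On Windows both levels end up calling the same underlying API, which hints at why a more elaborate policy is needed there.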
[00:22:23] Unknown:
What makes Buildbot stand out from other continuous integration and continuous delivery options that are available today?
[00:22:29] Unknown:
So that's a very interesting question. Buildbot is a really old project. Like I said, it has been around for more than 10 years, and lately we have seen a lot of new options for continuous integration, especially the hosted ones like Travis, AppVeyor, CircleCI, and so on. Those are really great new systems. As a CI expert, I'm very happy that there are more and more options available for people. When I started with Buildbot, the 2 basic options for open source were really Jenkins and Buildbot.
And it's really interesting because they have very different approaches. Jenkins' approach is really for simple CI requirements. It's very, very easy to start with. There are tons of existing plugins, and you can set up your own CI server in a matter of a few hours, I would say. With Buildbot it's a little bit more time consuming to get started. So this is the negative aspect that I already talked about, but once you have done this ramp-up you are really in the framework and you can implement whatever CI and CD system you want. Flexibility is really our core value. And also scalability: Buildbot is used by huge teams doing hundreds of builds a day, and this scales quite well.
I've heard that Jenkins also had some scalability issues lately. I'm not really sure how they finally managed to resolve them, so I cannot say too much about Jenkins on that point. And then for hosted CI: when I say you can build your own Jenkins in a couple of hours, with a hosted CI that's a couple of minutes. So this is great. But when you want to do things differently from what the framework allows, it's more complicated. Travis CI is really there to implement a workflow that has been designed for Travis.
And that's great, because for me it's better that people use some CI than not use anything. If you are starting with Travis and you need more than Travis, maybe I can suggest that you look at Buildbot. I have also seen some colleagues using Jenkins who eventually want to do more and more, and then it is much more complicated for them to evolve their Jenkins solution. And then the last thing with Buildbot: all your configuration, the description of how your CI works, is written in a Python config file. It can later be split into Python modules, and you can use the power of the Python language to describe your configuration in a very scalable way, in the sense that you can take the same builders and duplicate them from parameters kept in dictionaries or in YAML files, something like this. Then you can put those YAML files under version control, so that as your complex CI system evolves, you know that whenever somebody changes things, they need to commit that change to the Git hosting for the CI configuration.
Then you know what changed, you can debug why it's failing, and this possibility is really built into the project. While if you are using Jenkins, you have all this UI that allows you to change things in a minute, but then there is no traceability of who made the change, why, and so on. So I've heard that teams using Jenkins that try to scale have such issues.
[00:27:58] Unknown:
And 1 of the interesting things that a lot of the more modern CI platforms have introduced is the ability to collocate the job configuration with the code that's being tested or built. And I'm wondering, does Buildbot have any built in mechanism to support that, or would you just use the Python module loading capabilities and just embed that Python job description in the repository that's being built?
[00:28:22] Unknown:
So it's really a matter of policy. People can do whatever they want. If they want to follow the Travis workflow, where you put the way the project is supposed to be built inside your Git repository, then that's fine. We can do that with Buildbot. If you want to keep the way it is built outside of your project, inside the Buildbot configuration, that's possible also. So really we have no strong opinion on that; you can implement either option, whichever is better for you.
[00:29:12] Unknown:
And another thing that you mentioned in there is that, when it comes to scaling a CI system that some of the other options, such as Jenkins have historically had some difficulties in doing so. And first off, I'd like to ask, what are some of the motivating factors that would cause somebody to need to be able to scale a CI system and what are the impacts of having a system that is not very scalable?
[00:29:35] Unknown:
It really depends on the project. From what I can tell, in my team we are supporting a lot of, you know, Android branches, Android projects that can happen within my company, and it's always the same way to build an Android project. But each team, each branch, will want to have some very specific configuration. They don't have the same set of reviewers, they want to use another bug tracking system, or that kind of stuff. So we have a Buildbot configuration that is very parameterized, that has a lot of parameters, and we maintain all our parameters in YAML files. With Buildbot, that's very easy. This is my experience, and I know that there are a lot of other teams within the Buildbot community that are also using this kind of method: having a basic process that is implemented within the Buildbot framework, and then this process is parameterized by a YAML or JSON or .ini file, whatever configuration file you want. When you do a Buildbot reload, it reloads the YAML files and builds the CI system from them. So you really have this flexibility with our system. As the configuration is written as a Python script, you can use the power of Python to auto-generate a bunch of configurations.
So you can build whatever higher level of configurability you want on top of what Buildbot offers. Buildbot offers an API where you configure a builder, and inside that builder several steps, and each step runs whatever commands. But you can build a higher level of configuration upon this framework where, in my example, you have an Android branch, and this Android branch is configured with this, this, and this parameter. Then your high-level Buildbot API will just build the set of builders that implements the process you want to use for this branch.
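A minimal sketch of that higher-level layer, using only the standard library so it can run on its own: a parameter file (JSON here, since Pierre mentions YAML, JSON, or .ini as interchangeable) is expanded into one builder description per branch and target. The branch names, reviewers, targets, and commands are invented for illustration, and the plain dictionaries stand in for the real builder objects a master.cfg would create.

```python
import json

# Hypothetical per-branch parameters; in practice these would live in a versioned file.
BRANCHES_JSON = """
{
  "branches": [
    {"name": "main", "reviewers": ["alice"], "targets": ["phone", "tablet"]},
    {"name": "release-1", "reviewers": ["bob"], "targets": ["phone"]}
  ]
}
"""


def build_commands(branch, target):
    """The shared process every branch follows: check out, build, test."""
    return [
        ["git", "checkout", branch],
        ["make", "TARGET=" + target],
        ["make", "TARGET=" + target, "test"],
    ]


def builders_from_params(text):
    """Expand the parameter file into one builder description per branch/target pair."""
    params = json.loads(text)
    builders = []
    for branch in params["branches"]:
        for target in branch["targets"]:
            builders.append({
                "name": "{}-{}".format(branch["name"], target),
                "reviewers": branch["reviewers"],
                "commands": build_commands(branch["name"], target),
            })
    return builders
```

Adding a branch then means editing the parameter file and reloading, not writing new builder code, and the change is reviewable like any other commit.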
[00:32:39] Unknown:
And going back to the topic of scaling, as we've mentioned, scaling a large continuous integration or continuous delivery system can become challenging. And I'm wondering what are some of the limiting factors in the Buildbot architecture and what ways you've seen people work around them?
[00:32:55] Unknown:
So the main scalability issue you can have with Buildbot, from what I've seen, is the master. In the typical Buildbot environment, you have a single master and then a lot of workers. You can see, obviously, that the master is the single point of failure, both in terms of failure and in terms of scalability. There is also the fact that Twisted is single-threaded and asynchronous. If you are doing some processing on the master, I mean at the end of steps you process the logs to figure things out, or you are doing additional integration with other REST APIs, you have to really make sure that the master is not doing too much.
Because if you are doing a blocking activity in the master, like generating statistics or whatever blocking activity you want, you may block the whole Twisted reactor. During that 1 second when the master is doing this hard computation, the master cannot do anything else. So you have to make sure that you don't do any blocking stuff inside the main thread, or if you are doing blocking stuff, then you need to put it in another thread. This adds some complication, and it is something that people who want to scale need to take care of.
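The effect is easy to reproduce in any single-threaded event loop. The sketch below uses the standard library's asyncio as a stand-in for the Twisted reactor (in Twisted the equivalent offloading call would be `deferToThread`); the function names and the 0.2 second "log processing" are made up for the demonstration. A heartbeat task represents everything else the master should keep doing.

```python
import asyncio
import time


def summarize_log(seconds):
    """Stand-in for blocking post-processing, such as parsing a large build log."""
    time.sleep(seconds)
    return "summary"


async def heartbeat(counter):
    """Represents the other work the loop must keep servicing (workers, web UI, ...)."""
    while True:
        counter["ticks"] += 1
        await asyncio.sleep(0.01)


async def ticks_while_processing(offload, seconds=0.2):
    """Return how many heartbeat ticks happened while the log was being processed."""
    counter = {"ticks": 0}
    task = asyncio.ensure_future(heartbeat(counter))
    await asyncio.sleep(0)  # give the heartbeat a chance to start
    before = counter["ticks"]

    if offload:
        # Run the blocking call in a worker thread; the loop stays responsive.
        loop = asyncio.get_running_loop()
        await loop.run_in_executor(None, summarize_log, seconds)
    else:
        # Run the blocking call inline; nothing else can run until it returns.
        summarize_log(seconds)

    during = counter["ticks"] - before
    task.cancel()
    return during
```

Running `asyncio.run(ticks_while_processing(offload=False))` reports no heartbeat ticks during the blocking call, while the offloaded variant keeps ticking. The price, as noted in the answer, is the extra complication of sharing state with a thread.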
And then if at some point you have thousands of workers you want to integrate, you have to think about multi-master. With the new version of Buildbot, multi-master is much easier. With the Buildbot 0.8 line, you know, some of the results were stored in the database, in a MySQL or PostgreSQL database, but there was also lots of stuff that was put in files on the masters, in pickle, you know. Pickle is Python's way to serialize Python objects. So, historically we used pickle to store everything, then we moved to storing some stuff in the database, and with the new Buildbot 0.9 that is just about to be released, we put everything in the database. Now that everything is in the database, it is much easier to run several masters, because each master can act exactly the same as the others, while with the previous version of the architecture, you had to give each master a set of responsibilities, because the other masters didn't have access to the results of the stuff managed by this master.
So with the new architecture that we have been working on for the last 4 years, this will allow a new level of scalability. And we are very excited about that.
[00:36:41] Unknown:
And are there any design or architecture choices that you would change in the project if you were to start it over?
[00:36:48] Unknown:
Probably if we started over we would start with asyncio. One of the issues that we have is that there is not yet full support for asynchronous libraries in Python. I would say that 80 to 95% of the Python ecosystem is still written in a synchronous manner. It was very astonishing for me to see that even the message queue clients, which are very asynchronous things, very networky, you know, are mostly written in a synchronous way. With Python 3 and asyncio I'm really excited to have a framework inside the core of Python that allows people to see the benefit of asynchronous code and so write their stuff asynchronously.
And, you know, with Node.js, there has been a lot of fluff in the ecosystem. Node.js, you know, they claimed to have invented the asynchronous stuff. Well, you know, with Twisted, you had asynchronous stuff like 10 years ago with Python, but at that time there were only a handful of people who really saw the power of asynchronous code. And then after that, you know, you had Tornado, which is also an asynchronous system written in Python, and libraries written for Tornado were not fully compatible with Twisted. So that's the thing: with asyncio we really have a common base, so that, hey, we should really start writing new libraries for asyncio.
So that is one of the things: if I really had to start from scratch right now, we would write it in asyncio. For other stuff I don't see so much; we are pretty happy with the design we have. One of the flaws that we will be working on is how we work with the new, you know, Docker and Amazon AWS EC2 or Azure and all those cloud services that allow you to start workers on demand. I don't want to be limited to configuring this 100 or this 200 workers. I want it to be auto-scalable, and to manage the auto-scaling in the sense that you have when you are working with AWS and autoscalers and all that kind of stuff. So this is one of the changes that we will have after the next Buildbot 0.9 release.
[00:40:15] Unknown:
If you were starting from scratch on implementing Buildbot today, would you still use Python and why?
[00:40:23] Unknown:
Yeah. This is quite similar to the last question. And yes, we would certainly use Python. I have seen other colleagues who started building CI projects in Node.js or in Golang or in Erlang, that kind of stuff. And that's really nice to, you know, have new languages and try new stuff with new languages. But I'm really amazed how much Python, it's an old but good language, is continuing to hold up. I really like this language, and even if there are the new shiny languages, you know, I've seen those people trying new languages to build new CI or test systems and I've seen them really in trouble, especially with the ecosystem and the libraries. A very simple example is in Golang: some colleagues who have written a test orchestration system started to write it in Golang, and then they started to look at the database options that they had.
And, you know, they had their own infrastructure guys who were very good at deploying MSSQL databases, and in Golang there were no clients for MSSQL. They tried to write one, and it is so difficult to write a new driver for a database. It's really difficult. So that was the main issue that eventually made them go back to Python. And they had to write their stuff again in Python because Python has all those drivers, all that community and libraries. And for me, this is one of the biggest pulls of Python.
[00:42:44] Unknown:
Yeah. We definitely keep coming back to the ecosystem in this show. And as I said before, I have a coworker who really gets grumpy when I call it an ecosystem, but I haven't found a better word to use, so I'm gonna keep using it. It's definitely true. Like, the database drivers, like you say, or some of the really incredible statistical or sort of data science tools that we have. I mean, I can only imagine that building rich metrics based on your builds must be a heck of a lot easier because of all the things that Python brings to bear, and that stuff just doesn't exist yet in other languages.
[00:43:19] Unknown:
That's true. Last year, you know, with the Google Summer of Code, we had a project about adding more statistics, and actually what we did is rather put in a way to push statistics outside of Buildbot. We decided that there were already lots of tools for doing statistics. And so for that example I would say we can also use Python to easily push the data out of the Python ecosystem, so we can still use other ecosystems for the statistics, for example. I can't remember the name of the statistics tools that those guys were using.
But anyway, so there are the 2 options: using Python in order to process your data with SciPy and all the other stuff, or you can also very easily export your data to R or whatever statistics tools you want to use.
[00:44:28] Unknown:
And on the topic of statistics, what kind of metrics and data does Buildbot generate, and how important is tracking statistics in your CI/CD pipeline?
[00:44:41] Unknown:
We mostly write the data on the fly, and you can do a lot of statistics with that data because everything is stored in SQL. So you have several options. Either you can use the REST API to access the data and then make statistics, or you can directly access the SQL database and then do your statistics with that. Or the third option is to make a plugin for Buildbot to export the data into another database. Usually that is what I do in my team. Instead of having a complex statistics system inside Buildbot, we put the statistics system outside of Buildbot, which allows us to more easily integrate with other systems that are part of your process, like the review server, like the bug tracking system, like, I don't know, whatever other database can be part of your system. So I have seen other people doing exactly the same with, you know, the data warehouse model, where you have a system that is collecting statistics from various sources, and then you can run queries on the statistics system to really make the business intelligence that makes sense for you.
So yes, the answer to your question is that there is no statistics framework that is really, you know, a turnkey solution built into Buildbot. But there are options to easily output the statistics that you need into another system.
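To illustrate the second option described above (querying the SQL database directly), here is a minimal sketch using an in-memory SQLite database. The `builds` table and its columns are hypothetical and do not reflect Buildbot's real schema; the only point is that once results live in SQL, per-builder statistics can come from a single query.

```python
import sqlite3

# Hypothetical schema for illustration only; Buildbot's real tables differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE builds (builder TEXT, duration_s REAL, success INTEGER)"
)
conn.executemany(
    "INSERT INTO builds VALUES (?, ?, ?)",
    [
        ("linux", 120.0, 1),
        ("linux", 180.0, 0),
        ("windows", 300.0, 1),
    ],
)


def builder_stats(connection):
    """Return {builder: (build_count, avg_duration_s, success_rate)}."""
    rows = connection.execute(
        "SELECT builder, COUNT(*), AVG(duration_s), AVG(success) "
        "FROM builds GROUP BY builder ORDER BY builder"
    )
    return {name: (count, avg, rate) for name, count, avg, rate in rows}


stats = builder_stats(conn)
```

The same aggregation could just as well run in an external data warehouse after an export, which is the approach the speaker says his team prefers.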
[00:46:59] Unknown:
And what are some of the most difficult challenges that have been faced in the creation and evolution of the Buildbot project?
[00:47:06] Unknown:
So there are several that I can think of. Scalability is the first one, you know, we already talked about that. The second one, which really has been painful, is the API. We have a documented API in order to create builders and steps, like I said, but then people need more. And because everything is written in Python, it's very easy to look at how it's implemented and then use the API that is not API, the internal methods of the objects. And then when the Buildbot maintainers are reworking their stuff and making evolutions, they change this internal API.
And eventually you have people who are building great stuff with Python, who are very happy with it, and then they want to upgrade to have the new functionality and the new stuff. And then it's painful for them because they use those APIs that are not APIs, so really they have difficulties upgrading to new versions of Buildbot. We are really trying hard to document the things that you can call, and if it's not documented you shouldn't use it, or, you know, you should know that you are at risk when you upgrade of having to do work in order to adapt to the new API. So what I would recommend to our community is really, if you need some more functionality, then we should work together in order to make this API, this internal method, public, so that we can commit to making sure those APIs will stay.
And so this is the work that we should do as a community to make sure that the API from Buildbot is enough for people to really build what they need.
[00:49:37] Unknown:
And what are some of the most notable projects using Buildbot and how do they uniquely leverage the capabilities of the framework?
[00:49:44] Unknown:
So, we have a bunch of big teams using Buildbot. You know, you can see that Chrome and Chrome OS are using Buildbot. Mozilla also has some Buildbots. My team also has a big Buildbot environment, but there are also tons, I mean, lots and lots and lots of smaller projects that are just doing small stuff, and they are using the framework because it's very scalable. I can think of a friend from the community who is, you know, a small open source shop doing open source jobs for various customers, and he is using Buildbot to do the CI/CD for all these projects. They are a very small team, and Buildbot is really helping them to manage all those customers at the same time, to make sure that they can deliver changes to their customers without worrying that this will break other things or break prod for their customers.
So we have a lot of people. And the thing is, Chrome, Mozilla, even my team, those are the kind of big teams that really invested in Buildbot, and then they are stuck on an old version of Buildbot. And so it is very difficult for them to evolve because Chrome, for example, has changed the way steps work in Buildbot so that they have more flexibility to run one big script that eventually generates several steps. So this is a very big change that they have. And Mozilla also has its own stuff, and we are trying with Buildbot 9 to take all of that into account to make a better framework, so that if you have complex needs, Buildbot can still be a good solution.
[00:52:22] Unknown:
What are some of the biggest challenges that people face when beginning to implement Buildbot in their architecture?
[00:52:30] Unknown:
So really for me, the biggest challenge for the new user is to start understanding the framework. It's quite complex. There is some stuff that needs to be understood before you can really take advantage of the full power of the system. So while with Jenkins and Travis it's very simple to start simple, with Buildbot it's a little bit more complex. So again, there is a ramp-up phase that you have to go through in order to start with Buildbot.
[00:53:12] Unknown:
So does Buildbot support the use of Docker or public clouds as a part of the build process?
[00:53:18] Unknown:
Yeah. This is something where we have a lot of feedback from the community and a lot of fixes for Docker and for EC2 that we are working on. Like I said previously, there are improvements that we need to make in order to make this even more flexible. But we have Docker working, and it's really, really nice how powerful it is: you can start a worker and choose which Docker image you want to start this worker with, and you can choose according to properties, according to the branch of the project. You can just start Docker on the same machine but with a different Docker image, with different tools installed in that image.
And this still starts in seconds. And so this is a really interesting way of doing scalability while using the same build infrastructure. You know, you have 3 servers managing 70 projects, no problem. Because if this one project needs another Docker image, no problem. You can still run it on your infrastructure, and you can install this version of GCC for that project and this other version of GCC for the other project, no problem, and you don't have to have 2 separate workers for that. It can just run on the same infrastructure, just with a different Docker image. So this part is really exciting.
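The idea of picking a Docker image per build, based on properties or the branch, can be sketched as a small decision function. This only models the selection logic; the image names, the `docker_image` property key, and the branch mapping are all hypothetical, and a real setup would express this through Buildbot's own latent worker configuration.

```python
# Hypothetical image names and property keys, for illustration only.
DEFAULT_IMAGE = "builder-gcc9"
IMAGES_BY_BRANCH = {
    "legacy": "builder-gcc4",
    "master": "builder-gcc9",
}


def choose_image(properties):
    """Pick a Docker image from build properties.

    An explicit 'docker_image' property wins; otherwise the branch decides,
    falling back to DEFAULT_IMAGE for unknown or missing branches.
    """
    explicit = properties.get("docker_image")
    if explicit:
        return explicit
    return IMAGES_BY_BRANCH.get(properties.get("branch"), DEFAULT_IMAGE)
```

With this kind of mapping, two projects needing different GCC versions can share the same hosts and differ only in the image that is started for them.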
[00:55:08] Unknown:
Right. It solves the perennial problem of a Skunk Works project having radically different environmental requirements than the mainline, but they both need to run on the same build machine. So, yeah, I can see this being a huge plus. So what about public cloud support? Does Buildbot support the use of EC2 or Azure or GCE as part of builds?
[00:55:31] Unknown:
Yes, it does. And we currently have a member of the community that is building one these days. I don't know exactly the full details of it, but it looks like they are trying to build quite a big system. From the feedback that I've heard, they are pretty happy with it, they are working on improving the system, and they are willing to contribute new improvements to the EC2 support in Buildbot. So that's also something that people have been using for a long time and that is working for people. So this is working. Yes.
[00:56:18] Unknown:
Yeah. It sounds like a good use case for the Libcloud project if it isn't already part of Buildbot.
[00:56:25] Unknown:
So, that's a good question. For now we have separate code for each of the latent workers that we are supporting. So we have one for EC2, one for Docker, and another for another system that I can't remember. But it would be really great if everything was using Libcloud, because with Libcloud you can have tons of workers that can all be instantiated with the same code. But we don't have anybody yet who has stepped up to the task of adding support for Libcloud inside Buildbot. And this shouldn't be very difficult, I think.
[00:57:21] Unknown:
So as you've mentioned before, the execution engine for Buildbot is written with Twisted. And I'm curious what benefits that provides and whether that has had any influence on your ability to port to Python 3 given the fact that Twisted itself has not yet completed that port.
[00:57:38] Unknown:
For Twisted, it's clear that Buildbot is an orchestration tool. So its main purpose is to just send commands to workers and then gather the logs in a streamed manner. So the logs are just streamed into the master. And you have 200 workers that are just streaming logs one by one. And you don't want to spawn 300 threads to manage that. So it's much better to just have an asynchronous system doing that. So for me, you know, Twisted or asynchronous is really the obvious way of doing orchestration tools. Then for Python 3, we had last year a project for porting Buildbot to Python 3. And actually most of the work was to work with Twisted in order to support more of Python 3. And I know that the Twisted folks really have been working hard on improving Python 3 support. It's not yet done, but you can still write some Twisted applications in Python 3 if you don't need some of those modules in Twisted that are not yet ported.
And unfortunately, Buildbot is using Perspective Broker, the Twisted Spread framework, which is an RPC framework. And this framework is deprecated in Twisted in favor of AMP, which is another, much better protocol. But we haven't yet done the work to transition from Twisted Spread to AMP. So we are still relying on Perspective Broker and Twisted Spread for the communication between the master and the worker. And that is the main blocker for Buildbot to be able to run on Python 3.
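The argument for an asynchronous master (many workers streaming logs, no thread per worker) can be illustrated with a small sketch. It uses the standard library's asyncio rather than Twisted, and the workers are simulated coroutines instead of real network connections, so it shows the concurrency model only, not Buildbot's actual implementation.

```python
import asyncio


async def stream_logs(worker_id, line_count, sink):
    """Simulate one worker streaming log lines to the master."""
    for n in range(line_count):
        # Yield to the event loop, as waiting on a network read would.
        await asyncio.sleep(0)
        sink.append((worker_id, n))


async def gather_all(worker_count, line_count):
    """Collect logs from many workers concurrently on a single thread."""
    sink = []
    await asyncio.gather(
        *(stream_logs(w, line_count, sink) for w in range(worker_count))
    )
    return sink


logs = asyncio.run(gather_all(worker_count=200, line_count=3))
```

All 200 simulated streams are serviced by one event loop in one thread, which is the property the speaker is contrasting with a thread-per-worker design.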
[00:59:55] Unknown:
Does Buildbot support build parallelization at all? For instance, splitting 1 very long test run up into 3 instances, each running a section of tests to cut build time?
[01:00:08] Unknown:
Yes. There are several ways you can do that with Buildbot. You know, the legacy, well, not legacy, the traditional way in Buildbot is to have one builder per thing you want to run in parallel and then have your scheduler, for each new change event, start 3 builders in parallel. So this is the way that Buildbot has been doing it. And then another option is to do job triggering. So there is this trigger step that allows you, inside one build, to trigger further builds. And for me, this is really the more flexible way of handling that, especially since you can do triggers of triggers of triggers.
And then you really can do a very complex CI pipeline with this trigger step. And I have been working a lot on the new Buildbot 9 UI in order to have a better UI for working on this problem. With Buildbot 8, you had this trigger step, but then you had to click on each individual triggered build in order to see the results and see the logs. While in the Buildbot 9 UI, you have the parent build, and inside this parent build you have all the details of all the triggered builds, and you can, from that one page, look at all the details of why it fails and why it takes so much time and that kind of stuff.
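For the case raised in the question, splitting one long test run into 3 parallel pieces, the part that is independent of Buildbot is deciding which tests go to which build. A minimal sketch, assuming a plain list of test names and a round-robin split; the test names are made up, and how each shard is handed to a triggered build is left out.

```python
def shard(tests, shard_count):
    """Split tests into shard_count lists, round-robin, preserving order."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    return [tests[i::shard_count] for i in range(shard_count)]


# Hypothetical test names, for illustration only.
tests = ["test_%02d" % n for n in range(7)]
shards = shard(tests, 3)
```

A parent build could then trigger one child build per shard, and each child would run only its own slice of the suite.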
[01:02:06] Unknown:
What are some of the most requested features for the project, and are there any that would be unreasonably difficult to implement due to the current design of the project?
[01:02:15] Unknown:
I don't think there is any unreasonable feature that we have. There are features that are complex and that need time. One of those is, you know, changing the way the latent slaves work so that it's even more flexible. Another one I can think of is the rewrite of the protocol, which is difficult, and you need to change the core, but it's not impossible. You know, from what I can tell, the architecture is flexible enough so that we can have the features that people need. And so I don't think there are such features.
[01:03:06] Unknown:
So does Buildbot offer a plugin system like Jenkins does? Or is there some other approach it uses for custom extensions to the base Buildbot functionality?
[01:03:16] Unknown:
Yes, it has. We had lots of work that has been done by Sasha, who is the maintainer for the 8 branch. And so we have a plugin system using, you know, the pkg_resources facility from Python, which is a capability to automatically detect which packages have been installed into your Python environment. So there is this capability, and with Buildbot 9 we have extended it also to the UI. So in the UI you can do plugins in order to add new pages or to add new custom parameters that you want to implement. If you want to force a build, you can make a custom UI in order to let users fill in a really complex form for starting builds.
So, yes, we have such capabilities.
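The mechanism described here, detecting plugins from whatever packages are installed, is the entry-point pattern. Below is a minimal sketch using the standard library's `importlib.metadata` in place of pkg_resources. The group name in the usage line is a hypothetical placeholder, not one of Buildbot's actual plugin groups.

```python
from importlib.metadata import entry_points


def discover_plugins(group):
    """Return {name: entry_point} for installed packages advertising `group`."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: entry points are selectable by group.
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9: a plain dict mapping group name to entry points.
        selected = eps.get(group, [])
    return {ep.name: ep for ep in selected}


# Hypothetical group name; no installed package is expected to define it.
plugins = discover_plugins("example.ci.plugins.hypothetical")
```

A package joins a group by declaring an entry point in its packaging metadata, so installing it with pip is all that is needed for the host application to find it; calling `load()` on an entry point then imports the plugin object.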
[01:04:35] Unknown:
That's fantastic. Kinda speaking to Tobias' next question, I know myself, I'm a, you know, infrastructure as code person at my current job and will be at my next job as well. And we do a lot of automating the creation of build slaves, or even build masters. And it's always a challenge with Jenkins because plugins and plugin installation are kind of this bizarre GUI-driven thing. And so having plugins and extensions just be Python packages that you can manage and install just like any other would really simplify things in that regard. That's a great idea.
[01:05:22] Unknown:
I thank Sasha for coming up with this idea. It's really a great improvement. And, yeah, we don't yet have the community of plugins that we'd like. So actually we have 2 plugins. One is in order to describe a set of workers and assign them capabilities. So this is a plugin that has been released quite recently to add capabilities to workers, so that builders can say: okay, I just want slaves that have Windows and also have MySQL installed. Okay? That kind of capability. We also have one plugin that is reimplementing Travis, which is just implementing a process that is loosely inspired by Travis. It doesn't have all the features that Travis has, but it adds a web page where you can configure your projects, where you can say: okay, I need this project and this is the Git URL.
And then it will automatically reload Buildbot and configure this project, add builders, and then listen for events and then look at the .travis.yml file to build the steps that are configured in your project. So this is something that is quite easy to do within this plugin system.
[01:07:17] Unknown:
Managing a reliable build pipeline can be operationally challenging. What are some of the thorniest problems for Buildbot in that regard? And what are some of the mechanisms that are built in to simplify the operational characteristics of managing a CI/CD pipeline with Buildbot?
[01:07:33] Unknown:
So I'm not sure exactly what kind of operational challenge you mean.
[01:07:41] Unknown:
So I'm thinking in terms of actually building and deploying and managing the servers that the system is running on and getting them configured appropriately. So as Chris mentioned, doing this with Jenkins is difficult because it's not very eminently automatable. So I'm curious what aspects of Buildbot simplify that process and what automation in a Buildbot architecture looks like.
[01:08:04] Unknown:
So we tried to really stay focused on our main orchestrator functionalities. And actually Dustin, who is the traditional maintainer for Buildbot, he's an IT and infrastructure guy managing IT for Mozilla, and he knows that there are so many other tools that allow you to manage your infrastructure that it shouldn't be up to Buildbot or Jenkins to add features for that. So we prefer to concentrate on the orchestration part rather than really administrating the infrastructure. So for the infrastructure we just advise people to use Ansible or Chef or whatever infrastructure tools you can think of.
What are some of the challenges around supporting slaves running on platforms with very different environmental characteristics like Microsoft Windows?
So Windows, it's not that difficult because Windows supports Python quite well. There are some difficulties, like I said: when you try to stop a build, sometimes the Windows slave is not so happy and you have some commands that stay around. But apart from that it's not that complicated. One of the challenges that some other people had is automating more embedded stuff, where you don't have a Python environment on the embedded device. So this adds some interesting challenges for those people.
And we have been trying to think about that as we are reworking the master-to-worker protocol: we are trying to make a protocol that people could implement in a very lightweight worker, written in C or in Go or whatever. So apart from that, from the start, one of the best features of Buildbot is really that it can run on all the major PC-like systems, you know, Windows, OS X, Linux. We also have a bunch of FreeBSD slaves, and this is working without issues. And that is also one of the main advantages of using Buildbot compared to, you know, Travis.
With Travis you can also use OS X and Linux, but then there are some complications because you cannot do both at the same time, or if you want Windows then you have to use AppVeyor, and then you have to configure things differently on Windows because AppVeyor doesn't have the same configuration file and all that stuff. So with Buildbot you can really factor out what is common for all those OSes that you want to support, and you have the same configuration file. And whenever you need to have some specifics you can just very simply change the command that you want. You just do, okay, if this is a Windows slave, then I run another command.
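Because the configuration is plain Python, the "same file, different command on Windows" idea comes down to an ordinary conditional. A minimal sketch, with hypothetical OS labels, script names, and commands; it models the step list as simple argument lists rather than using Buildbot's step classes.

```python
# Hypothetical commands and OS labels, for illustration only.
def pick_test_command(worker_os):
    """Return the test command for a worker, varying only where needed."""
    if worker_os == "windows":
        return ["cmd", "/c", "run_tests.bat"]
    return ["sh", "./run_tests.sh"]


def build_steps(worker_os):
    """Shared step list for every OS; only the test command differs."""
    return [
        ["git", "fetch", "--all"],
        pick_test_command(worker_os),
    ]
```

Everything common to all platforms is written once in `build_steps`, and the platform-specific part stays confined to one small function.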
[01:11:44] Unknown:
So, and you'll forgive me if you mentioned this earlier, what mechanism does Buildbot use to communicate with its slaves? Like, it's not SSH. Is it its own, does it use Crossbar, or something else?
[01:11:57] Unknown:
I actually made a proof of concept of using Crossbar to communicate between master and slave. It was working, and then I didn't pursue it, because one of the main difficulties in doing a master-worker protocol is to make it reliable, very reliable: making sure the master knows when the worker is disconnecting, doing retries on disconnection, managing TCP reliability, that kind of stuff. And so, like I said, we have Twisted Spread, and this protocol is very reliable for us right now. So that's also the reason why we don't want to touch it too much. It has some magic in it, it has some, you know, architectural weirdness, but it works very well and it's very stable.
So probably after Buildbot 9 is released, there will be some more focus on improving this protocol. And about SSH, I know that we had some people implementing an SSH-based protocol for embedded, especially because they didn't want to install Python on their actual slave. So what they did is to have another layer: they have a worker that then translates commands into SSH. So this was a little bit complicated, but it worked well for them, so that's what we care about. It's eventually working.
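One common ingredient of "retry on disconnection" in a master-worker protocol is capped exponential backoff between reconnect attempts. The sketch below is a generic illustration of that technique, not a description of how Twisted Spread or Buildbot behaves; all parameter values are assumptions chosen for the example.

```python
def reconnect_delays(base=1.0, factor=2.0, cap=60.0, attempts=8):
    """Yield capped exponential backoff delays, in seconds, between retries."""
    delay = base
    for _ in range(attempts):
        yield min(delay, cap)
        delay *= factor


delays = list(reconnect_delays())
```

A worker would sleep for each delay in turn before trying to reconnect, so a briefly unavailable master is retried quickly while a long outage does not produce a flood of connection attempts.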
[01:14:01] Unknown:
So what is on the roadmap for Buildbot that people should keep an eye out for?
[01:14:07] Unknown:
The main thing that is on the roadmap, and the main thing that we are focused on right now, is the release of Buildbot 9. And, you know, Buildbot 9 is a big project to put all the storage into a database. And then once you say that, you need to rewrite all the web UI because the web UI was so bound to the old, you know, pickle-based API. And people were also complaining that it was so old and, you know, not very shiny. So we decided also to rewrite the full UI as, you know, a modern AngularJS web application.
So now with Buildbot 9 we have a clear REST API that the UI can use, and we also have an event subsystem that the web UI also uses in order to implement a live UI. So whenever you start a build, you can see the build and then everything is updating in real time. So all those, you know, hosted CI systems have implemented that. And I know that Jenkins has some of that, but we really worked a lot in order to have live updates in the core of the UI. So everything is updated live. It's not just jQuery polling on some of the things; everything is updated. If you change the configuration of the builder and you are on the builder page, then you will see that configuration change live. Everything is live. And that was quite an interesting problem. We also had a Google Summer of Code student working on that last year. So we are very excited because this is a 4-year project that is nearly at the point where it's released. We have more and more people joining Buildbot 9 and trying this new stuff. We have already released 8 betas for Buildbot 9, and probably the next release will be a release candidate.
[01:16:55] Unknown:
I think that even just the UI would be something for people to be excited about, but also having the unification of the results data and all of that into the database is a pretty big step forward as well. So definitely something for people to look out for, and I think I'll probably be trying it out if nobody else does. So are there any topics that we didn't touch on that you wanted to bring up or anything else that you wanna cover before we move on?
[01:17:25] Unknown:
No. I don't think so.
[01:17:28] Unknown:
So for anybody who wants to follow you and keep up to date with what you're up to, what would be the best way for them to do that?
[01:17:36] Unknown:
So we, you know, we have the buildbot.net main website where there are all the contacts. We are mainly discussing on mailing lists and IRC, and also having a lot of discussion on GitHub for pull requests. So we are a very lively community and very friendly. This is something that we have feedback on from a lot of people, who are amazed how friendly the Buildbot community is and how it helps the newcomers. And especially as we know that it's difficult for newcomers to start, we really try to be as friendly as possible. We always have people on IRC who can help, and also on the mailing lists we try to answer as much as possible in a friendly manner. And then, yeah, we are waiting for your pull requests on GitHub. It's very easy. You just have to fork the project and then you make your contribution, and we'll be happy to discuss it and to help you get your feature integrated into the mainline.
And we really, really encourage people to upstream their code changes. We know that there have been a lot of people changing the framework and not upstreaming, either because they are too shy to upstream it or because they say, okay, this feature is really just for me and I can't share it because it might not be useful for other people. But please, at least tell us: put a ticket, put a bug into our bug system to say, I need this feature, how do you think I can solve that in a generic way so that it can help everybody? And the people who have made heavy modifications are then having issues afterwards when upgrading. And this is a shame because then they cannot benefit anymore from the new community improvements. And the whole purpose of open source is to be able to be helped by other people and to have the enhancements from the community come to you just by way of an upgrade.
[01:20:30] Unknown:
It's kinda one of the potential weaknesses of the fork model. Right? It can create a million heavily modified forks that never find their way back to the origin project.
[01:20:42] Unknown:
Yeah. And this is probably also an issue for a lot of other open source projects. The thing is, with Buildbot, we have seen that a lot. And there is also the fact that, you know, when you are building a CI system, your main work is not to build a CI system, it's to actually deliver the software you are trying to deliver. So sometimes, you know, with the CI system, people have, you know, 2 months to build it, and then they build it as fast as possible in order to get back to the main system. So we have seen that also.
And so we cannot do too much about that, except trying to make people understand that they have a reason to contribute. It's a win-win situation. You contribute your change to open source, and then you also get the contributions from the other members of the community.
[01:21:56] Unknown:
With that, we'll move it on into the picks. So for my first pick today, I'm going to choose the Viking Double Edge Safety Razor. So I decided to try out a safety razor for shaving with because the disposable razors are kinda cheap and wear out pretty quickly, and they're also kind of expensive to get replacement blades. So far, I've been enjoying it. Definitely recommend anybody who's interested to check out this one. It's got a butterfly opening mechanism, which means you just twist the bottom of the handle and it opens up. So it's really easy to replace the blades. So, yeah, I've been enjoying that. And I will pass it on to you, Chris.
[01:22:33] Unknown:
Thanks, Tobias. I have 2 picks today. The first is a rather novel game for mobile platforms. It's available for iOS and Android. It's called Lifeline. It's pretty interesting, and it will even, you know, support your watch if you have a smartwatch, which I do not. But in any case, basically, there's this astronaut who has crash-landed on a moon and has communicated with you. Somehow she raises you on the radio, basically, and asks for your advice navigating the moon's surface and just exploring and doing various things and trying to keep her alive. So it's basically kind of a modern mobile version of the choose your own adventure except without the cheeky page number references.
It's a lot of fun. And again, just sort of a really interesting, different game concept. I really appreciate that in the modern world of, you know, way too many shooters. My next and last pick is a variety of sake. A friend of mine had a sake tasting party this last weekend, which I really, really enjoyed, and I encountered a variety there that I had never tasted before. It's called Suzaku. The link is in the show notes. No, not notes. Show notes. And it's really smooth. So, basically, it has kind of a sweet, fruity start and a really smooth finish. It's just really delicious, and we had it chilled. I definitely plan to keep some in my fridge for special occasions.
And with that, Pierre, what picks do you have for us?
[01:24:14] Unknown:
I don't know. I don't really have one that I can think of right now.
[01:24:21] Unknown:
No problem.
[01:24:23] Unknown:
Pierre, we really appreciate you taking the time out of your day to talk to us about Buildbot. It's definitely an interesting project and one that I'll be taking a closer look at, and I'm sure a number of our listeners will as well. So thank you for that, and I hope you enjoy the rest of your day.
[01:24:37] Unknown:
Thank you very much. Thank you. Bye bye.
Introduction to Pierre Tardy and Buildbot
Pierre's Journey into Python and Buildbot
What is Buildbot?
Pierre's Involvement and Evolution of Buildbot
Understanding Gerrit and Its Role
Buildbot's Internal Architecture
Buildbot vs. Other CI/CD Tools
Scaling CI Systems with Buildbot
Design Choices and Future Directions
Challenges and Community Contributions
Notable Projects Using Buildbot
Getting Started with Buildbot
Docker and Cloud Support
Build Parallelization
Plugin System and Custom Extensions
Operational Challenges and Automation
Supporting Diverse Platforms
Future Roadmap for Buildbot
Closing Remarks and Community Engagement