Summary
Server administration is a complex endeavor, but there are some tools that can make life easier. If you are running your workload in a cloud environment then cloud-init is here to help. This week Scott Moser explains what cloud-init is, how it works, and how it became the de-facto tool for configuring your Linux servers at boot.
Preface
- Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
- I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable.
- When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at www.podcastinit.com/linode?utm_source=rss&utm_medium=rss and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app.
- Need to learn more about how to scale your apps or learn new techniques for building them? Pluralsight has the training and mentoring you need to level up your skills. Go to www.pythonpodcast.com/pluralsight?utm_source=rss&utm_medium=rss to start your free trial today.
- Visit the site to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch.
- To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media.
- If you work with data for your job or want to learn more about how open source is powering the latest innovations in data science then make your way to the Open Data Science Conference, happening in London in October and San Francisco in November. Follow the links in the show notes to register and help support the show in the process.
- Your host as usual is Tobias Macey and today I’m interviewing Scott Moser about cloud-init, a set of Python scripts and utilities to make your cloud images be all they can be!
Interview
- Introductions
- How did you get introduced to Python?
- What is cloud-init and how did the project get started?
- Why was Python chosen as the language for implementing cloud-init?
- How has cloud-init come to be the de-facto utility for managing cloud instances across vendors and distributions?
- Are there any viable competitors to cloud-init? [coreos-cloudinit, among others]
- How much overlap is there between cloud-init and configuration management tools such as SaltStack, Ansible, Chef, etc.?
- How have you architected cloud-init to allow for compatibility across operating system distributions?
- What is the most difficult or complex aspect of building and maintaining cloud-init? [os integration, networking, goal of “do stuff without reboot”]
- Given that it is used as a critical component of the production deployment mechanics for a large number of people, how do you ensure an appropriate level of stability and security while developing cloud-init?
- How do you think the status of cloud-init as a Canonical project has affected the level of contributions that you receive?
- How much of the support and roadmap is contributed by individual vs corporate users such as AWS and Azure?
- What are some of the most unexpected or creative uses of cloud-init that you have seen? [https://wiki.ubuntu.com/OpenCompute?utm_source=rss&utm_medium=rss “disposable use os”]
- In your experience, what has been the biggest stumbling block for new users of cloud-init?
- Do you have any notable features or improvements planned for the future of cloud-init, or do you feel that it has reached a state of feature-completeness?
Keep In Touch
- smoser on GitHub
Picks
Links
- IBM – Linux Technology Center
- Cloud-Init
- Ubuntu
- Canonical
- CoreOS
- EC2
- OpenStack
- CentOS
- RHEL
- coreos-cloudinit
- Juju
- Puppet
- System V init
- Upstart
- systemd
- Joyent SmartOS
- DigitalOcean
- IPv4
- IPv6
- Canonical MAAS
- JSON-Schema
- LXD
- Launchpad
- Bzr
- Git
- SUSE
- FreeBSD
- KVM
- Golang
- PrettyTable
- RAID
- ZFS
- LVM
The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA
Hello, and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable. When you're ready to launch your next project, you'll need somewhere to deploy it, so you should check out Linode at www.podcastinit.com/linode and get a $20 credit to try out their fast and reliable Linux virtual servers for running your next app. Do you need to learn more about how to scale your apps or learn new techniques for building them? Pluralsight has the training and mentoring you need to level up your skills. Go to www.podcastinit.com/pluralsight to start your free trial today.
You can visit the site to subscribe to the show, sign up for the newsletter, read the show notes, get in touch, and support the show. To help other people find the show, please leave a review on iTunes or Google Play Music, tell your friends and coworkers, and share it on social media. If you work with data for your job or want to learn more about how open source is powering the latest innovations in data science, then make your way to the Open Data Science Conference happening in London in October and San Francisco in November. Follow the links in the show notes to register. Your host as usual is Tobias Macey. And today, I'm interviewing Scott Moser about cloud-init, a set of Python scripts and utilities to make your cloud images be all they can be.
So, Scott, could you please introduce yourself?
[00:01:27] Unknown:
Hello. This is Scott Moser. I am an Ubuntu core dev employed by Canonical, and I am the primary maintainer of cloud-init. Been
[00:01:40] Unknown:
working on that for, I think, probably going on 7 years at this point. And were you around when the project got started or did you get involved after it was already in motion?
[00:01:49] Unknown:
I started very close to the beginning. At that point, it was called ec2-init, and it did probably about 1% of what it does now. It was very young in its life cycle. It was originally started by someone named Soren Hansen, who has gone on to other things since then. And do you remember how you first got introduced to Python? So I first got introduced to Python largely just as a language that I saw code being written in and had to work with. I worked at IBM in the Linux Technology Center, and so a lot of my job was touching a wide variety of different software that makes up a Linux distribution or contributes to different things. And so inevitably, you come across Python. That's where I got an introduction. I'd say cloud-init is probably my first large introduction to Python, and I think that probably shows in some of the older bits of code that are there. Over time, I think I've improved a little bit, and hopefully the code is better written now than it was and improving always. And you mentioned that the initial incarnation of the project was as ec2-init. So I'm wondering if you can just briefly describe
[00:03:02] Unknown:
what the cloud-init project is and some of the history of how the project got started, and also the reason for Python being the language for implementing it.
[00:03:13] Unknown:
The initial beginning would have just come from the time when Canonical was putting images of Ubuntu on Amazon. And so inside those images, you needed to do a couple things, the simplest of which was just to consume user data that can be supplied to an instance at launch time and react to that. The first thing that was there basically grabbed the user data, and if it started with a shebang, it would execute it. So that went a long way. And then ec2-init came as a Python implementation of what was there, but was EC2 specific. Then at some point during the life span of the project, we added support for OpenStack and different clouds, at which point we decided to take the EC2 portion out of it and make it more generic.
Yeah. So as far as the language and why Python was chosen: at that point in time, it would have been, like, 2009 or 2010, a lot of Ubuntu development was being done in Python, and that is still the case. You know, it's definitely a very strong language choice due to its presence on just about any Linux and just the ease of use and ease of development that comes with Python.
[00:04:43] Unknown:
And so the user data input was already present on EC2, and cloud-init, at that point ec2-init, was used as a way of consuming that. So before the ec2-init project, do you know what the sort of use case for that user data input was for other distributions on the EC2 platform?
[00:05:05] Unknown:
That's a good question. So, honestly, the first images to my knowledge that were put up in any sort of public manner were done by a contractor for Canonical, and I believe that Eric Hammond came up with the idea of just executing the user data if it started with a shebang. I think that was a wonderful idea, and it magically increased the functionality of a stock image to do automation on it. Now what other things people were doing with user data at that time, I'm not really sure. I think, you know, Amazon designed it as just, I believe it's limited to a 16k binary blob, and it was just kind of a way for you to feed data to the instance, and then the instance would decide what to do with it. To my knowledge, though, the first user of that was Ubuntu's images, at least to consume it as a shebang script.
[00:06:10] Unknown:
And so since the time that cloud-init was introduced, it has grown to become the de facto utility for managing cloud instances across vendors and distributions, as far as being able to consume that user input at boot time. So I'm wondering if you can share a bit about how that journey has progressed and how you think cloud-init has come to be that de facto standard.
[00:06:34] Unknown:
So with cloud-init initially, just executing user data that started with a shebang got you so much for so little effort, and it was really just a fabulous idea. But then over time, we wanted to be able to do some more things and wanted people to be able to configure Ubuntu or another operating system in a more declarative manner. So the first thing that was added, I think, would be just the ability to say these packages should be installed and give a list of packages. And then cloud-init got some function to be able to write files: you could declare some files, their path, their permissions, and their mode, and then cloud-init would write those down. So over time, it just kinda grew in function and hopefully also usability. Now as far as how it became the de facto standard, I think the largest contributor to that was cloud-init's presence in Ubuntu and Ubuntu's success from a web 2.0 operating system perspective.
Basically, it being so popular to use on Amazon initially and then on other clouds. So, you know, people chose to pick Ubuntu, and then by that nature they were able to use cloud-init. That has also helped cloud-init get into other operating systems, your CentOS or RHEL, because people were familiar with it on Ubuntu.
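For reference, the declarative style Scott describes looks roughly like the following cloud-config user data (a minimal sketch: the `packages` and `write_files` keys are documented cloud-init modules, while the package names and file content here are just examples; user data that instead starts with `#!` is executed as a script):

```yaml
#cloud-config
# Install a list of packages on first boot.
packages:
  - nginx
  - git

# Declare files to write, with path, permissions, and content.
write_files:
  - path: /etc/motd
    permissions: '0644'
    content: |
      This instance was configured by cloud-init on first boot.
```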
[00:08:22] Unknown:
As far as you're aware, are there any other viable competitors to cloud-init that other distributions are making use of?
[00:08:29] Unknown:
CoreOS came out, what, maybe 3 or 4 years ago at this point, and they looked at using cloud-init but were unhappy with some facets of it. And so they chose to implement one of their own, and they chose to do so in Go. So there is a coreos-cloudinit, which consumes a portion of the cloud config that cloud-init does. And then there's another similar project that works on OpenStack. It has a smaller footprint than cloud-init does and smaller functionality, but basically just enables the OpenStack data source for users to interact with. That name is eluding me at the moment.
[00:09:16] Unknown:
Well, if you come across it, we can add it into the show notes afterwards. So a lot of the use case for cloud-init is to be able to bring your cloud instances to a particular known state during boot time, which is also a large portion of what configuration management tools such as SaltStack, Ansible, Chef, Puppet, etcetera, are used for. So I'm wondering how much overlap you've seen between those two sorts of tooling, and how much of a crossover there is as far as people using cloud-init to bootstrap their configuration management or vice versa?
[00:09:51] Unknown:
Yeah. So my initial intent for cloud-init was that it was in fact init, that it would be a way to initialize your instances and kinda get them to the next thing. So I think one of the early contributions from someone other than myself to cloud-init was to enable you to declare Puppet configuration, and then cloud-init would install the Puppet package, lay down the Puppet configuration, and then join in with your Puppet system to let it take over. And that's kinda still, I think, the primary use case, and I really think it's valid: you already have some management system present, and you kinda wanna make use of that. Another user of cloud-init, which is similar, is Canonical's Juju.
When it launches instances, it declares some things to cloud-init, which then help it to integrate with its daemon and go from there. We do have support for Salt and Ansible and Chef, all similarly in cloud-init, where you can basically bootstrap and get to those as a management infrastructure. As far as going past that, we've had requests for cloud-init to do things post boot or to do more life cycle management, and I think those sorts of things will come over time. At all points, I think there will exist other instance management solutions.
And so cloud-init should integrate with those and basically allow you to get to them and not, you know, preclude you from using them.
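As an illustration of that bootstrap pattern, user data along these lines installs the agent and then hands control to the configuration management system (a sketch assuming cloud-init's `puppet` module; the server name is hypothetical and exact options vary by version):

```yaml
#cloud-config
# Install the Puppet agent, write its config, and let Puppet take over.
puppet:
  install: true
  conf:
    agent:
      server: puppet.example.com   # hypothetical Puppet master
      certname: "%i.%f"            # instance-id / fqdn placeholders
```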
[00:11:38] Unknown:
And given that cloud-init needs to be able to run across multiple different distributions and cloud environments, how have you architected the project in order to minimize the amount of rework and ongoing maintenance that's required to maintain that compatibility?
[00:11:59] Unknown:
So I think there are probably two axes of variation in cloud-init. You've got the data source, which in cloud-init lingo is what retrieves information from a cloud platform. So that would be, you know, Amazon or Azure or DigitalOcean or IBM SoftLayer. And then you've got the distribution, which is the other path. The data sources are fairly well separated out. They're not as well architected as I'd like, and we're doing some work to kind of improve how they're used and how generic they are, so we'll have more consistent use of them at some point in the near future.
And then for the distribution, at run time the code that is distribution specific runs through a distro object, and the distro object can handle doing some of the interaction with the operating system, such as installing packages or configuring locale or setting up NTP or timesyncd. That sort of stuff goes through the distro.
[00:13:18] Unknown:
And what are some of the most difficult or complex aspects of building and maintaining the project?
[00:13:24] Unknown:
So I'd say, by far, the most difficult thing has been the initial goal that I set out with, which was that when you launch an instance, you should be able to take a generic instance and turn it into one just like you would have customized yourself, without doing a reboot and without having to restart services. So the goal there was that Canonical was making images and publishing them to different clouds, and we wanted people to be able to use those without having to figure out how to build their own images and publish their own images and deal with all the heartache that is involved there.
We wanted to be able to publish something that would be usable immediately to anybody. So what that meant was that cloud-init needed to run very early in the operating system's initialization so that it could affect later parts of boot. And so that tie-in with the operating system has been complex, especially across the different Linux distributions and different init systems that you have. You know, cloud-init still supports System V init, Upstart, and systemd at this point, and it also runs on FreeBSD, which is more of a SysV init. But integrating with the operating system has been the difficult thing, and guaranteeing the points in boot that I'm interested in allowing people to interact with.
The second thing that's been of real interest is that lately we've had a focus on getting networking support into cloud-init. By that, I mean being able to read the information that is available on the cloud platform and apply it to the operating system. So on Amazon, there's not really a lot of network configuration you can do. You basically can get additional IP addresses or additional NICs, and at this point cloud-init doesn't do a good job of taking advantage of them there. But on other cloud platforms, like Joyent's SmartOS or OpenStack, we get a fairly rich declaration of networking that may include IPv4 and IPv6, or multiple addresses, or IPv6 addresses across multiple NICs with different routes.
And so cloud-init is being changed to be able to read that information and apply it to the operating system, so that the operating system comes up as if it were initially configured in the image. That's been somewhat difficult in that it really stresses the tie-in with the operating system. And honestly, I've been very surprised at the networking solutions on both Ubuntu and on CentOS or RHEL, where we've done this most thoroughly; it's really quite surprising how hard it is to declare networking in a way that will reliably come up. Some issues we've had have been with, like, static routes being declared and when to raise them, and just different interactions with networking that we've come across when we've really pushed automation of it, where things have really fallen out. I think we're taking care of a lot of that stuff.
And at this point, you can fairly well declare networking to cloud-init and have it come up, but it wasn't easy getting there.
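To give a sense of what that declaration looks like, here is a small example in cloud-init's version 1 network config format (a sketch: the structure follows the documented format, but the addresses and MAC are invented):

```yaml
version: 1
config:
  - type: physical
    name: eth0
    mac_address: "52:54:00:12:34:00"
    subnets:
      # One NIC carrying both an IPv4 and an IPv6 static address.
      - type: static
        address: 192.0.2.10/24
        gateway: 192.0.2.1
      - type: static
        address: 2001:db8::10/64
  - type: nameserver
    address: [192.0.2.2]
    search: [example.com]
```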
[00:17:16] Unknown:
Yeah. Networking is always sort of the bugaboo of anybody having to deal with systems, because while there are a lot of really robust standards in place, the actual implementation and interaction of all those different layers of those standards can become quite difficult to understand and reason about. And particularly the implementation of some of the finer points across distributions and across network environments in general is highly unique between
[00:17:48] Unknown:
different environments and different use cases. So Right. Yeah. That's exactly it. For example, in Canonical's MAAS, Metal as a Service, they can declare to the installer, which at this point is passing that through to cloud-init, that they'd like there to exist these three NICs, they'd like two of them bonded together, and multiple IPv4s and IPv6s on them. And, gosh, what's it called, the networking that sits on... I can't think of this. Oh, the VLAN? VLANs. Yeah. Right. So, essentially, you can put a VLAN on top of a bridge, or a VLAN on top of a bond, or on a bridge, and all that stuff.
Basically, anything that you could have modeled, now you can declare in a language to cloud-init, and it can render it into being on Ubuntu and, as much as possible, on RHEL or other operating systems. And it's just amazing how finicky a lot of those networking systems are, especially given that the dominant network operating system in the world is Linux and these are the two platforms I'm trying to interact with. It's amazing how many edge cases we hit.
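The stacked devices he mentions, bonds and VLANs, are modeled in the same format, roughly like this (again a sketch with invented names and addresses):

```yaml
version: 1
config:
  - type: physical
    name: eth0
  - type: physical
    name: eth1
  # Bond the two physical NICs together.
  - type: bond
    name: bond0
    bond_interfaces: [eth0, eth1]
    params:
      bond-mode: active-backup
  # Put a VLAN on top of the bond and give it an address.
  - type: vlan
    name: bond0.100
    vlan_link: bond0
    vlan_id: 100
    subnets:
      - type: static
        address: 198.51.100.5/24
```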
[00:19:22] Unknown:
And your point about being able to pass that information along to the different layers brings up one of the issues that I've seen a few times working both with cloud-init and other systems, particularly where you have a declarative language that you're then trying to interpret into commands to multiple different tools: being able to propagate that information appropriately and mutate the data structures as needed, so that it's easy for an operator to input the information, but it also gets put into a usable form for the end tool, which may need it in a completely different format. So maybe you're writing it in YAML, but you need to render it in XML or something like that. What are some of the ways that you've tried to mitigate some of the pain of translation, both from the user perspective and also within the system?
[00:20:13] Unknown:
In the state right now, where we are, we're lucky in that it's fully declarative, and right now we don't really support user interaction after the fact. And so as long as you can get it right on the first boot, you're okay. I realize that's not a tenable position going forward. We're gonna have to allow for users to append or, you know, make changes to their networking, and cloud-init will have to be able to handle that well. But, yeah, it's been interesting. I'd say the biggest piece of sanity that we've gotten out of this is from another project I work on called curtin. It's the curt installer, so it's basically a fast path installer for Ubuntu. In doing its test cases, we have something called VM test, where we actually do an installation to a system: we can attach multiple NICs and declare whatever networking we want, then we do the installation, boot the system up, and verify that the networking is set up as we expected. Now we don't actually test network traffic at this point.
But we are checking that, you know, the devices get created, the devices are stacked correctly, that basically the system appears to be completely functional. We just don't actually have a VLAN setup that we would run VLAN traffic on to verify it. But it's all done in an automated fashion, so it's really nice.
[00:21:57] Unknown:
And one of the complexities that I've seen while working with cloud-init is the ability to verify your configuration. Generally, that manifests as just putting in the cloud-init information, launching the instance, crossing your fingers and hoping for the best, and then trying to debug any issues that come up. So you end up running through some fairly long cycles of write your config, boot, wait for everything to apply, and then retry if you get it wrong. So are there any ways to short circuit that cycle where you can actually verify things locally before you have to push it to the cloud? Yeah. So
[00:22:31] Unknown:
that's a real pain, isn't it? And that is one of the largest bits of feedback we've gotten, and it was one of the things that CoreOS and their cloud-init implementation really tried to address. There are a couple things that can help in that scenario. We're doing development right now on using Python jsonschema, or JSON Schema, to declare what each of those config snippets can look like. One of my coworkers, Chad Smith, has just been working on adding JSON Schema and validation to some of the modules. So over time, we'll add JSON Schema validation to different modules, with an end goal of being able to have a web page or a command in cloud-init where you can say, hey, validate this as correct input, and get well annotated output. He's done a really good job of, like, annotating line numbers and everything in the output of where you have an error. So that'll really go a good long way to helping with that. The second thing is that that's just declaring that the input is in the right format. Unfortunately, there are more inputs than just a cloud config. Right? There's the data source, or on a given cloud you may have multiple disks attached, or multiple NICs attached; there are more variables than just what is there. And then also, if you're providing something like packages to install or commands to run, there are kinda a lot of extra inputs and a lot of extra potential for error. And so the thing that I've done that is most useful to quickly iterate on those things and kinda shake out issues early has been LXD.
And not to just tout a Canonical product because it's a Canonical product, but LXD, previously known as LXC, is an amazing piece of technology where you can launch and, you know, start a container. It's a full system boot of a container. So just about anything you can do with cloud-init on a public cloud, you can do inside of a container. But the startup time is, you know, measured in less than 10 seconds, and the teardown time and the cost is about zero. You know, you can launch an instance on DigitalOcean and be paying $10 a month and have that as your development system, where you launch dozens of containers inside of it, interact with them, verify them, and then tear them down. So it's a real boon to quick development iteration.
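For that kind of fast local iteration, user data can be handed to an LXD container through its config, for example via a profile like the one below (a sketch assuming an image that ships cloud-init and an LXD that passes the `user.user-data` key through to it):

```yaml
# Example LXD profile, e.g. edited with `lxc profile edit ci-test`
name: ci-test
description: Try cloud-config snippets in a throwaway container
devices: {}
config:
  user.user-data: |
    #cloud-config
    packages:
      - nginx
    runcmd:
      - echo "booted at $(date)" > /var/tmp/first-boot-stamp
```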
[00:25:12] Unknown:
And given that cloud-init has become such a critical component of so many people's production deployments and the mechanics necessary to ensure that they have their systems in a known good state, what are some of the approaches and tooling that you've put in place to ensure that there's an appropriate level of stability and security while you're developing cloud-init, trying to maintain forward progress on the feature set while also preventing any regressions or backwards incompatibility?
[00:25:41] Unknown:
The biggest thing we have going on right now is work on the integration test suite. We've got a contribution from another person, Wesley, who is one of our interns, a fabulously capable college student. He contributed an integration test suite that launches instances, right now on LXD. And so we can get a whole lot of validation of cloud config and other interaction with cloud-init that way, by just launching an instance and then inside of it verifying that things look like they should. We can get a whole lot of coverage of the cloud-init path via that. Now our next steps in development are to extend that to running on different platforms also. So in addition to being able to launch an instance on LXD, to be able to launch one on Amazon or Azure or Rackspace or IBM SoftLayer or any of the other clouds. And then with that integration test suite in place, basically being able to take trunk, pull an image, shove the new cloud-init into that image, publish it to the cloud, and run through a battery of tests on it. So that's the best path we have to sanity there. And there are a lot of variables there. You know, the operating system is moving underneath you, and there are a lot of different platforms. So to be perfectly honest, in the past it's been hard to maintain stability due to the complexity of kinda what's going on there, but that's a primary thing we're targeting and improving. Now for security, cloud-init kinda sits in a place where security is not so important. I don't ever want to imply that security is not important, but cloud-init runs as root and consumes data from the metadata service on the cloud provider. If that happens over a network, and that network is compromised at a point early in boot, you're kinda open to a lot of attack anyway, and there's not a whole lot you could do about it. Obviously, there are things we could come up with to improve it, but cloud-init kind of sits at a point where, if your system is vulnerable to attack that early in an instance lifetime, there's likely to be trouble. Again, I don't wanna discount security as an important thing by any means.
But as we go forward, kinda extending to post boot actions, it will be more and more important.
[00:28:35] Unknown:
Yeah. I definitely agree that given the fact that it is running as root and at the very beginning of the instance life cycle, there's only so much responsibility that you can really take on, and it's important to define the limits of what your commitments are in terms of ensuring any measure of security. So given the nature of the project, it's largely placed upon the user to ensure that the inputs that they're providing, and the way in which they're providing that input, are done in a manner that's going to prevent any sort of exploits. The tool is basically just consuming whatever it's given, and you can't really be expected to prevent users from doing foolish things.
[00:29:14] Unknown:
Right. So cloud-init is, in some ways, an arbitrary command execution platform. Right? By one of its design points. So if you give it arbitrary code to execute, it will happily execute it. Now it is upon cloud-init in many cases to make sure that that code is coming from a source that you intended and not to mess up and pull it from somewhere else. But that's kind of the limits of it right now. And how do you think that the status of cloud-init as a Canonical project and the fact that the
[00:29:49] Unknown:
development happens on Launchpad as opposed to the ever-ubiquitous GitHub, how do you think that has affected the level of contributions and the type of input that you've received as the project has progressed along its lifetime?
[00:30:05] Unknown:
So let's see, a couple questions here. As a Canonical project, I think, you know, for whatever reason there definitely is a set of people who are not interested in contributing to a Canonical-backed project, or may at least initially balk at the opportunity to do so. That said, cloud-init has been successful as an upstream project that has been, you know, accepted into distros, and we've kinda overcome that. I won't pretend that there weren't some hardships as a result, but I think that largely that's accepted now as just something that is. And, you know, we act as a reasonable upstream, and the track record of doing so helps people realize that they shouldn't worry too much about that. That said, at one point in the lifetime of cloud-init we moved, as a result of user input, from being GPL 2 licensed to what is now dual licensed as Apache 2 or GPL v2... I say that, but I wanna make sure... I'm sorry, GPLv3.
So it's now dual licensed under Apache 2 or GPLv3. And actually, I think the license change there really squashed some concerns that people had, and, you know, they felt more compelled to contribute. The second part of your question was with regard to Launchpad; I'll expand on that. Initially, cloud-init was in bzr on Launchpad. And I wanna say that, realistically, if you're a software engineer and you've got a problem with a piece of software or you want to improve a piece of software, and you come across the fact that it's in a revision control system that you're not familiar with, the reality of that stopping your contribution, I think, is small if in fact you were really interested. Right? If you need to accomplish something at work, all sorts of times we come across a new piece of software that we have to interact with, that we have to learn as a result. It's not really a justifiable complaint in my brain; I think it's used more often as an excuse. However, after bzr was kind of marked as end of life, we moved off of bzr and over to Git. I think that change did help, and, I mean, it definitely removed a variable of restraint from people. So we have gotten some more contributions there, I think, as a result of being on Git. And, actually, I have definitely enjoyed being on Git and the freedom and the ability to do some things that were difficult in bzr; I really enjoyed that. Now lastly, the Launchpad versus GitHub argument or question. I think, really similarly, if you're not willing to learn a new piece of software to contribute to a project, then you probably weren't really that interested. I think the same is true for Launchpad versus GitHub. The one initial pain point is that you have to go and set up a Launchpad account. Right? And maybe you didn't have one before, and you already did have a GitHub account. So, you know, there is that barrier to entry.
I don't think that in itself has cost us much. I think the barrier to entry that cost us more was bzr, even though I argued that it shouldn't have.
[00:34:06] Unknown:
And how much of the overall contributions that you do receive are from some of the other corporate users of cloud-init, such as the cloud providers like AWS, Azure, etcetera, versus individual contributors who are trying to fix a particular use case that they're having trouble with?
[00:34:26] Unknown:
Let's see. The breadth of contributions probably spans people that are looking to fix something or come across one problem and wanna fix it. If you went by the number of users, the majority of the people who contribute are probably looking to fix one thing or, you know, just contribute a small thing. But the majority of the work, and at this point still the vast majority of the work, has been paid for by Canonical. But there are definitely substantial contributions both from operating system providers like SUSE or Red Hat or FreeBSD and also from cloud providers; Azure has got contributions in and Amazon has got contributions in. I think the most active contributor at this point outside of Canonical has to actually be Red Hat. The guys who are maintaining cloud-init on RHEL have been active in participating and have done a really good job of contributing in a way that's been good for the overall project rather than just good for Red Hat. They've done a good job of adding unit tests where they're needed or where they add code, and, you know, not just looking to drop a fix for exactly their issue and then run away. So I really applaud their work there. They've done good.
[00:35:55] Unknown:
And what are some of the most unexpected or creative uses of the cloud-init system that you're aware of? Yeah. There's a couple,
[00:36:04] Unknown:
I think, that I've been real happy with. The first was that Canonical made these Ubuntu images, and now other operating systems make these images. I know Debian does, and you can download a cloud image for CentOS or download one for SUSE or build your own. The biggest boon was when I added the NoCloud platform, or data source, where you can download this image, add a second disk when you launch it that has some user data and metadata, and then just launch that locally in your KVM and interact with it. I think that's really useful for people: if you just want to see Ubuntu Server or RHEL and see how it is right now, honestly, the easiest and cheapest way for you to do that might be to just take that path, launch an instance, give it some user data that says set the password to password, boot it up, and log in. And so that's been a big help, and I think helped Ubuntu in general and also cloud-init. The one that really surprised me was that Open Compute has this idea of a disposable use OS, and this isn't something I'm terribly familiar with, but one of my coworkers was involved in Open Compute and was kind of pushing this as, well, this should be cloud-init. And so I believe that on any Open Compute platform, you can kinda boot into something that can be driven through cloud-init. And then, you know, just through its kind of BIOS or its attached disks, it will go into that right away, and that uses cloud-init. And I thought that was a huge win. That was really neat and something I was proud of.
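The NoCloud path he describes comes down to two small files placed on a seed disk or ISO (conventionally labeled `cidata`); a sketch of both, setting the throwaway password he mentions:

```yaml
# meta-data file on the seed
instance-id: iid-local01
local-hostname: cloud-test
```

```yaml
# user-data file on the same seed; insecure, for quick local KVM experiments only
#cloud-config
password: password
chpasswd:
  expire: false
ssh_pwauth: true
```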
[00:37:56] Unknown:
In your experience, what's been the biggest stumbling block for new users of cloud-init?
[00:38:01] Unknown:
Well, I think you called out one of them, which is just being able to verify the user data that's going in without launch, check, relaunch. That's definitely been a big one. And also, documentation has really been lacking throughout the life cycle of the project, though that has been improving over time. We've recently added some developers at Canonical to the project, and now we've got a larger team than was there before. So things that unfortunately got shoved to the side due to other priorities, like documentation did over time, we've got some time to work on those things. And so I think it's improving a lot and will continue to improve. And it's been interesting that as we've gotten documentation written, we've gotten a lot of contributions from users who are just improving docs, which is really nice. I think, you know, Read the Docs went a long way to providing documentation for cloud-init in a place where Google would find it, and having that in the source tree and having people be able to contribute to it with a merge proposal has really helped. So it's been great that we've had a lot of contributions that just improved documentation.
[00:39:27] Unknown:
And have you been happy with the choice of Python to date? And if you were to start the project all over again, do you think you'd make that same decision?
[00:39:35] Unknown:
Yeah. I think I've definitely been happy with Python as a language choice. If I had to start all over, you know, Go is a language, Golang, that people are picking up to do a lot of these sorts of things. I guess with the problems I've had with Python as a language choice, the biggest thing for me was the Python 2 to Python 3 transition and the pain that that involved. I think, you know, anybody who's been active with Python development for any number of years really has been bit by that, and bit by the fact that the two things are close enough but are not the same and will bite you. You know, there are lots of pointy, sharp objects at the end that'll catch you, and I've definitely been caught by my share of them. And the other thing that we've struggled with is kind of the performance of Python, primarily, I think, from startup cost. A lot of that can probably be improved both by cloud-init's code and also by a reduction in the number of libraries we use. Throughout cloud-init's history, whenever there was a feature we wanted to add, as a developer you're kinda given the option to either look for something that exists.
Often, you'd find that there was a Python module to do something, and you then have the choice to either use that or write your own. And, you know, there are reasons for both, but in a lot of cases I ended up choosing to not implement my own and instead take a Python project as a dependency, and that has then cost us in startup time or in footprint, dependencies and things. An example of that would be Python PrettyTable. We just wanted to render something to the console, IP information, and choosing to do that with Python PrettyTable, I think, brought in a fairly large set of dependencies that affect the startup cost of the Python process, just from the number of stats and things that occur. Right? The number of file loads and everything. So that's cost us in initial boot performance, where, hindsight being 2020, we might have made different decisions. And I think that's one of the things that Go is able to improve on with a single static executable; you know, different design points of Go address those issues but bring some of their own. So overall, I'm happy with Python. I think it's completely sufficient for what we needed to do, and a lot of the problems that I've had are probably of my own doing. So I won't point the finger at anyone else; what's the saying, if you point your finger at somebody, there's three of them pointing back at you. So I don't mean to
[00:42:46] Unknown:
badmouth any other project. Yep. Yeah. Every programming language is a foot gun of some caliber. Yes. Are there any notable features or improvements that you have planned for the future of cloud-init that you'd like to call out? Or do you think that it has largely reached a state of feature completeness?
[00:43:02] Unknown:
You know, I think the big thing is that we'll continue to get improved networking support, and that will, at some point, include hot plugging of devices. So if you add a NIC to a running system, then cloud-init will be able to query a data source, query the cloud provider, and say, hey, you just plugged in a NIC, how do you want that configured? And then apply the networking config. So the end result to the user will be that when they connect the NIC, it gets configured in the operating system, which is clearly what the goal should be, but we're not there yet. The second thing is we'd like to be able to do similar things with disks.
On Amazon, you can attach a new volume or attach several new volumes, and I'd like to be able to declare to cloud-init how you want those volumes configured. You know, if you attach a new volume and it's got data on it, well, did you want that mounted somewhere, or what did you wanna do with that? If it was an empty volume, maybe you want to RAID it with the three other volumes that you just attached. So we'd like for cloud-init to be able to take information that says what to do with those devices and set up the operating system so that users don't have to do that sort of stuff themselves. Because, similar to networking being painful, if you've ever tried to set up disks and file systems across RAID or LVM in a reliable way, there are lots of sharp pointy objects there too. And then I guess the last notable feature is really just the extension of the hot plug ideas. Over time, we'll get more post boot actions and be able to respond to the cloud platform in more intelligent ways, and much more do what the user wanted or what the cloud platform wanted to happen.
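For the simpler first-boot cases, cloud-config can already describe partitioning, filesystems, and mounts for an attached volume; a rough sketch (the `disk_setup`, `fs_setup`, and `mounts` modules exist, but the device names here are examples and exact options vary by version):

```yaml
#cloud-config
# Partition an example attached volume, create a filesystem, and mount it.
disk_setup:
  /dev/xvdb:
    table_type: gpt
    layout: true
    overwrite: false

fs_setup:
  - label: data
    filesystem: ext4
    device: /dev/xvdb
    partition: auto

mounts:
  - [/dev/xvdb1, /data, ext4, "defaults,nofail"]
```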
So more magic happens, and the user doesn't have to worry about, you know, configuration of the operating system, but can kinda instead worry about the things they were trying to do, which is
[00:45:13] Unknown:
more important to them. Yeah. Life as a system administrator is full of sharp pointy edges that we're responsible for smoothing over so that the end users of the platform can do the actual task that they were trying to do in the first place. Yes. And
[00:45:28] Unknown:
nobody sets out with the end goal of, well, only a few people set out with the end goal of setting up a RAID that is reliable. You know? Everybody else just wants to use the block device, or use the file system, or run the web service that writes to the file system. So
[00:45:48] Unknown:
Okay. Well, for anybody who wants to follow the work that you're up to and see how you're progressing with cloud-init, I'll have you add your preferred contact information to the show notes. Okay. And so with that, I'll move us on to the picks. And my pick this week is a system that I just set up for using my Emacs editor to manage my email. I used the mu4e library along with the mbsync tool from the isync project for synchronizing my email via IMAP and then managing it via Emacs, because Thunderbird was starting to chew up too much of my RAM. And so far, I've been pretty happy with this setup. I'll also add a link to my dotfiles for anybody who wants to see how I configured it and see if it's something that they're interested in trying out. So with that, I'll pass it to you. Scott, do you have any picks for us? I'll say,
[00:46:37] Unknown:
honestly, LXD is a neat piece of software. I know that it hasn't received a whole lot of usage from people outside of the Ubuntu ecosystem, or at least I feel that that's the case. But it is a really well written piece of software that can help you to accomplish a whole lot of things. The big difference from other container systems is just that it does system containers, which means it runs an actual init inside of there. So systemd runs inside of that, and there are images published for it, and it's really a good piece of software. I don't necessarily very often give such credit to software; oftentimes, like anybody else, you can see the warts, but
[00:47:31] Unknown:
I am really happy with LXD and my usage of it. So I'd suggest other people take a look at it. Google LXD and play with it. Yeah. I've been looking at it as possibly a way to replace my use of Vagrant while I'm testing the work that I do writing SaltStack formulas, so that I can have faster cycle times rather than waiting for VirtualBox to start up an instance and go through that whole process. So definitely one that I've got my eye on as well. Yeah. And actually, that ties in on Ubuntu.
[00:48:01] Unknown:
If you've got a ZFS partition, I mean, anywhere where you've got a ZFS partition, but Ubuntu 16.04's got ZFS built in, if you set it up with ZFS, the launches and snapshots are amazing. It's really, you know, really, really fast. If you don't have a ZFS file system, I think it can also do it with LVM snapshots. But if you don't have that, then the launch of an instance basically requires a copy of the entire file system to another location. So that's gonna be slow, but on ZFS, it's amazing. And you can even use it from a loopback file in ZFS, so you don't even have to think about it ahead of time and say, oh, look, I'm gonna set up a ZFS partition and then do that. It'll just create it all for you. Alright. Well, I appreciate you taking the time out of your day to join me and share the story of cloud-init and some of the ways that it's being used. I've definitely gained a lot of utility from it as a way
[00:49:08] Unknown:
of being able to get my cloud instances to a usable state and be able to work around some of the shortcomings of the other tools that I've got. So I appreciate your work on that, and I appreciate your time. And I hope you enjoy the rest of your day. Thank you. Thank you very much. You have a nice day.
Introduction and Sponsor Messages
Interview with Scott Moser
Scott Moser's Background and Introduction to Python
Overview of CloudInit
CloudInit's Evolution and Adoption
CloudInit vs. Configuration Management Tools
Architecting CloudInit for Compatibility
Challenges in Building and Maintaining CloudInit
Networking Complexities in CloudInit
Verifying CloudInit Configurations
Ensuring Stability and Security in CloudInit
Impact of Being a Canonical Project
Contributions from Corporate Users and Community
Unexpected Uses of CloudInit
Biggest Stumbling Blocks for New Users
Python as the Language Choice for CloudInit
Future Features and Improvements
Contact Information and Picks