Summary
If you write software then there’s a good probability that you have had to deal with installing dependencies, but did you stop to ask whether you’re installing what you think you are? My guest this week is Professor Justin Cappos from the Secure Systems Lab at New York University, and he joined me to discuss his work on The Update Framework, which was built to ensure that you never install a compromised package on your systems.
Preface
- Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
- I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable.
- When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app.
- Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch.
- To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers
- Your host as usual is Tobias Macey and today I’m interviewing Justin Cappos about The Update Framework, an open spec and reference implementation for mitigating attacks on software update systems.
Interview
- Introduction
- How did you first get introduced to Python?
- Please start by explaining what The Update Framework (TUF) is and the problem that you were trying to solve when you created it.
- How is TUF architected and what led you to choose Python for the reference implementation?
- TUF addresses the problem of ensuring that the packages that get installed are created by the right developers, but how do you properly establish trust in the first place?
- Why are consistent and auditable dependencies important for the security of a system and how does TUF help with that goal?
- What are some of the known attack vectors for a software update system and how do Python and other systems attempt to mitigate these vulnerabilities?
- One of the perennial problems with any dependency management system is that of transitive dependencies. How does TUF handle this extra complexity of ensuring that all of the secondary, tertiary, etc. dependencies are also properly pinned and trusted?
- For someone who wants to start using TUF what are the steps to get it set up with pip?
- How would a project that wants to use TUF do so?
- Who is using TUF and when will it be used with PyPI?
Keep In Touch
- https://ssl.engineering.nyu.edu/
- https://ssl.engineering.nyu.edu/personalpages/jcappos/
Picks
- Tobias
- Justin
Links
- When the Going Gets Tough, Get TUF Going – PyCon 2016
- RPM
- Apt
- Stork Package Manager
- Yubikey
- Distribution Packages Considered Insecure
- Notary
- Flynn
- Uptane
- in-toto
The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA
Hello, and welcome to Podcast.__init__, the podcast about Python and the people who make it great. I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable. When you're ready to launch your next project, you'll need somewhere to deploy it, so you should check out Linode at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for running your app or trying out something you hear about on the show. You can also visit the site at www.podcastinit.com to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch. To help other people find this show, you can leave a review on iTunes or Google Play Music, tell your friends and coworkers, and share it on social media. Your host as usual is Tobias Macey, and today I'm interviewing Justin Cappos about The Update Framework, an open spec and reference implementation for mitigating attacks on software update systems. And Justin, could you please introduce yourself? Sure. My name is Justin Cappos, and I'm a professor at NYU in the Tandon School of Engineering.
And how did you first get introduced to Python?
[00:01:09] Unknown:
I was a graduate student, and I had been working for a while to try to build a package manager. This was actually in the very early days of what we now call cloud computing; VServers and other kinds of virtualized environments had just come out. I had this big piece of spaghetti code that I had written in C. It was messy and it relied on RPM, but it basically let you install packages in these virtualized, distributed environments in a way that saved space and handled some interesting security issues correctly. I came into a meeting one day and was talking to my adviser, John Hartman, about all the problems I was having with the implementation and with using RPM from my C code, and he said, well, I heard about this thing Python and you should give it a try. I thought, okay, he just heard about something and wants me to try it out, but I guess I can check it out. I sat down, I think on a Friday, and went through the tutorial, and I just really loved the language; I kind of fell in love with it right away. So for the week and a half or so after that, I went and rewrote this package manager that I'd been working on, including even all the functionality that I had been relying on RPM for. That really made me love the language, because it was so expressive and easy to use. As I said, this was really the first cloud package manager, the first package manager specifically designed to take advantage of some of the things that the cloud gave you. I was able to do what had been months of work, and not done as well, in a short period of time. And of course, personally, it was very successful for me, because this package manager ended up being the topic of my dissertation.
[00:03:03] Unknown:
Yeah. Package management and dependency resolution is a challenging problem that I tackle on pretty much a daily basis given that I work in the operations space, so I'm working fairly closely with things like Apt and Yum and the various language dependency managers. It's interesting to hear about some of the challenges and some of the more theoretical aspects that go into them as well.
[00:03:26] Unknown:
Yeah. There's a lot more there than I think people expect. We took a look at security early on. Really, the first piece of security work I did came after I had made all these design choices and decisions while working on my package manager, Stork. Once I had done so and needed to articulate clearly the ways in which it was different from existing things, I started to look very closely at their security. When I did, I was actually quite surprised that a lot of the things I had been quite concerned about and had spent a lot of time working on were things that the popular package managers had simply decided not to address at all.
[00:04:05] Unknown:
Yeah. And I guess to get into that a bit: with Apt and RPM at least, when people go to install a package, a lot of the time they'll see something about the GPG signatures for a particular package. Oftentimes people conflate the fact that GPG is involved in the overall process with the idea that there's actually some security afforded by that. So I'm wondering if you can touch briefly on why that is not necessarily the case and some of the ways that can mislead people.
[00:04:35] Unknown:
Sure. So one of the problems you have is that when you trust a key for a package manager, you trust it completely. Let's say you want to go and download a package from, say, the Django developers, and you add the Django developers' key to your keychain; that key will then be trusted to validate any of the software you have. There's no control over which keys are trusted in which ways. That is just a huge problem, because in a lot of cases the set of keys you'll have on a system might be a hundred or a couple hundred keys, so if something goes wrong with any of them, your security is gone. But what's kind of interesting is that, conversely, if I look and see that everything is signed with one key, that also tends to make me much more nervous, because what tends to happen, and this has happened for some projects, is that they keep that key on an online server, like a build server, or even on the repository itself in some cases. And then what happens is that an attacker will go and break in to that server.
And once they've broken into that server, they can do anything they want, like sign custom versions of security-critical packages. This is something that's happened with quite a few different distributions in quite a few different scenarios; Fedora, Debian, and others have had this happen to them.
[00:06:04] Unknown:
And so that leads us a bit into what you are working on with The Update Framework, which from here on we'll probably refer to primarily as TUF, T-U-F. Could you explain a bit about what that is, and also what the problem was that you were trying to solve when you first got involved with creating it?
[00:06:33] Unknown:
Sure. So the real problem that TUF tries to solve, which Stork and really most software updaters and package managers don't even try to address, is that we want to make it so that even if an attacker breaks into a server or steals a signing key, it isn't just a fatal, game-over event where they can now root and compromise all your users. We want the system to tolerate key compromise in a way that gives you a secure way to recover from it in most situations. The general way that TUF does this is through a couple of different techniques. One is the idea of having something called a role. A role means that a key can be used for some purposes, but not for other purposes.
So you might have a key that is responsible just for telling you: has there been an update? What is the hash of the repository as of the last update? It's just responsible for letting you know whether there is an update. That would be a different key from the key you would use to actually sign a package or a piece of software and say, this is a valid copy of this specific software. So that's one idea, this concept of roles. Another concept inside TUF is that for roles and keys used for very sensitive things, like being the root of trust in the system or signing for packages themselves, we've designed TUF so that those keys only get used in very specialized circumstances, which makes it easy, from an operational standpoint, to keep sensitive keys offline. In fact, for people who use TUF, the root keys are basically always kept offline. What I mean by that is it might be something like a YubiKey device that you keep in a locked drawer or a safety deposit box. But even for things like signing packages, you only need to use the key when you're going to do a new release, so people will often have little ceremonies around using those keys. One other aspect I want to touch on is that TUF also lets you use thresholds of keys for different operations, so you can actually do things like say: two of my four developers have to come together and say this is a good release. Or, in the case of the root key, five of these ten people have to come together and use their root keys in order to sign a new root file. This gives you a lot of resilience to compromise, because even if an attacker steals a developer's laptop with a YubiKey left in it, it's not necessarily a fatal event.
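To make the roles and thresholds concrete, here is a minimal sketch of the kind of role definitions TUF's root metadata carries. This is an illustration only: the field names are abbreviated, the key IDs are made up, and the real format carries more (key definitions, expiration dates, signatures), so consult the spec for the exact layout.

```python
# Simplified, illustrative sketch of TUF-style role definitions.
# The real root metadata format differs in detail; see the spec.
root_metadata = {
    "roles": {
        # The root role is the root of trust; its keys stay offline
        # (e.g. YubiKeys in a locked drawer) and a threshold of them
        # must sign any new root file.
        "root": {"keyids": ["key1", "key2", "key3", "key4", "key5"],
                 "threshold": 3},
        # The timestamp role only answers "has there been an update?"
        # and signs frequently, so its key typically stays online.
        "timestamp": {"keyids": ["key6"], "threshold": 1},
        # The targets role governs which keys may sign which packages;
        # here two of four developer keys must agree on a release.
        "targets": {"keyids": ["dev1", "dev2", "dev3", "dev4"],
                    "threshold": 2},
    }
}
```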
[00:09:23] Unknown:
And is the key threshold something that is determined just at a protocol level within your implementation of TUF, in that it requires a certain number of known keys? Or is it using a key-sharding approach, similar to what I know HashiCorp's Vault system does, where it actually generates a key and then shards that key into multiple pieces, with each of the developers holding one portion of it?
[00:09:48] Unknown:
So it could be done either way within our system. Within our metadata format, which is something you set up when you set up your repository, you effectively say what these thresholds should be; for example, pick two of these three keys, or whatever else. But if you wanted to do something more transparent with sharding and so on, you can do that as well.
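As a sketch of what enforcing a threshold looks like on the client side: the verifier counts distinct authorized keys whose signatures check out, and only accepts the metadata if that count reaches the role's threshold, using role definitions like the ones sketched earlier. The crypto routine is passed in as a parameter to keep the example library-agnostic; nothing below is the reference implementation's actual API.

```python
def signed_by_threshold(signed_bytes, signatures, role, verify_signature):
    """Return True if at least role["threshold"] distinct keys listed in
    role["keyids"] produced valid signatures over signed_bytes.

    signatures is a list of {"keyid": ..., "sig": ...} entries.
    verify_signature(keyid, sig, data) is whatever crypto check you use
    (ed25519, RSA, ...); it is a parameter so the sketch stays generic.
    """
    valid_keyids = set()
    for entry in signatures:
        keyid = entry["keyid"]
        if keyid not in role["keyids"]:
            continue  # ignore signatures from keys not trusted for this role
        if verify_signature(keyid, entry["sig"], signed_bytes):
            valid_keyids.add(keyid)  # each key counts at most once
    return len(valid_keyids) >= role["threshold"]
```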
[00:10:10] Unknown:
So you mentioned the metadata for a given repository. I'm wondering if you can talk about that some more, and also how TUF is architected overall and what led you to choose Python for the reference implementation.
[00:10:25] Unknown:
Sure. So when you set up a repository, you go and specify how trust flows within the system. What you do is you set up a root file. The root file serves as the root of trust, and it has root keys; usually a threshold of them do the signing. That then sets the keys for the other top-level roles in the system, and those other roles fulfill different tasks. The timestamp key tells you when the last time the system was updated is. The snapshot role tells you, basically, that there has been a consistent set of files together on the repository, and what that consistent set of files looks like, so that somebody can't pick weird old combinations of files and try to tell you that they were all current on the repo at the same time. And then you have a targets role. The targets role is the thing that lets you point to the specific files that you want to download and say what keys should be used to assign trust to those. The way that's typically used is that the top-level targets role will delegate and say, for instance, that the Django project's keys that should be trusted to sign any package that starts with Django are these; here is the information about their keys. Under that, the project will create a metadata file that says, within our project we have these three members: Alice is the only one who does packaging for Windows, so every Windows build should only be signed by Alice, and Bob and Charlie come together to do packaging for Mac and Linux, so they should work together and coordinate things that way. The nice thing about having this kind of metadata flow is that the different projects get to internally manage their personnel and how they want to handle their keys. A big repository that serves this data and manages metadata only has to manage things at the level of "this project has some key assigned to it"; it doesn't have to directly manipulate or do anything with that project's metadata, because the project owners maintain it themselves. One thing I'd also like to say about this: if it sounds scary, hard, and complicated, it's often harder to talk about than to see. You have a file, you list people's keys in it, and you say what they're supposed to sign. When I give a talk about it and try to show it with diagrams and a whiteboard, people say it sounds complicated, but then you show them the file format and they say, wow, okay, I can totally see how we do this; it's trivial, I just put my key here and that's all I need to do. So if you're confused about any of this, I encourage you to go to theupdateframework.com and take a look at the file formats and things there. Okay, and to answer the other part: why did we use Python?
The main thing we wanted to accomplish with our reference implementation was for it to be a readable, clean implementation for other people to refer to. In my view, Python is a language that's very widely known, and it's very easy for people who want to pick it up and take a look at it to do so; to me it reads a lot like pseudocode. So for us it was kind of a no-brainer to use Python, and it does seem to be readable. I think we have five different implementations of TUF that have been done by outside groups, so it is apparently readable enough that people could look at our code and our spec and figure it out.
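As a concrete illustration of the delegation flow described above, here is a simplified sketch of a top-level targets role handing everything that starts with "django" to the Django project's keys, and the project's own metadata splitting work between its members. The field names are abbreviated for readability and do not match the spec exactly; the key IDs and path patterns are made up.

```python
# Simplified, illustrative TUF-style delegations (not the exact spec format).

# Top-level targets metadata maintained by the repository: anything whose
# path matches "django*" must be signed by the Django project's own keys.
repository_targets = {
    "delegations": [
        {"name": "django",
         "paths": ["django*"],
         "keyids": ["django-key-1", "django-key-2"],
         "threshold": 1},
    ]
}

# Metadata maintained by the Django project itself: Alice alone signs
# Windows builds, while Bob and Charlie must both sign Mac/Linux builds.
django_targets = {
    "delegations": [
        {"name": "windows-builds",
         "paths": ["django*-win*"],
         "keyids": ["alice-key"],
         "threshold": 1},
        {"name": "mac-linux-builds",
         "paths": ["django*-macosx*", "django*-linux*"],
         "keyids": ["bob-key", "charlie-key"],
         "threshold": 2},
    ]
}
```

The repository only ever deals with the first file; the second stays under the project's own control, which is why a key compromise in one project does not spill over into others.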
[00:14:08] Unknown:
And is the actual implementation of the specification largely just for consuming the metadata files and ensuring that the overall workflow, from the creation of the software or the package to its delivery, is properly managed?
[00:14:24] Unknown:
TUF itself provides the role you described, where the purpose is to ensure that software wasn't tampered with after packaging, and the reference implementation shows an example of how the system is supposed to work, because there are a whole bunch of very subtle corner cases that people have historically gotten wrong, which has led to insecurity. A big part of what we're trying to do with TUF is to show exactly how we think these things should be done, and to really open it up to the open source community, the security community, and others. If you find a problem, definitely tell us. We'd love to have people poke at it and point out issues, because there are a lot of really subtle things that come up when you try to make a system like this that is resilient to failures, usable, and reasonably efficient.
[00:15:19] Unknown:
The key signing and the chain of trust that's established by having multiple keys for the different roles is useful for ensuring that the version of a package you're retrieving is one that is authorized by the people who created it. But it doesn't really address the problem of how you establish a proper amount of trust in the original package in the first place, without that chain of trust already being present. So what are some of the mechanisms that people can use to actually audit packages, to ensure that they are trustworthy when they first start depending on TUF to check that chain of trust going forward?
[00:15:56] Unknown:
Sure. It's probably easiest to talk about this in the context of what's in the PEPs, the Python Enhancement Proposals that we did a few years ago with PyPI, because this will show how they plan to use it. The basic way it works is that there are targets files, and these targets files are kept and managed on the repository. They contain signed statements that say: this is Django's key, this is the key for Beautiful Soup, this is the key for this project, for that project. What happens is that when you initially set up something like pip, you get the root metadata that is shipped with your package manager, and that root metadata tells you things like the initial targets key that's on PyPI. From that, when you retrieve those files from PyPI, you can get a validated list of older projects that have been around. Pretty much anything that is, give or take, more than a couple of weeks old is going to be signed with a role that has an offline key in the case of PyPI. So even if you control the repository itself, the only thing you'd be able to tamper with, and only temporarily, would be the very new projects that had just been registered on PyPI in the last couple of weeks.
[00:17:27] Unknown:
And for somebody who does manage to compromise one of the keys, what are some of the potential attack vectors that developers are exposed to if they don't have an auditable deployment system for their dependencies?
[00:17:41] Unknown:
Yeah. So if you're not using something like TUF, then you're in a situation similar to what has happened with RubyGems and a whole lot of other package managers and community repositories: an attacker who breaks in and gets access to keys can do whatever they want with your users. They can sign updates that say anything they like. So TUF is really important, unless you assume that you have perfect security and it's just not possible that anybody is ever going to break in, or ever get a signing key, or ever find a weakness in the algorithm used to generate your signing keys, as Debian had, or get into your CA chain and create certificates that validate as correct, as happened with Microsoft. Historically there have just been problems over and over again when people don't design these systems with the potential for revoking a key, and that is really essential here: TUF is designed to make it so that you can securely revoke a key.
[00:18:43] Unknown:
Yeah. And also, there was an interesting article that came out recently where somebody was doing some research on the possibility of launching an attack at install time of a package in some of the more popular programming language package managers, such as RubyGems, pip, and npm. A lot of them have a hook during the package installation process that will allow you to execute a script, with the ostensible purpose of placing the bundled files at the appropriate locations, but it is also a security hole in that there's not necessarily any restriction on what that installation script can do or access. So I'll add a link to that in the show notes as well for people who are interested.
[00:19:24] Unknown:
Yeah. It's really frightening, the capabilities that you have. Basically, if you get to the point where you can install software on someone's system in any way, that's almost the idealized goal for attackers in any situation. If you really think about it, what are you trying to do when you're exploiting buffer overflows, or doing ROP, or doing all these other attacks? You're really just trying to be able to run software on a user's system. What's nice about attacking a package manager is that it's really hard to defend against, because it usually opens an outgoing connection, often an encrypted connection, and downloads something over it. It's going where it's supposed to go, so no firewall is going to stop that. It's often running as root, at least with certain types of package managers; Apt and Yum certainly do this frequently, although it happens a lot less frequently with something like pip. And no protection you put on your system, ASLR or whatever, is going to stop it, because it's doing exactly what it was designed to do, which is put new code on your system and tell it to do new things. So in many ways it's just the ideal vector, if an attacker can make it work.
[00:20:44] Unknown:
I think I'm starting to understand a bit, from what you were describing about the PEP proposals for integration with PyPI. But one of the perennial problems with dependency management and package installation systems is the transitive dependencies that are pulled in by the primary package you're trying to install. You may have established trust with a given version of that one package, but how do you ensure that all of the packages it pulls in are trusted by the people who created the library you're installing in the first place, and that a proper amount of trust has been established? From what you're saying, though, there is that root key being used to validate everything that's being pulled in by your primary dependency, so that may go some way, at least, toward ensuring that you're not downloading compromised versions of those packages.
[00:21:30] Unknown:
Yeah, that's a great way to put it, because what ends up happening is that the trust you have is restricted to the packages that you actually need and are actually installing on your system. So if you install some package and it has a dependency on something, then you do end up transitively trusting it, or at least going to the repository, through its root of trust, to get to that package. But a nice thing about our system that you wouldn't have in, say, a GPG-based system is that even if somebody breaks in and gets the keys for the Beautiful Soup project, that doesn't help them attack users of Django or users of other software on the system. It's only Beautiful Soup, and the users who end up installing it, that are at risk.
[00:22:19] Unknown:
So for somebody who wants to integrate the TUF specification with their package management system, what are the steps necessary to achieve that integration?
[00:22:29] Unknown:
So one of the things that we've tried to do is make it as easy as possible for people to integrate. One of our goals with TUF was to make it almost a drop-in library that you could use in a lot of different contexts. Our integration with pip only took a few minutes to do; I forget exactly how many lines it was, but I don't believe it was more than ten lines of code that we changed in order to at least get it in there and working by default in the system. On the client side, you pretty much just plug it into your software updater, and it will parse and do the correct things with all the metadata it's been provided. On the server side, you set it up on your repository: you do a one-time setup where you arrange for things to be signed in the correct way by the appropriate roles, and then, when you actually go to produce a package, you use your key to do those signatures. So it really doesn't change the workflow on either the client or the server side. On the server side it just makes you use keys in places where you maybe should have been using keys but weren't, and on the client side, your clients never even know that they have TUF in their software updater unless there's an attack.
And then TUF tells them, hey, something really fishy is going on here, and helps to keep them safe.
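In code, the client-side integration described here amounts to routing downloads through a TUF-aware updater that refreshes metadata and verifies a target before the installer ever sees it. The sketch below shows that shape only; the object and method names are hypothetical stand-ins, not the reference implementation's actual API.

```python
# Pattern sketch of "plugging TUF into your updater" on the client side.
# `updater` is a hypothetical TUF-aware object; the real reference
# implementation's API differs, so treat this as the shape, not the interface.

def install_package(updater, package_name, destination, run_installer):
    # 1. Refresh the signed metadata (root, timestamp, snapshot, targets)
    #    so we know the current, consistent state of the repository.
    updater.refresh()

    # 2. Ask the updater to fetch the target. It verifies lengths, hashes,
    #    and the signature chain before returning a path; if anything is
    #    wrong it raises, and the installer below never runs.
    verified_path = updater.download_verified_target(package_name, destination)

    # 3. Only now hand the verified file to the ordinary install machinery.
    run_installer(verified_path)
```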
[00:23:55] Unknown:
And for the metadata files that are included with projects, for example a Python wheel or egg that gets uploaded to PyPI, when somebody goes to install it using pip, could you describe a bit of the installation process that ensures the metadata is parsed before any of the actual install hooks are executed, so that you don't accidentally execute some potentially compromised code?
[00:24:21] Unknown:
Sure. So any time you're downloading and installing something with a package manager or a software updater that uses TUF, it will always check the validity of the target, whether that's a package or just a normal zip file, tarball, wheel, or whatever, before it provides that file to the software updater or package manager. In fact, TUF will do the download on its behalf, because there are actually some attacks that relate to how you get information from the network and what you do with that information. Both Apt and Yum, for instance, had problems that I found back in 2007, along with a student of mine, Justin Samuel.
We found that if you just fed them an endless stream of bytes, they would both crash and have problems. Apt would fill all the memory on the system, which would of course cause the server, or whatever it was, to lock up and stop responding very well. Yum would actually fill the disk and then crash without even printing an error message. So even for simple attacks like that, our goal with TUF is to be a really comprehensive thing that you just drop in, and then you don't really have to worry about the security of your updater, because we've taken care of it for you.
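The "endless stream of bytes" attack is one reason TUF downloads targets itself against a signed length and hash instead of trusting whatever the server sends. Here is a minimal standard-library sketch of that defence; in real TUF the expected length and hash come from signed targets metadata, and the URL and expected values here are just placeholders.

```python
import hashlib
import urllib.request

def bounded_download(url, expected_length, expected_sha256, chunk_size=8192):
    """Fetch url, refusing to read more than the signed expected_length
    bytes and rejecting any file whose SHA-256 doesn't match. This mirrors
    the shape of TUF's defence against endless-data and tampered-file
    attacks; it is a sketch, not the reference implementation."""
    digest = hashlib.sha256()
    received = bytearray()
    with urllib.request.urlopen(url) as response:
        while len(received) < expected_length:
            chunk = response.read(min(chunk_size, expected_length - len(received)))
            if not chunk:
                break  # server stopped early; the length check below will fail
            digest.update(chunk)
            received.extend(chunk)
        if response.read(1):
            # The server keeps sending data past the signed length: bail out
            # instead of filling memory or disk.
            raise ValueError("server sent more data than the signed length")
    if len(received) != expected_length:
        raise ValueError("download truncated")
    if digest.hexdigest() != expected_sha256:
        raise ValueError("hash mismatch; refusing to hand the file to the installer")
    return bytes(received)
```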
[00:25:44] Unknown:
And what are some of the projects that are already using TUF in their dependency management systems?
[00:25:51] Unknown:
Sure. So it's used in a few different projects now. It's used by Docker: if you install things through Docker Hub, then you're using something called Notary, which integrates TUF. It's also used in Flynn and inside of LEAP. So we've had some uptake in the container management space, and more recently we've been working with the automotive industry. We've just reached the point where we have what's effectively an automotive version of TUF, called Uptane, that you can actually now go and buy from an automotive supplier, the people who make parts for the OEMs. And we have about five other vendors that are currently in the process of integrating and implementing Uptane in some of their products. So we're hopeful that, looking ahead, we'll be able to solve this problem in a lot of different contexts.
[00:26:51] Unknown:
Yeah. Given the increased use of software and infotainment systems in modern automobiles, I can definitely see how having something like TUF in place would be immensely valuable, especially in light of a lot of the attacks that have had successful proof-of-concept demonstrations. For instance, the Jeep incident, where the attackers were able to completely lock up the brakes and take full control of the vehicle while it was on the highway, and other cases where, as our vehicles become more connected, they become increasingly vulnerable to people who are not necessarily acting in the best interests of the people behind the wheel.
[00:27:29] Unknown:
Yes. It's a very scary situation now because of the lack of security that exists in many of these devices and environments. One thing I will say that's very positive about the automotive industry is that they have really started to take notice and do what they can to secure these systems, and so we've been working with them a lot on the Uptane project. I'm not supposed to name individual companies, but we've had participation from thirty or so different organizations when we've had these meetings, including major OEMs, major suppliers, and even folks from government agencies that do regulatory work around vehicles, and they've been very receptive so far. I'd like to encourage anybody that is potentially interested in helping vehicular security to go and take a look at our specification for Uptane, that's U-P-T-A-N-E. If you find anything that you think is a security problem with it, or any other limitations or issues, we'd love to have your feedback, because just as with TUF, we really appreciate and get a lot of value from the security review that people in the community have given us.
Yeah, it's always great when something like this is able to cross over from an academic research project into something that is immensely practical in industry as well, so it's definitely great to hear about. Yes, I'm very focused on that, and I really feel that if you're in academia, you should be trying to do things that are going to be for the better of humanity. So within my lab we're really focused on trying to have practical impacts across a whole bunch of different domains. Really, anything we build we try to deploy in practice in large software projects, whenever and wherever we can. We recently fixed some design flaws in the way Git was doing signing, and we've been active in trying to improve security and fix bugs in Python and a lot of other contexts apart from the TUF work. So if anyone listening is potentially interested in coming and working with us, I'd love to hear from you.
[00:29:54] Unknown:
Absolutely. I'll have you send me the preferred contact information for anybody who does want to follow up with you on any of that. So TUF is great for once you've already got a package built and you're working on distributing it, and you want to ensure that chain of trust in the distribution and updating of the final product. But what are some of the ways that you can mitigate some of the other attack vectors in the initial creation and development of a project, before it gets to packaging?
[00:30:20] Unknown:
Yeah. So one of the things that you'll hear a lot about is reproducible builds. This gets promoted a lot; Mike Perry with the Tor Project really did a great job of bringing it to people's attention, and now you see a lot of major Linux distributions, like Debian for instance, where the vast majority of their packages do have reproducible builds working for them. But that's really only one part of the puzzle. You also have to have a bunch of other things go your way and work correctly. You have to do signing within your version control systems and make sure that the things you sign, and the way that you sign them, actually give you meaningful security properties, which unfortunately isn't as straightforward as you might think. Attackers have also gone and tampered with things in the middle, throughout the process, as software moves from the version control system through test processes, into build systems, into continuous integration systems, and so on. So one of the things we're working on now is a kind of whole-system approach, all the way from the moment the code is written by the developer, through the VCS, code review, style checking, and the rest of the process, up to the point when it reaches the user. We want to do verification and validation to ensure that the software you're getting is authentic and is what your developers meant to build. We're still a little bit in stealth mode about this, but I will tease it for your listeners.
The project's name is in-toto. If you're interested in finding out more about it, you can check the link on my website, click on my projects, and see things about it there. We'd be happy to provide anybody with more information when we publicly release it.
[00:32:10] Unknown:
Yeah, that definitely sounds like an interesting approach and something that's definitely needed, particularly as software systems continue to be increasingly critical in everybody's day-to-day lives. So are there any other subjects that we should touch on? I don't think so; there's nothing I can think of. Are you looking for any particular types of contributions from people for the TUF framework or the specification?
[00:32:35] Unknown:
You know, honestly, the thing that would be the most useful is getting people to ask, why isn't TUF in Warehouse? I know it's on people's radar, but the more people who ask that question, the bigger it gets on that radar. The other thing is, if people see security issues, point them out. We haven't really had very many of them, although we've had a lot of reviews, but if someone does find a security issue, especially in Uptane, which is newer and has changes to TUF that would make it perhaps more likely that they would find something, of course we'd love to have that. We're always encouraging people who are interested in using it in their domain to go ahead and integrate it. You don't have to use our implementation; you're welcome to, but other people have pretty quickly been able to whip up their own TUF-compliant implementation. So we'd be happy to work with you or discuss any issues you might run into with that as well.
[00:33:34] Unknown:
And for people who are building their own implementations, is there some form of test suite to ensure that it's properly compliant with the specification?
[00:33:43] Unknown:
Yes. You should be able to go and run pretty much all of, or at least many of, the TUF tests, because they really deal with manipulating metadata files and things of that sort. So you should, for the most part, be able to do it, as long as you use the same JSON formats and other things that we did.
[00:34:04] Unknown:
And given the fact that TUF is not yet integrated into PyPI or the Warehouse project, is it possible for people to start using it with a PyPI mirror?
[00:34:14] Unknown:
Oh, yes, you certainly could do this. You could even start to sign TUF metadata for your projects today; there's certainly nothing that would stop you from doing it. It's just that, unfortunately, PyPI currently isn't going to be providing the key information, the information about what your project's key is, to your users automatically, so your instance of pip, or whatever would consume this information, won't validate it. So you can certainly add the metadata now; it's just unfortunate that it won't be able to be used yet.
[00:34:48] Unknown:
And do you know if there are any plans on incorporating that into the Conda system as well?
[00:34:54] Unknown:
Yeah, we've had some really great discussions with the Conda folks. We've actually had great discussions with a lot of different folks: folks in the Ruby community, and more recently we've been working a lot with the CoreOS folks, and looking at Haskell and OPAM and a wide variety of different communities that are interested in this. So if you're a member of one of these communities and are able to give a little bit of time, you can perhaps help to make a big difference by helping them to be secure, so that if there is an incident, it doesn't impact their users in a negative way. Alright, well, with that, I will move us into the picks. Like I said, I'll have you send your preferred contact information for anybody who wants to follow you and get in touch, or who wants to get involved with the work that your lab is doing. And so for my first pick today, I'm going to choose the Enchanted Forest Chronicles,
[00:35:46] Unknown:
which are a series of four books targeted at young adults that I recently reread with my son. They're very humorously put together and very tongue-in-cheek with a lot of the fairy tale and fantasy icons; there's a lot of good humor in them and a fun story line, so I definitely recommend checking them out for anybody of any age, really. And with that, I'll pass it to you. Do you have any picks for us today, Justin?
[00:36:11] Unknown:
I'd encourage you, if you haven't ever had hand-pulled noodles, which are an interesting Chinese thing where they hand roll out the noodles and make them into your soup, to give them a try. If you're in New York, there's Lam Zhou, which is quite a good place in Manhattan's Chinatown.
[00:36:29] Unknown:
But really, anywhere that makes them is worth trying; I've never had a bowl of them and been disappointed. Alright, well, I really appreciate you taking the time out of your day to tell me more about The Update Framework. It's definitely interesting and important work, and I look forward to seeing it more tightly integrated into all of the different software systems that I use. So I appreciate it, and I'm sure the listeners will enjoy hearing more about it as well. Thank you very much, I appreciate it.
Introduction and Guest Introduction
Justin Cappos' Introduction to Python
Challenges in Package Management
The Update Framework (TUF) Overview
TUF's Key Management and Security Features
TUF Integration with PyPI
Ensuring Metadata Integrity
Projects Using TUF
Mitigating Attack Vectors in Software Development
Future Integrations and Community Involvement
Picks and Recommendations